Tagging biomedical grants with 29K tags

In a previous post we spoke about a neural architecture we developed for classifying our grants with ~5K disease tags from the MeSH (Medical subject Headings) hierarchy. In this post we will touch on the techniques needed to scale to a model to classify all ~29K MeSH tags. Our dataset consists of 14M biomedical publications labelled with one or more MeSH tags (on average 12 tags per publications), so the challenge is both the thousands of outputs our model needs to recognise and the millions of examples it needs to learn from....

December 13, 2021 · 9 min