07 Dec 2022

EMNLP 2022

Authors: Maciej Wiatrak, Eirini Arvaniti, Angus Brayne, Jonas Vetterle, Aaron Sim

Abstract

A recent advancement in the domain of biomedical Entity Linking is the development of powerful two-stage algorithms: an initial candidate retrieval stage that generates a shortlist of entities for each mention, followed by a candidate ranking stage. However, the effectiveness of both stages is inextricably dependent on computationally expensive components. Specifically, candidate retrieval via dense representation retrieval relies on hard negative samples, which require repeated forward passes and nearest neighbour searches across the entire entity label set throughout training. In this work, we show that pairing a proxy-based metric learning loss with an adversarial regularizer provides an efficient alternative to hard negative sampling in the candidate retrieval stage. In particular, we show competitive performance on the recall@1 metric, thereby providing the option to leave out the expensive candidate ranking step. Finally, we demonstrate how the model can be used in a zero-shot setting to discover out-of-knowledge-base biomedical entities.
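
To make the idea of a proxy-based loss and the recall@1 metric concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes one learnable proxy vector per entity and uses a generic softmax-over-proxies loss; the function names, the `temperature` parameter, and the choice of cosine similarity are illustrative assumptions, and the adversarial regularizer mentioned in the abstract is omitted. The point it illustrates is that every mention is compared against all proxies in a single matrix product, so no hard negative mining step is needed.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)


def proxy_softmax_loss(mentions, proxies, labels, temperature=0.1):
    """Generic proxy-based loss (an assumption, not the paper's exact loss).

    mentions: (n, d) mention embeddings
    proxies:  (k, d) one proxy embedding per entity
    labels:   (n,) index of the correct entity for each mention
    """
    logits = l2_normalize(mentions) @ l2_normalize(proxies).T / temperature
    # Subtract the row maximum for numerical stability before exponentiating.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Negative log-likelihood of the correct proxy, averaged over mentions.
    return float(-log_probs[np.arange(len(labels)), labels].mean())


def recall_at_1(mentions, proxies, labels):
    """Fraction of mentions whose most similar proxy is the correct entity."""
    scores = l2_normalize(mentions) @ l2_normalize(proxies).T
    return float((scores.argmax(axis=1) == labels).mean())
```

In this sketch, a high recall@1 means the top retrieved candidate is already the correct entity, which is the situation in which a separate candidate ranking stage could be skipped.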

