2019 EMNLP EMNLP 2019

An Overview of the Active Gene Annotation Corpus and the BioNLP OST 2019 AGAC Track Tasks

Abstract

AbstractThe active gene annotation corpus (AGAC) was developed to support knowledge discovery for drug repurposing. Based on the corpus, the AGAC track of the BioNLP Open Shared Tasks 2019 was organized, to facilitate cross-disciplinary collaboration across BioNLP and Pharmacoinformatics communities, for drug repurposing. The AGAC track consists of three subtasks: 1) named entity recognition, 2) thematic relation extraction, and 3) loss of function (LOF) / gain of function (GOF) topic classification. The AGAC track was participated by five teams, of which the performance are compared and analyzed. The the results revealed a substantial room for improvement in the design of the task, which we analyzed in terms of “imbalanced data”, “selective annotation” and “latent topic annotation”.

🌉 Interdisciplinary Bridge — Machine Learning and Natural Language Processing
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Security & Privacy, Speech & Audio