EMNLP 2019

BioReddit: Word Embeddings for User-Generated Biomedical NLP

Abstract

Word embeddings, in their different shapes and iterations, have changed the natural language processing research landscape in recent years. The biomedical text processing field is no stranger to this revolution; however, scholars in the field have largely trained their embeddings on scientific documents only, even when working on user-generated data. In this paper we show that training embeddings on a corpus of user-generated text collected from medical forums heavily influences performance on downstream tasks, outperforming embeddings trained either on general-purpose data or on scientific papers when applied to user-generated content.
