A Comparison of Word-based and Context-based Representations for Classification Problems in Health Informatics

Aditya Joshi; Sarvnaz Karimi; Ross Sparks; Cecile Paris; C Raina MacIntyre

2019 ACL ACL 2019

A Comparison of Word-based and Context-based Representations for Classification Problems in Health Informatics

Abstract

AbstractDistributed representations of text can be used as features when training a statistical classifier. These representations may be created as a composition of word vectors or as context-based sentence vectors. We compare the two kinds of representations (word versus context) for three classification problems: influenza infection classification, drug usage classification and personal health mention classification. For statistical classifiers trained for each of these problems, context-based representations based on ELMo, Universal Sentence Encoder, Neural-Net Language Model and FLAIR are better than Word2Vec, GloVe and the two adapted using the MESH ontology. There is an improvement of 2-4% in the accuracy when these context-based representations are used instead of word-based representations.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Healthcare & Medicine and Machine Learning

🧭 Keyword Pioneer — context-based representation

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Aditya Joshi , Sarvnaz Karimi , Ross Sparks , Cecile Paris , C Raina MacIntyre

Topics

Artificial Intelligence > Learning Paradigms > Transfer Learning Machine Learning > Core Methods > Classification Machine Learning > Core Methods > Representation Learning Deep Learning > Architectures > Neural Networks Healthcare & Medicine > Clinical > Medical NLP

Keywords

transfer learning text classification distributed representation word embedding contextual embedding health informatics context-based representation

Download PDF

Related papers

What do phone embeddings learn about Phonology? 2019

Unsupervised Morphological Segmentation for Low-Resource Polysynthetic Languages 2019

Understanding Undesirable Word Embedding Associations 2019

Inferential Machine Comprehension: Answering Questions by Recursively Deducing the Evidence Chain from Text 2019

Domain Adaptation of Neural Machine Translation by Lexicon Induction 2019