Writing habits and telltale neighbors: analyzing clinical concept usage patterns with sublanguage embeddings

Denis Newman-Griffis; Eric Fosler-Lussier

EMNLP 2019

Abstract

Natural language processing techniques are being applied to increasingly diverse types of electronic health records, and can benefit from in-depth understanding of the distinguishing characteristics of medical document types. We present a method for characterizing the usage patterns of clinical concepts among different document types, in order to capture semantic differences beyond the lexical level. By training concept embeddings on clinical documents of different types and measuring the differences in their nearest neighborhood structures, we are able to measure divergences in concept usage while correcting for noise in embedding learning. Experiments on the MIMIC-III corpus demonstrate that our approach captures clinically relevant differences in concept usage and provides an intuitive way to explore semantic characteristics of clinical document collections.
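
To make the idea of comparing nearest neighborhood structures concrete, here is a minimal sketch in Python. It is not the authors' implementation: the function names, the choice of cosine similarity, the top-k overlap score, and the default k=5 are all assumptions made for illustration. It assumes two embedding matrices, one per document type, in which row i represents the same concept in both, and no row is all zeros. It does not include the correction for embedding-training noise that the abstract mentions.

```python
import numpy as np


def top_k_neighbors(embeddings, k):
    """For each row, return the indices of its k nearest rows by cosine similarity."""
    # Normalize rows so that dot products equal cosine similarities.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T
    # A concept is never counted as its own neighbor.
    np.fill_diagonal(sims, -np.inf)
    return np.argsort(-sims, axis=1)[:, :k]


def neighborhood_overlap(emb_a, emb_b, k=5):
    """Per-concept fraction of top-k neighbors shared between two embedding spaces.

    Hypothetical helper: row i of emb_a and emb_b must describe the same concept.
    A score near 1 suggests similar usage; a score near 0 suggests divergent usage.
    """
    nn_a = top_k_neighbors(emb_a, k)
    nn_b = top_k_neighbors(emb_b, k)
    return np.array([len(set(a) & set(b)) / k for a, b in zip(nn_a, nn_b)])
```

Because the score depends only on which concepts are neighbors within each space, the two embedding sets do not need to be aligned to a common coordinate system before comparison. A low score for a single pair of training runs may still reflect training noise rather than a real difference in usage, which is the problem the abstract's noise correction addresses.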

Topics

Machine Learning > Core Methods > Embedding Learning
Natural Language Processing > Resources & Methods > Text Representation
Healthcare & Medicine > Clinical > Clinical NLP
Healthcare & Medicine > Research > Medical AI
Machine Learning > Learning Types > Representation Learning
Deep Learning > Techniques > Representation Learning

Keywords

semantic analysis, nearest neighbor, semantic similarity, electronic health record, word embedding, concept embedding, clinical concept, clinical document, clinical concept embedding, document type, sublanguage analysis
