Learning and Evaluating Sparse Interpretable Sentence Embeddings

Valentin Trifonov; Octavian-Eugen Ganea; Anna Potapenko; Thomas Hofmann

EMNLP 2018

Abstract

Previous research on word embeddings has shown that sparse representations, which can be either learned on top of existing dense embeddings or obtained through model constraints during training, are more interpretable: to some degree, each dimension can be understood by a human and associated with a recognizable feature in the data. In this paper, we transfer this idea to sentence embeddings and explore several approaches to obtaining a sparse representation. We further introduce a novel, quantitative, and automated evaluation metric for sentence embedding interpretability, based on topic coherence methods. We observe an increase in interpretability compared to dense models on a dataset of movie dialogs and on the scene descriptions from the MS COCO dataset.
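The abstract only says that the metric is "based on topic coherence methods", so the following is a minimal illustrative sketch and not the paper's actual metric. It assumes that one embedding dimension is summarized by the most frequent words among its most strongly activating sentences, and that this word set is scored with mean pairwise NPMI, a standard topic coherence measure. The function names `top_words_for_dimension` and `npmi_coherence`, their parameters, and the toy data are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def top_words_for_dimension(sentences, activations, n_sentences=3, n_words=3):
    """Hypothetical helper: summarize one embedding dimension by the most
    frequent words among the sentences that activate it most strongly."""
    ranked = sorted(range(len(sentences)), key=lambda i: activations[i], reverse=True)
    counts = Counter()
    for i in ranked[:n_sentences]:
        # Count each word once per sentence.
        counts.update(set(sentences[i].lower().split()))
    # Break frequency ties alphabetically so the result is deterministic.
    return sorted(counts, key=lambda w: (-counts[w], w))[:n_words]


def npmi_coherence(words, sentences):
    """Mean pairwise NPMI of `words`, with sentence-level co-occurrence."""
    docs = [set(s.lower().split()) for s in sentences]
    n = len(docs)

    def prob(*ws):
        # Fraction of sentences that contain all of the given words.
        return sum(all(w in d for w in ws) for d in docs) / n

    scores = []
    for a, b in combinations(words, 2):
        p_ab = prob(a, b)
        if p_ab == 0:
            scores.append(-1.0)  # never co-occur: lower bound of NPMI
        elif p_ab == 1:
            scores.append(1.0)   # co-occur everywhere: avoid dividing by log(1) = 0
        else:
            pmi = math.log(p_ab / (prob(a) * prob(b)))
            scores.append(pmi / -math.log(p_ab))
    return sum(scores) / len(scores) if scores else 0.0


# Toy corpus and made-up activations for a single (assumed) sparse dimension.
corpus = ["the dog barks", "the dog runs", "a cat sleeps", "a cat eats"]
dim_activations = [0.9, 0.8, 0.0, 0.1]

words = top_words_for_dimension(corpus, dim_activations, n_sentences=2, n_words=2)
score = npmi_coherence(words, corpus)
```

Under these assumptions, a dimension whose top sentences share co-occurring vocabulary scores close to 1, while a dimension that mixes unrelated sentences scores close to -1. Averaging the score over all dimensions would give one possible model-level interpretability number.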

Topics

Machine Learning > Core Methods > Embedding Learning
Natural Language Processing > Resources & Methods > Text Representation

Keywords

sparse representation, topic coherence, word embedding, sentence embedding
