2018
ACL
Comparison of Representations of Named Entities for Document Classification
Abstract
We explore representations for multi-word names in text classification tasks, on Reuters (RCV1) topic and sector classification. We find that: the best way to treat names is to split them into tokens and use each token as a separate feature; NEs have more impact on sector classification than topic classification; replacing NEs with entity types is not an effective strategy; representing tokens by different embeddings for proper names vs. common nouns does not improve results. We highlight the improvements over state-of-the-art results that our CNN models yield.
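The name representations compared in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the input format (text spans paired with an optional entity type), the function names, the underscore-joined single-feature variant, and the `<TYPE>` placeholder format are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical input format: (text, entity_type) spans, where entity_type
# is None for ordinary words and a label such as "ORG" for a named entity.
Span = Tuple[str, Optional[str]]


def split_tokens(spans: List[Span]) -> List[str]:
    """Split every span, names included, into separate token features."""
    feats: List[str] = []
    for text, _ in spans:
        feats.extend(text.split())
    return feats


def joined_names(spans: List[Span]) -> List[str]:
    """Keep each multi-word name as one feature (assumed alternative)."""
    feats: List[str] = []
    for text, etype in spans:
        if etype is None:
            feats.extend(text.split())
        else:
            feats.append("_".join(text.split()))
    return feats


def entity_types(spans: List[Span]) -> List[str]:
    """Replace each name with a placeholder for its entity type."""
    feats: List[str] = []
    for text, etype in spans:
        if etype is None:
            feats.extend(text.split())
        else:
            feats.append(f"<{etype}>")
    return feats


# Made-up example sentence, not taken from RCV1.
EXAMPLE: List[Span] = [
    ("Shares of", None),
    ("Goldman Sachs", "ORG"),
    ("rose in", None),
    ("New York", "LOC"),
]

if __name__ == "__main__":
    for fn in (split_tokens, joined_names, entity_types):
        print(fn.__name__, fn(EXAMPLE))
```

Under these assumptions, `split_tokens` corresponds to the strategy the abstract reports as best, and `entity_types` to the replacement strategy it reports as ineffective.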
Topics
Machine Learning > Core Methods > Embedding Learning
Natural Language Processing > Understanding > Named Entity Recognition
Natural Language Processing > Applications > Text Classification
Natural Language Processing > Resources & Methods > Text Representation
Deep Learning > Architectures > Convolutional Neural Networks