2021
EMNLP
A Text Editing Approach to Joint Japanese Word Segmentation, POS Tagging, and Lexical Normalization
Abstract
Lexical normalization, in addition to word segmentation and part-of-speech tagging, is a fundamental task for Japanese user-generated text processing. In this paper, we propose a text editing model to solve the three tasks jointly, along with methods of pseudo-labeled data generation to overcome the problem of data deficiency. Our experiments showed that the proposed model achieved better normalization performance when trained on more diverse pseudo-labeled data.
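To make the idea of solving the three tasks with a single text editing pass more concrete, here is a minimal sketch. It is not the paper's actual tag scheme or model: the `CharTag` structure, the `KEEP` / `DEL` / `REP:` edit labels, the `B` / `I` boundary labels, and the example sentence are all illustrative assumptions. The sketch only shows how one tag per input character could, in principle, encode word boundaries, POS labels, and normalization edits at once.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class CharTag:
    """Hypothetical per-character label (not the paper's tag set)."""
    boundary: str        # "B" starts a new word, "I" continues the current word
    pos: Optional[str]   # POS label, given on "B" characters only
    edit: str            # "KEEP", "DEL", or "REP:<replacement text>"


def apply_tags(text: str, tags: List[CharTag]) -> List[Tuple[str, str, str]]:
    """Decode (surface, normalized, pos) triples from per-character tags."""
    if len(text) != len(tags):
        raise ValueError("need exactly one tag per character")
    words: List[List[str]] = []
    for ch, tag in zip(text, tags):
        # Apply the edit operation to get the normalized output for this character.
        if tag.edit == "KEEP":
            out = ch
        elif tag.edit == "DEL":
            out = ""
        elif tag.edit.startswith("REP:"):
            out = tag.edit[len("REP:"):]
        else:
            raise ValueError(f"unknown edit tag: {tag.edit}")
        # Start a new word on "B" (or at the very first character).
        if tag.boundary == "B" or not words:
            words.append([ch, out, tag.pos or "UNK"])
        else:
            words[-1][0] += ch
            words[-1][1] += out
    return [(w[0], w[1], w[2]) for w in words]


# Toy input: an elongated form of "すごい" followed by the noun "本".
text = "すごーーい本"
tags = [
    CharTag("B", "ADJ", "KEEP"),   # す
    CharTag("I", None, "KEEP"),    # ご
    CharTag("I", None, "DEL"),     # ー (elongation, dropped)
    CharTag("I", None, "DEL"),     # ー (elongation, dropped)
    CharTag("I", None, "KEEP"),    # い
    CharTag("B", "NOUN", "KEEP"),  # 本
]
result = apply_tags(text, tags)
```

In this toy setup, `result` pairs each surface word with its normalized form and POS label, so segmentation, tagging, and normalization all fall out of a single sequence of per-character predictions.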
Topics
Natural Language Processing > Understanding > Part-of-Speech Tagging
Natural Language Processing > Understanding > Parsing
Interdisciplinary > Linguistics
Natural Language Processing > Applications > Text Generation
Natural Language Processing > Understanding > Morphology
Natural Language Processing > Applications > Text Processing
Deep Learning > Learning Types > Sequence Modeling