EMNLP 2024
MultiLS: An End-to-End Lexical Simplification Framework
Abstract
Lexical Simplification (LS) automatically replaces difficult-to-read words with easier alternatives while preserving a sentence's original meaning. Several datasets exist for LS, and each specializes in one or two sub-tasks within the LS pipeline. However, to date, no single LS dataset covers all LS sub-tasks. We present MultiLS, the first LS framework that allows for the creation of a multi-task LS dataset. We also present MultiLS-PT, the first dataset created using the MultiLS framework. We demonstrate the potential of MultiLS-PT by carrying out all three LS sub-tasks for Portuguese: (1) lexical complexity prediction (LCP), (2) substitute generation, and (3) substitute ranking.
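To make the three sub-tasks concrete, the following is a minimal Python sketch of how they might chain into one pipeline. It is not the MultiLS implementation: the function names, the `TOY_FREQUENCY` and `TOY_SYNONYMS` tables, and the 0.7 threshold are all hypothetical and invented purely for illustration. A real system would likely use trained models and a dataset such as MultiLS-PT in place of these lookups.

```python
# Hypothetical toy resources, invented for illustration only.
# They are NOT taken from MultiLS-PT or any real lexicon.
TOY_FREQUENCY = {"utilizar": 2, "usar": 9, "empregar": 4, "o": 10, "martelo": 5}
TOY_SYNONYMS = {"utilizar": ["usar", "empregar"]}
MAX_FREQUENCY = 10.0


def predict_complexity(word: str) -> float:
    """Sub-task 1 (LCP), toy version: rarer words score closer to 1.0."""
    return 1.0 - TOY_FREQUENCY.get(word, 0) / MAX_FREQUENCY


def generate_substitutes(word: str) -> list[str]:
    """Sub-task 2 (substitute generation), toy version: a synonym lookup."""
    return list(TOY_SYNONYMS.get(word, []))


def rank_substitutes(candidates: list[str]) -> list[str]:
    """Sub-task 3 (substitute ranking), toy version: simplest candidate first."""
    return sorted(candidates, key=predict_complexity)


def simplify(sentence: str, threshold: float = 0.7) -> str:
    """Replace each word judged complex with its top-ranked substitute."""
    simplified = []
    for word in sentence.split():
        if predict_complexity(word) >= threshold:
            ranked = rank_substitutes(generate_substitutes(word))
            # Keep the original word when no substitute is available.
            simplified.append(ranked[0] if ranked else word)
        else:
            simplified.append(word)
    return " ".join(simplified)


print(simplify("utilizar o martelo"))  # prints "usar o martelo"
```

In this sketch the same frequency proxy serves both complexity prediction and ranking. That is a simplification for brevity; the two sub-tasks are generally treated as separate problems.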
Topics
Machine Learning > Learning Types > Semi-Supervised Learning
Computer Science > Applications > Document Analysis
Interdisciplinary > Linguistics > Computational Linguistics
Natural Language Processing > Resources & Methods > Language Modeling
Natural Language Processing > Applications > Summarization
Natural Language Processing > Applications > Text Generation
Machine Learning > Learning Paradigms > Multi-Task Learning
Deep Learning > Learning Types > Multi-Task Learning
Natural Language Processing > Applications > Text Simplification