COLING 2020
Semi-supervised Fine-grained Approach for Arabic dialect detection task
Abstract
Because Arabic has numerous distinct dialects, it is important to devise a technique that distinguishes them efficiently. This paper focuses on fine-grained country-level and province-level classification of Arabic dialects. The experiments reported here are submissions to the NADI 2020 shared dialect detection task. Several text feature extraction techniques are studied, including TF-IDF, AraVec, multilingual BERT, and fastText embedding models. We propose a text-embedding-based model that, with the help of a semi-supervised learning approach, achieves a macro-averaged F1 score of 0.2232 on Task 1 and 0.0483 on Task 2.
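The scores above are macro-averaged F1 values, which weight every dialect class equally regardless of how many examples it has. The following is a minimal sketch of that metric, assuming the standard definition (unweighted mean of per-class F1); the function name `macro_f1` and the toy labels are illustrative assumptions and are not taken from the paper or the NADI data.

```python
from collections import defaultdict


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    tp = defaultdict(int)  # true positives per label
    fp = defaultdict(int)  # false positives per label
    fn = defaultdict(int)  # false negatives per label
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    scores = []
    for label in set(y_true) | set(y_pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        # A class that is never predicted correctly contributes an F1 of 0.
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, hypothetical labels: a classifier that always predicts the majority class.
y_true = ["Egypt", "Egypt", "Egypt", "Iraq", "Oman"]
y_pred = ["Egypt", "Egypt", "Egypt", "Egypt", "Egypt"]
score = macro_f1(y_true, y_pred)
```

In this toy case accuracy is 0.6, but the two classes that are never predicted each contribute an F1 of zero. This is one reason macro-averaged F1 tends to be low when there are many fine-grained classes.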
Authors
Topics
Machine Learning > Learning Types > Semi-Supervised Learning
Natural Language Processing > Applications > Text Classification
Natural Language Processing > Resources & Methods > Large Language Models
Deep Learning > Learning Types > Self-Supervised Learning
Machine Learning > Learning Paradigms > Semi-Supervised Learning