Extracting Lexical Features from Dialects via Interpretable Dialect Classifiers

Roy Xie; Orevaoghene Ahia; Yulia Tsvetkov; Antonios Anastasopoulos

NAACL 2024

Abstract

Identifying linguistic differences between dialects of a language often requires expert knowledge and meticulous human analysis. This is largely due to the complexity and nuance involved in studying various dialects. We present a novel approach to extract distinguishing lexical features of dialects by utilizing interpretable dialect classifiers, even in the absence of human experts. We explore both post-hoc and intrinsic approaches to interpretability, conduct experiments on Mandarin, Italian, and Low Saxon, and experimentally demonstrate that our method successfully identifies key language-specific lexical features that contribute to dialectal variations.

Topics

Artificial Intelligence > Core AI > Interpretability; Machine Learning > Core Methods > Classification

Keywords

interpretable classifier; dialect classification; lexical feature extraction; post-hoc interpretability

Related papers

Working Alliance Transformer for Psychotherapy Dialogue Classification (2024)

Named Entity Recognition Under Domain Shift via Metric Learning for Life Sciences (2024)

Assessing Logical Puzzle Solving in Large Language Models: Insights from a Minesweeper Case Study (2024)

TelME: Teacher-leading Multimodal Fusion Network for Emotion Recognition in Conversation (2024)

Extractive Summarization with Text Generator (2024)