Surprisal Predicts Code-Switching in Chinese-English Bilingual Text

Jesús Calvillo; Le Fang; Jeremy Cole; David Reitter

EMNLP 2020

Abstract

Why do bilinguals switch languages within a sentence? The present observational study asks whether word surprisal and word entropy predict code-switching in bilingual written conversation. We describe and model a new dataset of Chinese-English text with 1476 clean code-switched sentences, translated back into Chinese. The model includes known control variables together with word surprisal and word entropy. We found that word surprisal, but not entropy, is a significant predictor that explains code-switching above and beyond other well-known predictors. We also found that sentence length, which has been related to sentence complexity, is a significant predictor. We propose high cognitive effort as a reason for code-switching, as it leaves fewer resources for inhibition of the alternative language. We also corroborate previous findings, this time using a computational model of surprisal, a new language pair, and written language.
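The two predictors named in the abstract can be illustrated with their standard information-theoretic definitions: the surprisal of a word is the negative log probability of that word given its context, and the entropy is the expected surprisal over all candidate next words. The sketch below is only an illustration of those definitions. The next-word distribution, the example words, and the choice of base-2 logarithms are assumptions made here; the abstract does not say which language model or log base the authors used.

```python
import math
from typing import Mapping


def surprisal(prob: float) -> float:
    """Surprisal in bits of an outcome with probability `prob`: -log2(prob)."""
    return -math.log2(prob)


def entropy(dist: Mapping[str, float]) -> float:
    """Shannon entropy in bits of a next-word distribution (expected surprisal)."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


# Hypothetical next-word distribution for some context (not from the paper).
next_word = {"coffee": 0.5, "tea": 0.25, "meeting": 0.125, "deadline": 0.125}

# Surprisal depends on the word that actually occurred.
likely = surprisal(next_word["coffee"])
unlikely = surprisal(next_word["deadline"])

# Entropy depends only on the distribution, before the word is seen.
uncertainty = entropy(next_word)

print(f"surprisal(coffee)   = {likely:.2f} bits")
print(f"surprisal(deadline) = {unlikely:.2f} bits")
print(f"entropy             = {uncertainty:.2f} bits")
```

The distinction matters for reading the result: surprisal measures how unexpected the observed word was, while entropy measures how uncertain the continuation was before any word appeared, so one can be a significant predictor without the other.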

Topics

Machine Learning > Core Methods > Classification
Natural Language Processing > Resources & Methods > Multilingual NLP
Mathematics & Optimization > Mathematics > Information Theory
Interdisciplinary > Linguistics
Interdisciplinary > Linguistics > Computational Linguistics
Machine Learning > Optimization & Theory > Information Theory
Natural Language Processing > Applications > Text Generation
Machine Learning > Learning Types > Information Theory

Keywords

bilingual text; word entropy; word surprisal; cognitive effort
