The cognitive status of simple and complex models

Janet B. Pierrehumbert

2020 INTERSPEECH INTERSPEECH 2020

The cognitive status of simple and complex models

Abstract

Human languages are extraordinarily rich systems. They have extremely large lexical inventories, and the elements in these inventories can be combined to generate a potentially unbounded set of distinct messages. Regularities at many different levels of representation — from the phonetic level through the syntax and semantics — support people's ability to process mappings between the physical reality of speech, and the objects, events, and relationships that speech refers to. However, human languages also simplify reality. The phonological system establishes equivalence classes amongst articulatory-acoustic events that have considerable variation at the parametric level. The semantic system similarly establishes equivalence classes amongst real-world phenomena having considerable variation. The tension between simplicity and complexity is a recurring theme of research on language modelling. In this talk, I will present three case studies in which a pioneering simple model omitted important complexities that were either included in later models, or that remain as challenges to this day. The first is the acoustic theory of speech production, as developed by Gunnar Fant, the inaugural Medal recipient in 1989. By approximating the vocal tract as a half-open tube, it showed that the first three formants of vowels (which are the most important for the perception of vowel quality) can be computed as a linear systems problem. The second is the autosegmental-metrical theory of intonation, to which I contributed early in my career. It made the simplifying assumption that the correct model of phonological representation will support the limited set of observed non-local patterns, while excluding non-local patterns that do not naturally occur. The third case concerns how word-formation patterns are generalised in forming new words, whether though inflectional morphology (as in “one wug; two wugs”) or derivational morphology (as in “nickname, unnicknameable”). Several early mode

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Machine Learning, Mathematics & Optimization, Natural Language Processing, Robotics, Speech & Audio

Authors

Janet B. Pierrehumbert

Topics

Interdisciplinary > Linguistics > Phonetics

Keywords

speech perception language modelling formant frequency acoustic analysis phonological system morphological generalization

Download PDF

Related papers

Memory Controlled Sequential Self Attention for Sound Recognition 2020

Dual Attention in Time and Frequency Domain for Voice Activity Detection 2020

Automatic Prediction of Speech Intelligibility Based on X-Vectors in the Context of Head and Neck Cancer 2020

A Noise Robust Technique for Detecting Vowels in Speech Signals 2020

Joint Detection of Sentence Stress and Phrase Boundary for Prosody 2020