Automated Classification of Children’s Linguistic versus Non-Linguistic Vocalisations

Zixing Zhang; Alejandrina Cristia; Anne Warlaumont; Björn Schuller

INTERSPEECH 2018

Abstract

A key outstanding task for speech technology is dealing with non-standard speakers, notably young children. Distinguishing children's linguistic from non-linguistic vocalisations is crucial for a number of applied and fundamental research goals, yet few systems are available for such classification. This paper investigates three approaches: (i) two large-scale frame-level acoustic feature sets (eGeMAPS and ComParE16) followed by a dynamic model (GRU-RNN); (ii) two kinds of static feature sets derived on the segment level (functional-based and Bag of Audio Words) combined with a static model (SVM); and (iii) representations learnt automatically from the raw voice signals by an end-to-end system. All are compared against a simple phonetically inspired baseline. The systems are applied to a large database of children's vocalisations (total N = 6,298) drawn from daylong recordings gathered in Namibia, Bolivia and Vanuatu. All of the systems outperform the baseline, with the highest performance on the test set obtained by the GRU-RNN using ComParE16 features. We identify promising paths for further research, including the application of a finer-grained classification of children's vocalisations to these data and the exploration of other feature systems.
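The two segment-level representations named in the abstract both turn a variable-length sequence of frame-level features into one fixed-length vector that a static classifier such as an SVM can consume. The following is a minimal sketch of that idea only, not the authors' implementation. The choice of functionals (plain mean and standard deviation), the toy two-dimensional frames, and the fixed two-word codebook are all assumptions made for illustration; the actual eGeMAPS and ComParE16 functionals and the codebook learning procedure are not reproduced here.

```python
from math import fsum, sqrt


def functionals(frames):
    """Summarise a segment by per-dimension mean followed by per-dimension
    (population) standard deviation. Illustrative choice of functionals."""
    n = len(frames)
    dims = list(zip(*frames))  # transpose: one tuple per feature dimension
    means = [fsum(d) / n for d in dims]
    stds = [sqrt(fsum((x - m) ** 2 for x in d) / n) for d, m in zip(dims, means)]
    return means + stds


def bag_of_audio_words(frames, codebook):
    """Assign each frame to its nearest codeword (squared Euclidean distance)
    and return the histogram of assignments, normalised to sum to 1."""
    counts = [0] * len(codebook)
    for frame in frames:
        dists = [fsum((a - b) ** 2 for a, b in zip(frame, word)) for word in codebook]
        counts[dists.index(min(dists))] += 1
    return [c / len(frames) for c in counts]


# Toy segment: 4 frames, each with 2 hypothetical low-level descriptors.
segment = [[0.0, 1.0], [0.2, 0.8], [1.0, 0.0], [0.8, 0.2]]
# Hypothetical fixed codebook; in practice it would be learnt from training frames.
codebook = [[0.0, 1.0], [1.0, 0.0]]

print(functionals(segment))
print(bag_of_audio_words(segment, codebook))
```

Either output vector has a length that depends only on the feature dimensionality or the codebook size, not on the number of frames, which is what allows segments of different durations to be fed to the same static model.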

Topics

Machine Learning > Core Methods > Classification

Keywords

speech classification, acoustic feature extraction, support vector machine, recurrent neural network, gated recurrent unit, end-to-end system, vocalisation classification
