INTERSPEECH 2017

The LENA System Applied to Swedish: Reliability of the Adult Word Count Estimate

Abstract

The Language Environment Analysis (LENA) system is used to capture day-long recordings of children’s natural audio environment. The system automatically segments the recordings and provides estimates for various measures. One of these measures is Adult Word Count (AWC), an approximation of the number of words spoken by adults in close proximity to the child. The LENA system was developed for and trained on American English, but its performance has also been evaluated for Spanish, Mandarin and French. The present study is the first evaluation of the LENA system applied to Swedish, and focuses on the AWC estimate. Twelve five-minute segments were selected at random from each of four day-long recordings of 30-month-old children. Each of these 48 segments was transcribed by two transcribers, and both the number of words and the number of vowels were counted (inter-transcriber reliability for words: r = .95, vowels: r = .93). Both counts correlated with the LENA system’s AWC estimate for the same segments (words: r = .67, vowels: r = .66). The reliability of the AWC as estimated by the LENA system when applied to Swedish is therefore comparable to its reliability for Spanish, Mandarin and French.
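
The segment selection and the correlation analysis described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function names, the 960-minute recording length, the choice of non-overlapping five-minute slots on a fixed grid, and the toy counts are all assumptions made for the example. The abstract reports r values without naming the coefficient, so Pearson's r is assumed here.

```python
import math
import random


def sample_segment_starts(recording_minutes, n_segments=12,
                          segment_minutes=5, seed=None):
    """Pick start times (in minutes) of randomly chosen segments.

    Assumption: the recording is cut into non-overlapping slots of
    `segment_minutes` and slots are drawn without replacement.
    """
    n_slots = recording_minutes // segment_minutes
    if n_segments > n_slots:
        raise ValueError("recording too short for the requested segments")
    rng = random.Random(seed)
    slots = rng.sample(range(n_slots), n_segments)
    return sorted(slot * segment_minutes for slot in slots)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical recording length (16 hours); not taken from the study.
    starts = sample_segment_starts(960, seed=1)
    print("segment start minutes:", starts)

    # Invented toy counts for illustration only; NOT data from the study.
    transcribed_words = [120, 45, 300, 10, 210, 95]
    lena_awc = [100, 60, 250, 30, 260, 70]
    print("r =", round(pearson_r(transcribed_words, lena_awc), 2))
```

In a setup like this, `pearson_r` would be called twice per comparison, once with the transcribed word counts and once with the vowel counts, each paired with the AWC estimates for the same 48 segments.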
