INTERSPEECH 2017

HomeBank: A Repository for Long-Form Real-World Audio Recordings of Children

Abstract

HomeBank is a new component of the TalkBank system, focused on long-form (i.e., multi-hour, typically daylong) real-world recordings of children’s language experiences. It is linked to a GitHub repository in which tools for analyzing those recordings can be shared. HomeBank constitutes a rich resource not only for researchers interested in early language acquisition specifically, but also for those seeking to study spontaneous speech, media exposure, and audio environments more generally. This Show and Tell describes the procedures for accessing and contributing HomeBank data and code. It also gives an overview of the current contents of the repositories and provides some examples of audio recordings, available transcriptions, and currently available analysis tools.
