INTERSPEECH 2020

VCTUBE: A Library for Automatic Speech Data Annotation

Abstract

We introduce VCTUBE, an open-source Python library that automatically generates <audio, text> pairs of speech data from a given YouTube URL. We believe VCTUBE makes it easy to collect, process, and annotate speech data for developing speech synthesis systems.
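To make the idea of an <audio, text> pair concrete, the following is a minimal sketch of how timed captions could be turned into a segment manifest. It is not VCTUBE's actual API: the `Caption` class, the `captions_to_pairs` function, the `clip_0000.wav` naming scheme, and the manifest fields are all hypothetical, and the sketch assumes the captions have already been fetched as (start, duration, text) entries.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Caption:
    """One timed caption entry (hypothetical structure, not VCTUBE's)."""
    start: float     # seconds from the beginning of the video
    duration: float  # seconds the caption is displayed
    text: str


def captions_to_pairs(captions: List[Caption], stem: str = "clip") -> List[Dict]:
    """Build one manifest entry per usable caption.

    Each entry names the audio segment file that would be cut from the
    full recording and gives its boundaries in milliseconds plus the
    whitespace-normalized transcript.
    """
    pairs: List[Dict] = []
    for cap in captions:
        text = " ".join(cap.text.split())  # collapse newlines and repeated spaces
        if not text or cap.duration <= 0:
            continue  # skip empty or zero-length captions
        pairs.append({
            "audio": f"{stem}_{len(pairs):04d}.wav",
            "start_ms": round(cap.start * 1000),
            "end_ms": round((cap.start + cap.duration) * 1000),
            "text": text,
        })
    return pairs
```

Under these assumptions, the millisecond boundaries in each entry could then be used to slice the downloaded audio into one file per caption, which yields the paired data a speech synthesis pipeline expects.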
