Unsupervised Phonetic and Word Level Discovery for Speech to Speech Translation for Unwritten Languages

Steven Hillis; Anushree Prasanna Kumar; Alan W. Black

INTERSPEECH 2019

Abstract

We experiment with unsupervised methods for deriving and clustering symbolic representations of speech, working towards speech-to-speech translation for languages without regular (or any) written representations. We consider five low-resource African languages, and we produce three different segmental representations of text data for comparison against four different segmental representations derived solely from acoustic data for each language. The text and speech data for each language come from the CMU Wilderness dataset introduced in [1], where speakers read a version of the New Testament in their language. Our goal is to evaluate the translation performance not only of acoustically derived units but also of discovered sequences or “words” made from these units, with the intuition that such representations will encode more meaning than phones alone. We train statistical machine translation models for each representation and evaluate their outputs on the basis of BLEU-1 scores to determine their efficacy. Our experiments produce encouraging results: as we cluster our atomic phonetic representations into more word-like units, the amount of information retained generally approaches that of the actual words themselves.
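For readers unfamiliar with the metric, the following is a minimal sketch of a sentence-level BLEU-1 score: clipped unigram precision multiplied by a brevity penalty. It is an illustration only. The abstract does not say which BLEU implementation or tokenization the authors used, and standard BLEU is usually computed at corpus level by pooling counts over all sentences. The function name `bleu1` and the example tokens are hypothetical.

```python
import math
from collections import Counter


def bleu1(candidate, reference):
    """Sentence-level BLEU-1 for one candidate and one reference.

    Both arguments are lists of tokens (words, phones, or discovered units).
    This is an illustrative sketch, not the evaluation script from the paper.
    """
    if not candidate:
        return 0.0

    cand_counts = Counter(candidate)
    ref_counts = Counter(reference)

    # Clipped matches: a candidate token is credited at most as many
    # times as it occurs in the reference.
    matches = sum(min(count, ref_counts[tok]) for tok, count in cand_counts.items())
    precision = matches / len(candidate)

    # Brevity penalty: penalize candidates shorter than the reference.
    if len(candidate) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))

    return brevity_penalty * precision


if __name__ == "__main__":
    hyp = "the cat sat".split()
    ref = "the cat sat down".split()
    print(bleu1(hyp, ref))
```

Because BLEU-1 only counts single-token overlap, it rewards getting the right units into the output regardless of their order, which makes it a lenient measure of how much content a given representation preserves.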

Authors

Steven Hillis, Anushree Prasanna Kumar, Alan W. Black

Topics

Artificial Intelligence > Learning Paradigms > Transfer Learning
Machine Learning > Learning Types > Unsupervised Learning

Keywords

unsupervised learning, unsupervised clustering, low-resource language, speech-to-speech translation, word discovery, phonetic discovery
