INTERSPEECH 2016

Using Clinician Annotations to Improve Automatic Speech Recognition of Stuttered Speech

Abstract

In treating people who stutter, clinicians often have their clients read a story aloud in order to determine their stuttering frequency. As the client speaks, the clinician annotates each disfluency. Because these annotations are made in real time, they are not always correct, and they usually lag behind the point where the disfluency actually occurred. For further analysis of the client’s speech, it is also useful to have a word transcription of what was said. We have built a tool that rescores a word lattice, taking the clinician’s annotations into account. In this paper, we describe how we incorporate the clinician’s annotations and the improvement this yields over a baseline version. This approach of leveraging clinician annotations can be used for other clinical tasks where a word transcription is useful for further or richer analysis.
