Implementation of a Radiology Speech Recognition System for Estonian Using Open Source Software

Tanel Alumäe; Andrus Paats; Ivo Fridolin; Einar Meister

INTERSPEECH 2017

Abstract

Speech recognition has become increasingly popular in radiology reporting over the last decade. However, developing a speech recognition system for a new language in a highly specific domain requires a lot of resources, expert knowledge and skills. Therefore, commercial vendors do not offer ready-made radiology speech recognition systems for less-resourced languages. This paper describes the implementation of a radiology speech recognition system for Estonian, a language with fewer than one million native speakers. The system was developed in partnership with a hospital that provided a corpus of written reports for language modeling purposes. Rewrite rules for pre-processing training texts and post-processing recognition results were written manually with the Thrax toolkit, based on a small parallel corpus created by the hospital’s radiologists. Deep neural network based acoustic models were trained with the Kaldi toolkit on 216 hours of out-of-domain data and adapted on 14 hours of spoken radiology data. The current word error rate of the system is 5.4%. The system is in active use in a real clinical environment.
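For readers unfamiliar with the metric, the word error rate quoted in the abstract is conventionally the word-level edit distance (substitutions, deletions and insertions) between the recognizer output and a reference transcript, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name and the example sentences are hypothetical and do not come from the paper's data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); tokenization is plain
    whitespace splitting, which is an assumption of this sketch.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substituted word and one dropped word
# against a four-word reference.
example_wer = word_error_rate("no focal lesion seen", "no vocal lesion")
```

Under this definition, a rate of 5.4% corresponds to roughly one word error per 18 to 19 reference words.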

Topics

Speech & Audio > Recognition > Speech Recognition; Speech & Audio > Analysis > Clinical Speech Analysis

Keywords

speech recognition, acoustic modeling, deep neural network, speech technology, radiology reporting
