INTERSPEECH 2016
Enhancing Data-Driven Phone Confusions Using Restricted Recognition
Abstract
This paper presents a novel approach to address data sparseness in standard confusion matrices and demonstrates how enhanced matrices, which capture additional similarities, can impact the performance of spoken term detection. Using the same training data as for the standard phone confusion matrix, an enhanced confusion matrix is created by iteratively restricting the recognition process to exclude one acoustic model per iteration. Since this results in a greater amount of confusion data for each phone, the enhanced confusion matrix encodes more similarities. The enhanced phone confusion matrices perform demonstrably better than standard confusion matrices on a spoken term detection task which uses both HMMs and DNNs.
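The restricted-recognition idea described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation. It assumes each reference phone segment already has one acoustic score per phone model (the hypothetical array `scores`), replaces the full recognizer with a simple best-score decision per segment, and assumes that the counts from every restricted pass are added to the standard counts. The function names `standard_confusions` and `enhanced_confusions` are illustrative only.

```python
import numpy as np


def standard_confusions(scores, labels):
    """Count (reference, hypothesis) pairs using the best-scoring phone model.

    scores: (n_segments, n_phones) array, higher means a better match.
    labels: (n_segments,) integer array of reference phone indices.
    """
    n_phones = scores.shape[1]
    counts = np.zeros((n_phones, n_phones), dtype=int)
    hyps = scores.argmax(axis=1)
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(counts, (labels, hyps), 1)
    return counts


def enhanced_confusions(scores, labels):
    """Add counts from one restricted pass per excluded phone model.

    Assumption: counts from all restricted passes are summed together
    with the unrestricted (standard) counts.
    """
    n_phones = scores.shape[1]
    counts = standard_confusions(scores, labels)
    for excluded in range(n_phones):
        restricted = scores.copy()
        # Exclude one acoustic model so it can never be hypothesized.
        restricted[:, excluded] = -np.inf
        hyps = restricted.argmax(axis=1)
        np.add.at(counts, (labels, hyps), 1)
    return counts


# Toy data: three segments, three phone models, every segment recognized
# correctly when no model is excluded.
labels = np.array([0, 1, 2])
scores = np.array([
    [0.90, 0.08, 0.02],
    [0.10, 0.70, 0.20],
    [0.30, 0.10, 0.60],
])

standard = standard_confusions(scores, labels)
enhanced = enhanced_confusions(scores, labels)
```

On this toy data the standard matrix has no off-diagonal entries, because every segment is recognized correctly. The enhanced matrix records the second-best model for each segment, which is the additional similarity information the abstract refers to.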