INTERSPEECH 2023
NeMo Forced Aligner and its application to word alignment for subtitle generation
Abstract
We present NeMo Forced Aligner (NFA), an efficient and accurate forced aligner that is part of the NeMo open-source conversational AI toolkit. NFA can produce token-, word-, and segment-level alignments, and can generate subtitle files that highlight words or tokens as they are spoken. We present a demo of this functionality and demonstrate that NFA has the best word alignment accuracy and alignment generation speed among the aligners compared.
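To illustrate how the three alignment levels mentioned in the abstract can relate to one another, here is a minimal Python sketch. It is not NFA's code or output format: the `Span` class, the `merge_spans` and `tokens_to_words` helpers, the use of a space token as the word boundary, and the timestamps are all hypothetical and chosen only for illustration. The assumption is that a word span runs from the start of its first token to the end of its last token, and a segment span covers its words in the same way.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Span:
    """A piece of text with start and end times in seconds (hypothetical type)."""
    text: str
    start: float
    end: float


def merge_spans(spans: List[Span], sep: str = "") -> Span:
    """Merge consecutive spans into one span covering all of them."""
    text = sep.join(s.text for s in spans)
    return Span(text, spans[0].start, spans[-1].end)


def tokens_to_words(tokens: List[Span], boundary: str = " ") -> List[Span]:
    """Group token spans into word spans, splitting on boundary tokens."""
    words: List[Span] = []
    current: List[Span] = []
    for tok in tokens:
        if tok.text == boundary:
            # A boundary token closes the word collected so far.
            if current:
                words.append(merge_spans(current))
                current = []
        else:
            current.append(tok)
    if current:
        words.append(merge_spans(current))
    return words


# Made-up character-level alignment for the text "hi yo".
tokens = [
    Span("h", 0.00, 0.08),
    Span("i", 0.08, 0.20),
    Span(" ", 0.20, 0.30),
    Span("y", 0.30, 0.41),
    Span("o", 0.41, 0.60),
]

words = tokens_to_words(tokens)          # word-level spans
segment = merge_spans(words, sep=" ")    # one segment-level span
```

A subtitle generator that highlights words as they are spoken would need timings of roughly this shape: one entry per word, each with a start and end time inside its enclosing segment.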