2022
INTERSPEECH
FFM: A Frame Filtering Mechanism To Accelerate Inference Speed For Conformer In Speech Recognition
Abstract
This paper proposes a frame filtering mechanism (FFM) to accelerate inference for Conformer-based speech recognition. The FFM consists of three parts: an invalid-frame indicator that decides whether each frame is invalid, a filtering strategy that removes the invalid frames, and an extractor attention block that recalls useful information from the filtered frames. The feature sequence is shorter after the FFM block, so the layers that follow process fewer frames and inference is accelerated. Compared with other downsampling approaches on LibriSpeech, our method achieves the best WER with the lowest RTF. Experiments on Aishell-1 show that our approach reduces the sequence length by up to 73% and achieves a 21.1% to 34.5% relative RTF reduction, with a relative WER increase of no more than 5.8%.
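The three parts described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the indicator is computed or what form the extractor attention takes. The per-frame `invalid_scores`, the `threshold` value, the function names, and the scaled dot-product attention with a residual sum are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def filter_frames(frames, invalid_scores, threshold=0.5):
    """Split frames into kept and filtered lists.

    invalid_scores stands in for the output of the invalid-frame
    indicator (assumed here to be a per-frame probability of being invalid).
    """
    kept = [f for f, p in zip(frames, invalid_scores) if p < threshold]
    filtered = [f for f, p in zip(frames, invalid_scores) if p >= threshold]
    return kept, filtered


def extractor_attention(kept, filtered):
    """Let each kept frame attend over the filtered frames.

    Assumed form: scaled dot-product attention whose context vector is
    added back to the kept frame, so information from removed frames
    is not lost entirely.
    """
    if not filtered:
        return [list(f) for f in kept]
    out = []
    for q in kept:
        scale = math.sqrt(len(q))
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in filtered]
        weights = softmax(scores)
        context = [
            sum(w * k[d] for w, k in zip(weights, filtered))
            for d in range(len(q))
        ]
        out.append([qi + ci for qi, ci in zip(q, context)])
    return out


# Toy input: four 2-dimensional frames, two of them flagged as invalid.
frames = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
invalid_scores = [0.1, 0.9, 0.2, 0.8]

kept, filtered = filter_frames(frames, invalid_scores)
shortened = extractor_attention(kept, filtered)
print(len(frames), "->", len(shortened))
```

In this toy input half of the frames are removed, so any later layer would process a sequence half as long. The reduction reported in the abstract depends on the trained indicator and the data, not on a fixed threshold like the one assumed here.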