Listen, Decipher and Sign: Toward Unsupervised Speech-to-Sign Language Recognition

Liming Wang; Junrui Ni; Heting Gao; Jialu Li; Kai Chieh Chang; Xulin Fan; Junkai Wu; Mark Hasegawa-Johnson; Chang Yoo

ACL 2023

Abstract

Existing supervised sign language recognition systems rely on an abundance of well-annotated data. In contrast, an unsupervised speech-to-sign language recognition (SSR-U) system learns to translate between spoken and sign languages by observing only non-parallel speech and sign-language corpora. We propose speech2sign-U, a neural network-based approach capable of both character-level and word-level SSR-U. Our approach significantly outperforms baselines directly adapted from unsupervised speech recognition (ASR-U) models, by as much as 50% in recall@10, on several challenging American Sign Language corpora that vary in sample size, vocabulary size, and audio and visual variability. The code is available at https://github.com/cactuswiththoughts/UnsupSpeech2Sign.git.
