Spoken Document Retrieval for an Unwritten Language: A Case Study on Gormati

Sanjay Booshanam; Kelly Chen; Ondrej Klejch; Thomas Reitmaier; Dani Kalarikalayil Raju; Electra Wallington; Nina Markl; Jennifer Pearson; Matt Jones; Simon Robinson; Peter Bell

EMNLP 2025

Abstract

Speakers of unwritten languages have the potential to benefit from speech-based automatic information retrieval systems. This paper proposes a speech embedding technique that facilitates such a system and can be used in a zero-shot manner on the target language. After conducting development experiments on several written Indic languages, we evaluate our method on a corpus of Gormati, an unwritten language, that was previously collected in partnership with an agrarian Banjara community in Maharashtra State, India, specifically for the purposes of information retrieval. Our system achieves a Top 5 retrieval rate of 87.9% on this data, giving hope that it may be usable by speakers of unwritten languages worldwide.
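To make the reported metric concrete, the sketch below shows one common way a Top-k retrieval rate can be computed from query and document embeddings. It is an illustration only, not the authors' evaluation code: the abstract does not state the similarity measure, so cosine similarity is an assumption, as is the setup of exactly one relevant document per query. The function name `top_k_retrieval_rate` and its parameters are hypothetical.

```python
import numpy as np


def top_k_retrieval_rate(query_emb, doc_emb, relevant_doc, k=5):
    """Fraction of queries whose relevant document is among the k best matches.

    Assumptions (not taken from the paper): cosine similarity is the ranking
    score, and each query has exactly one relevant document, given by index.
    """
    query_emb = np.asarray(query_emb, dtype=float)
    doc_emb = np.asarray(doc_emb, dtype=float)
    # L2-normalise rows so that the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)
    scores = q @ d.T  # shape: (num_queries, num_docs)
    # Indices of the k highest-scoring documents for each query.
    top_k = np.argsort(-scores, axis=1)[:, :k]
    # A query counts as a hit if its relevant document appears in its top k.
    hits = (top_k == np.asarray(relevant_doc)[:, None]).any(axis=1)
    return float(hits.mean())


# Toy data: three documents and three queries in a 3-dimensional space.
toy_docs = np.eye(3)
toy_queries = np.array([
    [1.0, 0.2, 0.1],  # closest to document 0
    [0.1, 1.0, 0.2],  # closest to document 1
    [1.0, 0.5, 0.1],  # closest to document 0, although document 2 is relevant
])
toy_relevant = [0, 1, 2]
print(top_k_retrieval_rate(toy_queries, toy_docs, toy_relevant, k=1))
```

In this toy case two of the three queries rank their relevant document first, so the Top 1 rate is 2/3. A Top 5 rate is read the same way: the share of queries whose relevant document appears among the five highest-scoring candidates.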

Topics

Machine Learning > Learning Types > Zero-Shot Learning
Natural Language Processing > Applications > Information Retrieval
Speech & Audio > Recognition > Automatic Speech Recognition
Machine Learning > Learning Paradigms > Zero-Shot Learning

Keywords

zero-shot learning, speech recognition, cross-lingual transfer, information retrieval, speech embedding, cross-lingual retrieval, spoken document retrieval
