Papers
A Pilot Study of GSLM-based Simulation of Foreign Accentuation Only Using Native Speech Corpora
Kentaro Onda, Joonyong Park, Nobuaki Minematsu et al.
A powerful and modern AAC composition tool for impaired speakers
Aanchan Mohan, Monideep Chakraborti, Katelyn Eng et al.
Applying Reinforcement Learning and Multi-Generators for Stage Transition in an Emotional Support Dialogue System
Jeremy Chang, Kuan-Yu Chen, Chung-Hsien Wu
AraOffence: Detecting Offensive Speech Across Dialects in Arabic Media
Youssef Nafea, Shady Shehata, Zeerak Talat et al.
Are Articulatory Feature Overlaps Shrouded in Speech Embeddings?
Erfan A. Shams, Iona Gessinger, Patrick Cormac English et al.
Are Paralinguistic Representations all that is needed for Speech Emotion Recognition?
Orchid Chetia Phukan, Gautam Siddharth Kashyap, Arun Balaji Buduru et al.
Are Recent Deep Learning-Based Speech Enhancement Methods Ready to Confront Real-World Noisy Environments?
Candy Olivia Mawalim, Shogo Okada, Masashi Unoki
Are you sure? Analysing Uncertainty Quantification Approaches for Real-world Speech Emotion Recognition
Oliver Schrüfer, Manuel Milling, Felix Burkhardt et al.
AR-NLU: A Framework for Enhancing Natural Language Understanding Model Robustness against ASR Errors
Emmy Phung, Harsh Deshpande, Ahmad Emami et al.
Array Geometry-Robust Attention-Based Neural Beamformer for Moving Speakers
Marvin Tammen, Tsubasa Ochiai, Marc Delcroix et al.
Articulatory Configurations across Genders and Periods in French Radio and TV archives
Benjamin Elie, David Doukhan, Rémi Uro et al.
Articulatory synthesis using representations learnt through phonetic label-aware contrastive loss
Jesuraj Bandekar, Sathvik Udupa, Prasanta Kumar Ghosh
AS-70: A Mandarin stuttered speech dataset for automatic speech recognition and stuttering event detection
Rong Gong, Hongfei Xue, Lezhi Wang et al.
ASA: An Auditory Spatial Attention Dataset with Multiple Speaking Locations
Zijie Lin, Tianyu He, Siqi Cai et al.
As Biased as You Measure: Methodological Pitfalls of Bias Evaluations in Speaker Verification Research
Wiebke Hutiri, Tanvina Patel, Aaron Yi Ding et al.
ASGIR: audio spectrogram transformer guided classification and information retrieval for birds
Yashwardhan Chaudhuri, Paridhi Mundra, Arnesh Batra et al.
A Small and Fast BERT for Chinese Medical Punctuation Restoration
Tongtao Ling, Yutao Lai, Lei Chen et al.
ASoBO: Attentive Beamformer Selection for Distant Speaker Diarization in Meetings
Théo Mariotte, Anthony Larcher, Silvio Montrésor et al.
Assessing the impact of contextual framing on subjective TTS quality
Jens Edlund, Christina Tånnander, Sébastien Le Maguer et al.
ASTRA: Aligning Speech and Text Representations for ASR without Sampling
Neeraj Gaur, Rohan Agrawal, Gary Wang et al.
A Study on the Information Mechanism of the 3rd Tone Sandhi Rule in Mandarin Disyllabic Words
Liu Xiaowang, Jinsong Zhang
Asynchronous Voice Anonymization Using Adversarial Perturbation On Speaker Embedding
Rui Wang, Liping Chen, Kong Aik Lee et al.
A toolkit for joint speaker diarization and identification with application to speaker-attributed ASR
Giovanni Morrone, Enrico Zovato, Fabio Brugnara et al.
A Transcription Prompt-based Efficient Audio Large Language Model for Robust Speech Recognition
Yangze Li, Xiong Wang, Songjun Cao et al.
A Transformer-Based Voice Activity Detector
Biswajit Karan, Joshua Jansen van Vüren, Febe de Wet et al.