Research Explorer

Bridging Emotions Across Languages: Low Rank Adaptation for Multilingual Speech Emotion Recognition

Lucas Goncalves, Donita Robinson, Elizabeth Richerson et al.

2024 INTERSPEECH

Bridging Language Gaps in Audio-Text Retrieval

Zhiyong Yan, Heinrich Dinkel, Yongqing Wang et al.

2024 INTERSPEECH

BS-PLCNet 2: Two-stage Band-split Packet Loss Concealment Network with Intra-model Knowledge Distillation

Zihan Zhang, Xianjun Xia, Chuanzeng Huang et al.

2024 INTERSPEECH

BTS: Bridging Text and Sound Modalities for Metadata-Aided Respiratory Sound Classification

June-Woo Kim, Miika Toikkanen, Yera Choi et al.

2024 INTERSPEECH

CALL system using pitch-accent feature representations reflecting listeners’ subjective adequacy

Ikuyo Masuda-Katsuse, Ayako Shirose

2024 INTERSPEECH

Can Large Language Models Understand Spatial Audio?

Changli Tang, Wenyi Yu, Guangzhi Sun et al.

2024 INTERSPEECH

Can Modelling Inter-Rater Ambiguity Lead To Noise-Robust Continuous Emotion Predictions?

Ya-Tse Wu, Jingyao Wu, Vidhyasaharan Sethu et al.

2024 INTERSPEECH

Can Synthetic Audio From Generative Foundation Models Assist Audio Recognition and Speech Modeling?

Tiantian Feng, Dimitrios Dimitriadis, Shrikanth S. Narayanan

2024 INTERSPEECH

Can you Remove the Downstream Model for Speaker Recognition with Self-Supervised Speech Features?

Zakaria Aldeneh, Takuya Higuchi, Jee-weon Jung et al.

2024 INTERSPEECH

CaptainA self-study mobile app for practising speaking: task completion assessment and feedback with generative AI

Nhan Phan, Anna von Zansen, Maria Kautonen et al.

2024 INTERSPEECH

Cascaded Transfer Learning Strategy for Cross-Domain Alzheimer's Disease Recognition through Spontaneous Speech

Guanlin Chen, Yun Jin

2024 INTERSPEECH

CDSD: Chinese Dysarthria Speech Database

Yan Wan, Mengyi Sun, Xinchen Kang et al.

2024 INTERSPEECH

CEC: A Noisy Label Detection Method for Speaker Recognition

Yao Shen, Yingying Gao, Yaqian Hao et al.

2024 INTERSPEECH

Centroid Estimation with Transformer-Based Speaker Embedder for Robust Target Speaker Extraction

Woon-Haeng Heo, Joongyu Maeng, Yoseb Kang et al.

2024 INTERSPEECH

Challenge of Singing Voice Synthesis Using Only Text-To-Speech Corpus With FIRNet Source-Filter Neural Vocoder

Takuma Okamoto, Yamato Ohtani, Sota Shimizu et al.

2024 INTERSPEECH

Challenges of German Speech Recognition: A Study on Multi-ethnolectal Speech Among Adolescents

Martha Schubert, Daniel Duran, Ingo Siegert

2024 INTERSPEECH

Challenging margin-based speaker embedding extractors by using the variational information bottleneck

Themos Stafylakis, Anna Silnova, Johan Rohdin et al.

2024 INTERSPEECH

Characterizing code-switching: Applying Linguistic Principles for Metric Assessment and Development

Jie Chi, Electra Wallington, Peter Bell

2024 INTERSPEECH

Children’s Speech Recognition through Discrete Token Enhancement

Vrunda N. Sukhadia, Shammur Absar Chowdhury

2024 INTERSPEECH

ClariTTS: Feature-ratio Normalization and Duration Stabilization for Code-mixed Multi-speaker Speech Synthesis

Changhwan Kim

2024 INTERSPEECH

Classification of Room Impulse Responses and its application for channel verification and diarization

Yuri Khokhlov, Tatiana Prisyach, Anton Mitrofanov et al.

2024 INTERSPEECH

Clever Hans Effect Found in Automatic Detection of Alzheimer's Disease through Speech

Yin-Long Liu, Rui Feng, Jia-Hong Yuan et al.

2024 INTERSPEECH

CNVSRC 2023: The First Chinese Continuous Visual Speech Recognition Challenge

Chen Chen, Zehua Liu, Xiaolou Li et al.

2024 INTERSPEECH

Codec-ASR: Training Performant Automatic Speech Recognition Systems with Discrete Speech Representations

Kunal Dhawan, Nithin Rao Koluguri, Ante Jukić et al.

2024 INTERSPEECH

Codecfake: An Initial Dataset for Detecting LLM-based Deepfake Audio

Yi Lu, Yuankun Xie, Ruibo Fu et al.

2024 INTERSPEECH
