Text-to-Speech
835 directly classified papers
Papers per year
2010: 1
2016: 49
2017: 44
2018: 50
2019: 59
2020: 95
2021: 90
2022: 138
2023: 126
2024: 117
2025: 61
2026: 5
Papers
Learning from Scarcity: Building and Benchmarking Speech Technology for Sukuma.
EACL 2026
Eyaa-Tom 26, Yodi-Mantissa and Lom Bench: A Community Benchmark for TTS in Local Languages
EACL 2026
WenetSpeech-Yue: A Large-Scale Cantonese Speech Corpus with Multi-dimensional Annotation
AAAI 2026
BanglaIPA: Towards Robust Text-to-IPA Transcription with Contextual Rewriting in Bengali
EACL 2026
IndexTTS2: A Breakthrough in Emotionally Expressive and Duration-Controlled Auto-Regressive Zero-Shot Text-to-Speech
AAAI 2026
Finding A Voice: Exploring the Potential of African American Dialect and Voice Generation for Chatbots
ACL 2025
InSerter: Speech Instruction Following with Unsupervised Interleaved Pre-training
ACL 2025
Chain-Talker: Chain Understanding and Rendering for Empathetic Conversational Speech Synthesis
ACL 2025
O_O-VC: Synthetic Data-Driven One-to-One Alignment for Any-to-Any Voice Conversion
EMNLP 2025
UniCoM: A Universal Code-Switching Speech Generator
EMNLP 2025
Rhythm Controllable and Efficient Zero-Shot Voice Conversion via Shortcut Flow Matching
ACL 2025
SimulS2S-LLM: Unlocking Simultaneous Inference of Speech LLMs for Speech-to-Speech Translation
ACL 2025
Scaling Under-Resourced TTS: A Data-Optimized Framework with Advanced Acoustic Modeling for Thai
ACL 2025
LLaMA-Omni 2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis
ACL 2025
Intoner: For Chinese Poetry Intoning Synthesis
IJCAI 2025
Continuous Speech Tokenizer in Text To Speech
NAACL 2025
RT-VC: Real-Time Zero-Shot Voice Conversion with Speech Articulatory Coding
ACL 2025
BnTTS: Few-Shot Speaker Adaptation in Low-Resource Setting
NAACL 2025
Text-to-speech system for low-resource languages: A case study in Shipibo-Konibo (a Panoan language from Peru)
NAACL 2025
Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance
EMNLP 2025
Multimodal Fine-grained Context Interaction Graph Modeling for Conversational Speech Synthesis
EMNLP 2025
BRSpeech-DF: A Deep Fake Synthetic Speech Dataset for Portuguese Zero-Shot TTS
EMNLP 2025
FaceSpeak: Expressive and High-Quality Speech Synthesis from Human Portraits of Different Styles
AAAI 2025
Advancing Zero-shot Text-to-Speech Intelligibility across Diverse Domains via Preference Alignment
ACL 2025
Synthetic Singers: A Review of Deep-Learning-based Singing Voice Synthesis Approaches
IJCNLP 2025