Research Explorer
Speech Synthesis
164 directly classified papers
Papers per year
2007: 1
2012: 2
2013: 1
2016: 1
2017: 5
2018: 3
2019: 10
2020: 14
2021: 7
2022: 23
2023: 24
2024: 28
2025: 45
Papers
Word-Conditioned 3D American Sign Language Motion Generation
EMNLP 2024
On the Semantic Latent Space of Diffusion-Based Text-To-Speech Models
ACL 2024
Aligning Speech Segments Beyond Pure Semantics
ACL 2024
FT-GAN: Fine-Grained Tune Modeling for Chinese Opera Synthesis
AAAI 2024
StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis
AAAI 2024
StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing
ACL 2024
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling
AAAI 2024
Audio Generation with Multiple Conditional Diffusion Model
AAAI 2024
Knowledge-Preserving Pluggable Modules for Multilingual Speech Translation Tasks
INTERSPEECH 2024
Self-Supervised Singing Voice Pre-Training towards Speech-to-Singing Conversion
ACL 2024
SpeechAlign: Aligning Speech Generation to Human Preferences
NIPS 2024
G2P-DDM: Generating Sign Pose Sequence from Gloss Sequence with Discrete Diffusion Model
AAAI 2024
V2Meow: Meowing to the Visual Beat via Video-to-Music Generation
AAAI 2024
Mimic: Speaking Style Disentanglement for Speech-Driven 3D Facial Animation
AAAI 2024
A Two-Step Approach for Data-Efficient French Pronunciation Learning
EMNLP 2024
Contextual Interactive Evaluation of TTS Models in Dialogue Systems
INTERSPEECH 2024
PSLM: Parallel Generation of Text and Speech with LLMs for Low-Latency Spoken Dialogue Systems
EMNLP 2024
AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation
CVPR 2024
TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control
EMNLP 2024
CTC-based Non-autoregressive Textless Speech-to-Speech Translation
ACL 2024
Speechworthy Instruction-tuned Language Models
EMNLP 2024
IndicVoices-R: Unlocking a Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS
NIPS 2024
VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild
ACL 2024
Learning To Dub Movies via Hierarchical Prosody Models
CVPR 2023
AlignSTS: Speech-to-Singing Conversion via Cross-Modal Alignment
ACL 2023