Text-to-Speech
835 directly classified papers
Papers per year
2010: 1
2016: 49
2017: 44
2018: 50
2019: 59
2020: 95
2021: 90
2022: 138
2023: 126
2024: 117
2025: 61
2026: 5
Papers
Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models
ICML 2023
VC-T: Streaming Voice Conversion Based on Neural Transducer
INTERSPEECH 2023
DuTa-VC: A Duration-aware Typical-to-atypical Voice Conversion Approach with Diffusion Probabilistic Model
INTERSPEECH 2023
DC CoMix TTS: An End-to-End Expressive TTS with Discrete Code Collaborated with Mixer
INTERSPEECH 2023
Intonation Control for Neural Text-to-Speech Synthesis with Polynomial Models of F0
INTERSPEECH 2023
ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Speed
INTERSPEECH 2023
FACTSpeech: Speaking a Foreign Language Pronunciation Using Only Your Native Characters
INTERSPEECH 2023
Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus
INTERSPEECH 2023
RAD-MMM: Multilingual Multiaccented Multispeaker Text To Speech
INTERSPEECH 2023
STT4SG-350: A Speech Corpus for All Swiss German Dialect Regions
ACL 2023
Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech
INTERSPEECH 2023
Rethinking Transfer and Auxiliary Learning for Improving Audio Captioning Transformer
INTERSPEECH 2023
VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design
INTERSPEECH 2023
ON-TRAC Consortium Systems for the IWSLT 2023 Dialectal and Low-resource Speech Translation Tasks
ACL 2023
Adapter-Based Extension of Multi-Speaker Text-To-Speech Model for New Speakers
INTERSPEECH 2023
STEN-TTS: Improving Zero-shot Cross-Lingual Transfer for Multi-Lingual TTS with Style-Enhanced Normalization Diffusion Framework
INTERSPEECH 2023
SASPEECH: A Hebrew Single Speaker Dataset for Text To Speech and Voice Conversion
INTERSPEECH 2023
UniSyn: An End-to-End Unified Model for Text-to-Speech and Singing Voice Synthesis
AAAI 2023
A Vector Quantized Approach for Text to Speech Synthesis on Real-World Spontaneous Speech
AAAI 2023
Learning Emotional Representations from Imbalanced Speech Data for Speech Emotion Recognition and Emotional Text-to-Speech
INTERSPEECH 2023
Resource-Efficient Fine-Tuning Strategies for Automatic MOS Prediction in Text-to-Speech for Low-Resource Languages
INTERSPEECH 2023
RWEN-TTS: Relation-Aware Word Encoding Network for Natural Text-to-Speech Synthesis
AAAI 2023
Few-shot Dysarthric Speech Recognition with Text-to-Speech Data Augmentation
INTERSPEECH 2023
CALLS: Japanese Empathetic Dialogue Speech Corpus of Complaint Handling and Attentive Listening in Customer Center
INTERSPEECH 2023
Expressive Machine Dubbing Through Phrase-level Cross-lingual Prosody Transfer
INTERSPEECH 2023