Papers
XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model
INTERSPEECH 2024
All Neural Low-latency Directional Speech Extraction
INTERSPEECH 2024
Novel-view Acoustic Synthesis From 3D Reconstructed Rooms
INTERSPEECH 2024
RT-LA-VocE: Real-Time Low-SNR Audio-Visual Speech Enhancement
INTERSPEECH 2024
Hear Your Face: Face-based voice conversion with F0 estimation
INTERSPEECH 2024