Optimal Unit Stitching in a Unit Selection Singing Synthesis System

Marius Cotescu

INTERSPEECH 2016

Abstract

Unit-selection speech synthesis systems are currently the best performing, producing natural-sounding speech with minimal CPU load. One important reason for their success is the large amount of recordings now commonly used in synthesis applications. In singing applications, however, it is hard for a database to cover a large phonetic space because the recording process is relatively inefficient. With a reduced catalogue of units, singing unit-selection systems are therefore more likely to produce spectral discontinuity artefacts. Taking advantage of the quasi-stable nature of articulation during singing, we propose a novel unit stitching method. The method was implemented in the system used for the “Fill-In the Gap” Singing Synthesis Challenge.
