INTERSPEECH 2024

2.5D Vocal Tract Modeling: Bridging Low-Dimensional Efficiency with 3D Accuracy

Abstract

We introduce an extended 2D (2.5D) wave solver that combines the computational efficiency of low-dimensional models with the accuracy of 3D approaches and is tailored to simulating tube geometries similar to vocal tracts. Unlike 1D and 2D models, which are limited to radial symmetry, our lightweight 2.5D finite-difference time-domain solver handles irregular geometries that need only be symmetric about the mid-sagittal plane. We validated the model against state-of-the-art 2D and 3D solvers on three vocal tract geometries, each with a distinct cross-sectional shape. The frequency responses of the 2.5D simulations closely match those of the 3D simulations up to 12 kHz, with a Pearson correlation coefficient greater than 0.8 for all geometries. The proposed model also reproduces the effects of higher-order modes associated with non-cylindrical vocal tracts, thereby overcoming the limitations of the advanced 1D and 2D solvers. Moreover, it achieved a speed-up of close to an order of magnitude over the 2D and 3D models.
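As an illustration of the agreement metric quoted in the abstract, the following minimal sketch computes a Pearson correlation coefficient between two frequency responses restricted to the band up to 12 kHz. It is not the authors' evaluation code: the function name `band_limited_pearson`, the use of dB magnitudes sampled on a shared frequency grid, and the synthetic curves are all assumptions made for the example.

```python
import numpy as np


def band_limited_pearson(freqs_hz, resp_a_db, resp_b_db, f_max_hz=12_000.0):
    """Pearson correlation of two responses over frequencies <= f_max_hz.

    Hypothetical helper: assumes both responses are sampled on the same
    frequency grid and are not constant within the band.
    """
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    a = np.asarray(resp_a_db, dtype=float)
    b = np.asarray(resp_b_db, dtype=float)

    # Keep only the bins inside the comparison band.
    mask = freqs_hz <= f_max_hz
    a, b = a[mask], b[mask]

    # Centre both curves, then normalise their inner product.
    a_c = a - a.mean()
    b_c = b - b.mean()
    return float(np.dot(a_c, b_c) / np.sqrt(np.dot(a_c, a_c) * np.dot(b_c, b_c)))


# Synthetic stand-ins for simulated responses (NOT data from the paper).
freqs = np.linspace(0.0, 16_000.0, 1601)                        # 10 Hz grid
resp_3d = 10.0 * np.cos(2.0 * np.pi * freqs / 1000.0)           # reference curve
resp_25d = resp_3d + 2.0 * np.sin(2.0 * np.pi * freqs / 350.0)  # perturbed copy

r = band_limited_pearson(freqs, resp_25d, resp_3d)
print(f"Pearson r up to 12 kHz: {r:.3f}")
```

Restricting the comparison with a frequency mask before centring keeps bins above 12 kHz from influencing the score. In practice the two solvers may output responses on different grids, in which case one curve would first need to be interpolated onto the other's grid.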
