Project Details

Wideband acoustic modeling of speech

Subject Area General and Comparative Linguistics, Experimental Linguistics, Typology, Non-European Languages
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term from 2019 to 2023
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 418848246
 
The acoustic properties of speech and singing have mainly been studied at low frequencies of up to 5 kHz, while higher frequencies have been considered essentially irrelevant. More recently, however, evidence has emerged that the high-frequency components of speech play a more significant role than previously believed and carry, for example, paralinguistic information. Current knowledge about the high-frequency components of speech, their direction-dependent radiation, and their relation to articulation is rather limited, partly because these components are difficult to investigate. A potentially suitable approach to studying these issues is wideband 3D acoustic simulation of the vocal tract. However, current 3D simulation methods are either computationally very expensive (e.g., finite element methods) or require strongly simplified vocal tract geometries (the multimodal method). Here, we propose the development of a hybrid method for the 3D acoustic simulation of the vocal tract that covers the whole audible frequency range and is both fast and physically accurate. The basic idea is to combine the analytical multimodal approach with a numerical 2D finite element approach that computes the modal basis functions. This hybrid acoustic model is implemented and optimized in the framework of the articulatory speech synthesizer VocalTractLab, where it is combined with an existing 3D articulatory model of the vocal tract for the physically accurate wideband synthesis of connected speech utterances. The proposed method is evaluated objectively by means of wideband acoustic measurements of 3D-printed replicas of the vocal tract, and subjectively by a perception test with human listeners.
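As a rough illustration of why plane-wave models stop being adequate a few kilohertz up, the sketch below computes the cut-on frequencies of the transverse modes of an idealized rigid-walled rectangular duct, using the textbook formula f_mn = (c/2)·sqrt((m/a)² + (n/b)²). The rectangular shape, the 4 cm × 2 cm dimensions, the sound speed of 350 m/s, and the function name `cut_on_frequencies` are all assumptions chosen for illustration; none of them comes from the project. For a rectangle the modes are known in closed form, whereas for the irregular cross-sections of a real vocal tract they would have to be computed numerically, which is the role the abstract assigns to the 2D finite element step.

```python
import math


def cut_on_frequencies(width_m, height_m, c=350.0, max_order=2):
    """Return cut-on frequencies (Hz) of transverse modes (m, n) in a
    rigid-walled rectangular duct, sorted from lowest to highest.

    Illustrative idealization only: c (speed of sound, m/s) and the
    duct dimensions are assumed values, not project data.
    """
    freqs = {}
    for m in range(max_order + 1):
        for n in range(max_order + 1):
            if m == 0 and n == 0:
                continue  # the plane-wave mode propagates at all frequencies
            freqs[(m, n)] = 0.5 * c * math.sqrt(
                (m / width_m) ** 2 + (n / height_m) ** 2
            )
    return dict(sorted(freqs.items(), key=lambda item: item[1]))


# Hypothetical cross-section of 4 cm x 2 cm.
modes = cut_on_frequencies(0.04, 0.02)
for (m, n), f in modes.items():
    print(f"mode ({m},{n}): {f:8.1f} Hz")
```

With these assumed dimensions, the first higher-order mode starts to propagate at 4375 Hz, which is in the same range as the 5 kHz limit mentioned above. Below that frequency a one-dimensional plane-wave description is sufficient for this idealized duct; above it, the transverse modes have to be included.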
DFG Programme Research Grants
International Connection Belgium, France
 
 

