Broadband Acoustic Modeling of Speech
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Summary of project results
In this project, an efficient simulation framework for 3D vocal tract acoustics has been developed and implemented as a special version of the articulatory synthesizer VocalTractLab: VocalTractLab3D. It is open source and freely available to the speech science community. It makes it possible to run 3D acoustic simulations of the vocal tract (computation of the acoustic field and of transfer functions) on a standard laptop computer, without any specific knowledge of programming or physics simulation. The simulations performed with this software confirmed and further documented the acoustic impact of the vocal tract curvature and of fine cross-sectional area variations, e.g. the shift of resonance frequencies or additional resonances at high frequencies (above 4-5 kHz). In combination with the experimental part of the project, the impact of the vocal tract on the speech radiation pattern was also highlighted. In particular, the effect of higher-order modes, which cause substantial changes of the radiation pattern within small frequency intervals, was confirmed and further documented.

VocalTractLab3D also provides transfer functions with which stimuli can be synthesized in order to investigate the perceptual impact of the acoustic model (1D vs. 3D). This tool can be used in the future by the speech science community to investigate various questions requiring an accurate modeling of vocal tract acoustics, such as questions of speaker identity or the perceptual consequences of various articulatory parameters.

The experimental investigation of the speech directivity mechanisms allowed us to better understand the role of various anatomical elements (torso, head, lips, vocal tract) in the radiation of speech sounds. As an example, the torso diffraction pattern was further documented, and it was found that the lips enhance this pattern. These results are valuable for building efficient speech radiation models, which can be used for the auralisation of speakers in virtual reality or with loudspeaker arrays.

The perceptual studies showed that phonemes synthesized with 1D and 3D acoustic models can be perceptually discriminated, and that a 3D acoustic model generates more natural sounding stimuli, at least for specific phonemes. Thus, increasing the accuracy of the acoustic model could potentially increase the naturalness of articulatory synthesis. Furthermore, the acoustic field visualisation tool of VocalTractLab3D facilitates the understanding of the cause of this difference by identifying transverse resonances, which potentially have a strong perceptual impact. This is promising for the development of synthesis tools with an even better trade-off between accuracy and efficiency, obtained by targeting the perceptually relevant phenomena.
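As an illustration of how transfer functions of the kind provided by VocalTractLab3D can be used to synthesize stimuli, the following Python sketch filters a simple glottal source through a vocal tract transfer function. It assumes a transfer function exported as a plain text file with columns for frequency, real part and imaginary part; the file name, the column layout and the impulse-train source are illustrative assumptions and not features of the software.

```python
# Minimal sketch: synthesize a vowel-like stimulus from a vocal tract
# transfer function. The input file format (columns: f [Hz], Re(H), Im(H))
# and the impulse-train glottal source are illustrative assumptions.
import numpy as np

fs = 44100            # sampling rate in Hz
n_fft = 4096          # FFT length used to build the vocal tract filter

# Load the transfer function samples (hypothetical export format).
data = np.loadtxt("transfer_function.txt")
freqs = data[:, 0]
H_samples = data[:, 1] + 1j * data[:, 2]

# Interpolate magnitude and unwrapped phase onto the FFT frequency grid
# (assumes a dense, smooth sampling of the transfer function up to fs/2).
grid = np.linspace(0.0, fs / 2, n_fft // 2 + 1)
mag = np.interp(grid, freqs, np.abs(H_samples))
phase = np.interp(grid, freqs, np.unwrap(np.angle(H_samples)))
H = mag * np.exp(1j * phase)

# Impulse response of the vocal tract filter.
h = np.fft.irfft(H, n=n_fft)

# Crude glottal source: impulse train with a 120 Hz fundamental frequency.
duration = 0.5                                   # seconds
source = np.zeros(int(duration * fs))
source[:: int(fs / 120)] = 1.0

# Vowel-like stimulus: the source filtered by the vocal tract response.
stimulus = np.convolve(source, h)[: len(source)]
stimulus /= np.max(np.abs(stimulus))             # normalise for playback
```

For a 1D vs. 3D comparison as described above, the same source signal would simply be filtered through the two corresponding transfer functions before listening tests.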
Project-related publications (selection)
- Brandner, Manuel; Blandin, Rémi; Frank, Matthias & Sontacchi, Alois: A pilot study on the influence of mouth configuration and torso on singing voice directivity. The Journal of the Acoustical Society of America, 148(3), 1169-1180.
- Birkholz, Peter; Kürbis, Steffen; Stone, Simon; Häsner, Patrick; Blandin, Rémi & Fleischer, Mario: Printable 3D vocal tract shapes from MRI data and their acoustic and aerodynamic properties. Scientific Data, 7(1).
- Blandin, Rémi; Arnela, Marc; Félix, Simon; Doc, Jean-Baptiste & Birkholz, Peter: Comparison of the Finite Element Method, the Multimodal Method and the Transmission-Line Model for the Computation of Vocal Tract Transfer Functions. Interspeech 2021, 3330-3334. ISCA.
- Blandin, Rémi; Arnela, Marc; Félix, Simon; Doc, Jean-Baptiste & Birkholz, Peter: Efficient 3D Acoustic Simulation of the Vocal Tract by Combining the Multimodal Method and Finite Elements. IEEE Access, 10, 69922-69938.
