Project Details
BIG VOICE DATA – a comprehensive documentation and analysis of human voice production with Laryngeal Voice Range Profiles and Deep Learning
Applicant
Professor Dr.-Ing. Jörg Lohscheller
Subject Area
Medical Informatics and Medical Bioinformatics
Term
since 2025
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 544877652
The underlying physical and physiological principles of voice production in human speech and song have been a target of research in recent decades. However, despite some excellent recent investigations, our current understanding of the causal relation between vocal fold vibration dynamics and the quality of the radiated sound remains limited to selected vocalization stereotypes, mostly concerning speech. This is partly due to methodological constraints of data acquisition, as well as challenges in analyzing and aggregating the vast amounts of data produced by laryngeal high-speed video-endoscopy (HSV). Consequently, and surprisingly, there is to date no comprehensive documentation of the causative connection between laryngeal sound production and the resulting acoustic output across the entire human voice range. To address this dearth of data, this study aims to investigate comprehensively and quantitatively, for the first time, the causal relationship between vocal fold vibratory dynamics and the resulting acoustic output across the entire physiologically possible voice range of adult humans. To achieve this goal, we will conduct the most comprehensive quantitative in vivo high-speed video data acquisition and deep learning data analysis attempted to date. Innovative laryngeal voice range profiles (L-VRP), covering the entire achievable voice range, will be generated for a cohort of 40 participants (20 female, 20 male), collecting simultaneous acoustic, accelerometer, electroglottographic (EGG) and endoscopic HSV data. The HSV data collection is powered by an unprecedented data acquisition paradigm that allows the entire voice range to be documented in one working day. The acquired L-VRP data sets will be segmented and analyzed with artificial intelligence (AI)-powered methods, including deep learning, to identify laryngeal factors that interrelate with acoustic voice measures.
The increased knowledge of the causal relation between voice production dynamics and acoustic output, produced with our innovative methodological approach of unprecedented scope, will promote a more profound understanding of the development and treatment of voice disorders, as well as the advancement of pedagogical approaches for both speech and singing. This project is conducted as an international collaboration between Christian T. Herbst, Mozarteum University Salzburg, Austria, and Jörg Lohscheller, University of Applied Sciences, Trier, Germany.
DFG Programme
Research Grants
International Connection
Austria
Cooperation Partner
Christian Herbst, Ph.D.
