
Acoustic-aware deep learning for speech processing with distributed microphone arrays

Subject Area Communication Technology and Networks, High-Frequency Technology and Photonic Systems, Signal Processing and Machine Learning for Information Technology; Acoustics
Term since 2025
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 568930428
 
Over the past decade, deep learning has profoundly transformed audio signal processing, solving tasks such as enhancing speech recorded in environments with mild noise and reverberation. However, applying deep-learning-based techniques in harsher and less predictable real-world conditions remains a major challenge, as large amounts of data closely matching the targeted use cases are typically required. Creating such datasets involves balancing competing requirements: achieving sufficient quantity, diversity, and realism within limited computational resources and recording time. While approaches such as acoustic simulation, data augmentation, and transfer learning have been explored, a clear understanding of how to balance these requirements optimally to improve generalizability to real-world conditions is still lacking. This challenge is particularly evident in indoor scenarios involving spatially distributed microphones at unknown and potentially time-varying positions. The main objective of this project is to develop and evaluate new methods to address these scenarios by integrating prior geometrical and acoustical knowledge, physical modeling, and deep learning, with a focus on real-world applicability under challenging acoustical conditions. Particular attention will be given to the acoustical and geometrical properties of sound sources, microphones, and reflecting surfaces within a room, an area that has been largely overlooked in the literature. The project's advances will not only benefit applications such as hearing aids, conferencing systems, and smart speakers, but will also contribute significantly to the scientific understanding of how to balance methods based on physical models with purely data-driven approaches in the field of audio processing. The project's success will be driven by the strong complementarity of the three partners, who bring expertise in deep learning, acoustic modeling, distributed array processing, and audio data acquisition.
DFG Programme Research Grants
International Connection France
 
 
