Project Details

Robust noise reduction by novel means of incorporating phase processing

Subject Area: Acoustics
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Electronic Semiconductors, Components and Circuits, Integrated Systems, Sensor Technology, Theoretical Electrical Engineering
Term: from 2014 to 2023
Project identifier: Deutsche Forschungsgemeinschaft (DFG) - Project number 247465126
 
Today, speech communication devices, including smartphones, hearing aids, and acoustic human-machine interfaces, are ubiquitous. However, in many everyday situations, for example in a cafeteria or on a busy street, speech signals are distorted by acoustic noise. To reduce the negative impact of these disturbances on speech communication, speech enhancement algorithms are used. Yet in the acoustically challenging scenarios in which speech enhancement algorithms are needed most, the performance gain is still limited. Therefore, in this project we aim to make speech enhancement algorithms work more reliably to ease speech communication in acoustically difficult environments.
Speech enhancement is usually applied in a spectral transform domain in which the signal coefficients are complex-valued, i.e., they are represented by spectral amplitudes and phases. Still, most research on single-channel speech enhancement has focused on spectral magnitudes, while the spectral phase has largely been ignored. However, recent research, including our work in the first funding period of this project, indicates that the importance of the spectral phase for speech enhancement may have been underestimated: with instrumental measures and in listening experiments, we were able to show that an estimate of the clean speech spectral phase can be used to improve speech enhancement performance, especially in challenging acoustic scenarios.
Now, we aim to build on these results and derive new and improved estimators of the clean speech spectral phase to improve performance further. For this, we will develop and combine individual phase estimators for voiced, unvoiced, and transient sounds. We will also translate recent advances in pre-trained speech enhancement, e.g., based on deep neural networks (DNNs), to phase processing. Currently, most pre-trained approaches rely only on magnitude features and also modify only the spectral magnitudes. Here, we aim to overcome both limitations by investigating new phase features and how they can be employed, e.g., to build novel DNN-based phase-aware speech enhancement systems. Our research will provide new and valuable insights into the role and relevance of phase processing for speech enhancement as well as novel algorithms that will boost the performance of speech communication devices.
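To illustrate the distinction between spectral magnitude and spectral phase drawn in the description above, the following is a minimal sketch and not the project's method. It assumes NumPy, a single Hann-windowed frame in place of a full short-time Fourier transform, a synthetic 200 Hz tone as a stand-in for speech, and an oracle clean magnitude. The function names `polar_decompose` and `recombine` are hypothetical and chosen only for this example.

```python
import numpy as np


def polar_decompose(frame):
    """Return (magnitude, phase) of the spectrum of one Hann-windowed frame."""
    window = np.hanning(len(frame))
    spectrum = np.fft.rfft(window * frame)
    return np.abs(spectrum), np.angle(spectrum)


def recombine(magnitude, phase):
    """Rebuild complex spectral coefficients from magnitude and phase."""
    return magnitude * np.exp(1j * phase)


# Synthetic stand-in for speech: a 200 Hz tone sampled at 16 kHz (assumption).
rng = np.random.default_rng(0)
fs = 16000
t = np.arange(512) / fs
clean = np.sin(2 * np.pi * 200 * t)
noisy = clean + 0.5 * rng.standard_normal(t.size)

clean_mag, clean_phase = polar_decompose(clean)
noisy_mag, noisy_phase = polar_decompose(noisy)
clean_spec = recombine(clean_mag, clean_phase)

# Magnitude-only processing: even a perfect (oracle) magnitude estimate
# is recombined with the unmodified noisy phase.
mag_only = recombine(clean_mag, noisy_phase)

# Phase-aware processing (oracle case): the clean phase is also available.
phase_aware = recombine(clean_mag, clean_phase)

err_mag_only = np.linalg.norm(mag_only - clean_spec)
err_phase_aware = np.linalg.norm(phase_aware - clean_spec)
print(f"spectral error with noisy phase: {err_mag_only:.3f}")
print(f"spectral error with clean phase: {err_phase_aware:.3e}")
```

In this toy setting the residual error of the magnitude-only variant comes entirely from the noisy phase, which is the gap that a clean-phase estimate is meant to reduce. In practice the clean phase is unknown and has to be estimated.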
DFG Programme: Research Grants
 
 
