Project Details

Reverberation Modelling for Robust Speech Recognition in Reverberant Environments

Subject Area Electronic Semiconductors, Components and Circuits, Integrated Systems, Sensor Technology, Theoretical Electrical Engineering
Acoustics
Term from 2008 to 2016
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 76981564
 
For many years, automatic speech recognition (ASR) has been successfully deployed in everyday applications. The main restriction so far is that close-talking microphones are needed to achieve acceptable recognition performance in natural human/machine dialogues. There are, however, numerous scenarios, such as "Ambient Assisted Living" and "Smart Homes", where distant-talking microphones installed at fixed positions in the environment would be much more convenient, allowing the user to interact independently of the microphone positions and environmental noise. Since the speaker in such scenarios is usually several meters away from the microphone, the received signal is impaired by additive noise and by reverberation of the desired signal. These effects significantly reduce ASR performance if no countermeasures are taken. While remarkable progress has been made over the past decades in making ASR systems robust to additive noise, reverberation still represents a major challenge. The key idea underlying this research project is, hence, to develop a flexible and theoretically well-founded framework for efficiently adapting state-of-the-art ASR systems to changing reverberation conditions. Such an approach was already investigated during the first part of this research project and is now to be developed further. The concept is based on an explicit reverberation model embedded into an ASR system, which aims at estimating the reverberant part of an observed signal from the preceding signal components. The fundamental structure of the reverberation estimator is inspired by the physical nature of reverberation, approximated by a mathematical convolution, while the estimates of the model parameters are obtained using statistical machine learning methods.
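The convolutive idea described above can be illustrated with a minimal sketch. It is not the project's actual model: it assumes a single energy-like feature value per frame and a short, hypothetical vector of per-frame reverberation weights, and the function names `reverberant_part` and `reverberate` are chosen here purely for illustration. The reverberant part of each frame is predicted only from the preceding frames, and the observed value is the direct part plus that prediction.

```python
def reverberant_part(clean, weights):
    """Estimate the reverberant part of each frame from preceding frames only.

    clean   -- list of per-frame feature values (assumed: one energy per frame)
    weights -- hypothetical reverberation weights; weights[0] scales the
               current frame, weights[k] scales the frame k steps earlier
    """
    tails = []
    for t in range(len(clean)):
        # Sum the contributions of the frames that precede frame t.
        tail = sum(
            weights[k] * clean[t - k]
            for k in range(1, len(weights))
            if t - k >= 0
        )
        tails.append(tail)
    return tails


def reverberate(clean, weights):
    """Frame-wise convolution: direct part plus estimated reverberant part."""
    tails = reverberant_part(clean, weights)
    return [weights[0] * c + tail for c, tail in zip(clean, tails)]


if __name__ == "__main__":
    # A single active frame followed by silence: the energy decays
    # over the following frames according to the weights.
    print(reverberate([1.0, 0.0, 0.0, 0.0], [1.0, 0.5, 0.25]))
    # -> [1.0, 0.5, 0.25, 0.0]
```

In a real system the weights would not be fixed by hand but estimated statistically from observed speech, as the description states; this sketch only shows the structure of the convolution.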
During the first three years of this project, significant progress in terms of recognition rates has been achieved along with the development of the concept. In order to further increase ASR performance for reverberant speech, the second phase of the project focuses on extending the reverberation model to the more powerful speech features and feature-vector probability models that are predominantly employed in state-of-the-art ASR systems. In addition, different statistical estimation techniques are to be investigated that allow robust inference of the reverberation model parameters from only a few speech signal observations. The combination of the proposed method with an established robustification procedure, the training of ASR systems on reverberant data, shall also be studied. As ASR systems are frequently connected to microphone arrays and signal enhancement algorithms in practical applications, the given approach is finally to be analyzed for synergies with concepts of microphone array signal processing.
DFG Programme Research Grants
 
 

