Methods for Signal Combination of Distributed Microphones
Electronic Semiconductors, Components and Circuits, Integrated Systems, Sensor Technology, Theoretical Electrical Engineering
Summary of Project Results
Research on speech enhancement using so-called acoustic sensor networks, which consist of spatially distributed microphones, has recently gained significant interest. Compared with a microphone array at a single position, spatially distributed microphones can acquire more information about the sound field. They make it possible to employ beamforming techniques that improve speech quality in reverberant and noisy conditions. Several methods that rely on a reference channel have been introduced. These include the relative transfer function generalized sidelobe canceler (RTF-GSC), the minimum variance distortionless response (MVDR) beamformer, and the speech-distortion-weighted multichannel Wiener filter (MWF). The MWF is a well-established technique for speech enhancement. It produces a minimum mean squared error (MMSE) estimate of an unknown desired signal. The desired signal of the standard MWF (S-MWF) is usually the speech component in one of the microphone signals, referred to as the reference microphone signal. However, for spatially distributed microphones, the selection of the reference microphone may have a large influence on the performance of the MWF, depending on the positions of the speech and noise sources and of the microphones. With the S-MWF, the overall transfer function from the speaker's mouth to the output of the MWF equals the acoustic transfer function (ATF) from the speaker to the reference microphone. Hence, the reference selection determines the speech distortion. Moreover, the overall transfer function has an impact on the broadband output signal-to-noise ratio (SNR) of the beamformer. In the first period of this project, an MWF formulation with partial equalization (P-MWF) was presented, where the overall transfer function was chosen as the envelope of the individual transfer functions with an arbitrary phase reference.
This results in a partial equalization of the acoustic system and an improved broadband output SNR. Furthermore, methods to estimate the required beamformer parameters were investigated, and applications in hands-free speech communication were proposed. In the second project phase, we investigated the influence of the phase reference on the linear distortion and the optimization of the broadband output SNR. The sub-projects regarding multiple-speaker scenarios and implementation issues could not be considered. Instead, we considered MWF designs that are more robust against errors in the parameter estimation. This work was conducted in cooperation with Prof. Dr. Simon Doclo and Toby Christian Lawin-Ore from the Signal Processing Group, University of Oldenburg. A generalized MWF (G-MWF) was proposed, where the speech reference is chosen as a weighted sum of the magnitudes of all ATFs. It was shown that a special case of the G-MWF, the alternative formulation (A-MWF), is less sensitive to estimation errors in the speech correlation matrix than the S-MWF. While the P-MWF approach has advantages with respect to background noise reduction, it does not reduce the reverberation caused by the acoustic environment. We further present a G-MWF approach in which the speech reference is a weighted sum of the complex-valued ATFs with complex-valued weights. We demonstrate that the phase of the speech reference shapes the overall transfer function and hence impacts the speech distortion. Moreover, the overall transfer function influences the broadband output SNR. We propose two speech references that achieve a better signal-to-reverberation ratio (SRR) and an improvement in broadband output SNR. The proposed references are based on the phase of a delay-and-sum beamformer (DSB). We show that both approaches can improve the SRR and SNR compared with the S-MWF and P-MWF.
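The role of the speech reference can be illustrated with a minimal single-frequency-bin sketch. It is not the project's implementation. It assumes a rank-one speech model y = h s + n with a known ATF vector `h`, speech power `phi_s`, and noise correlation matrix `Rn`; all function and variable names are hypothetical. The envelope reference uses sqrt(sum |h_i|^2) as one plausible reading of "envelope", and `reference_sum_phase` takes its phase from the plain sum of the ATFs as a crude stand-in for a DSB phase (no delay compensation is modeled).

```python
import numpy as np

def mwf_weights(h, phi_s, Rn, q):
    """MMSE filter for one frequency bin, desired signal d = q * s (assumed model)."""
    Ry = phi_s * np.outer(h, h.conj()) + Rn      # correlation matrix of y = h*s + n
    # w = Ry^{-1} E[y d^*] = Ry^{-1} (phi_s * h * conj(q))
    return np.linalg.solve(Ry, phi_s * h * np.conj(q))

def overall_transfer(w, h):
    """Transfer function from the source to the filter output: w^H h."""
    return np.vdot(w, h)                          # vdot conjugates its first argument

def reference_standard(h, ref=0):
    """S-MWF style reference: the ATF to one reference microphone."""
    return h[ref]

def reference_envelope(h, phase_ref=0):
    """P-MWF style reference (assumed form): envelope magnitude, phase of one ATF."""
    return np.linalg.norm(h) * np.exp(1j * np.angle(h[phase_ref]))

def reference_sum_phase(h):
    """Envelope magnitude with the phase of the summed ATFs (stand-in for a DSB phase)."""
    return np.linalg.norm(h) * np.exp(1j * np.angle(h.sum()))

# Hypothetical example data: 4 microphones, spatially white noise.
rng = np.random.default_rng(0)
h = rng.standard_normal(4) + 1j * rng.standard_normal(4)
Rn = 0.1 * np.eye(4)
phi_s = 1.0

for name, q in [("standard", reference_standard(h)),
                ("envelope", reference_envelope(h)),
                ("sum-phase", reference_sum_phase(h))]:
    w = mwf_weights(h, phi_s, Rn, q)
    gain = overall_transfer(w, h) / q             # real-valued Wiener gain in this model
    print(name, abs(q), gain)
```

Under these assumptions the overall transfer function w^H h equals the chosen reference q multiplied by a real-valued gain between 0 and 1, so both the magnitude and the phase of the reference carry over to the output. This is consistent with the statement above that the reference selection determines the speech distortion.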
Project-Related Publications (Selection)
- A multichannel Wiener filter with partial equalization for distributed microphones. In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, Mohonk Mountain House, New Paltz, NY, 2013
S. Stenzel, T.C. Lawin-Ore, J. Freudenberger, and S. Doclo
- Alternative formulation and robustness analysis of the multichannel Wiener filter for spatially distributed microphones. In Proceedings of the International Workshop on Acoustic Signal Enhancement (IWAENC), Antibes, France, pages 208–212, Sep 2014
T. C. Lawin-Ore, S. Stenzel, J. Freudenberger, and S. Doclo
(See online at https://doi.org/10.1109/IWAENC.2014.6954008)
- Generalized multichannel Wiener filter for spatially distributed microphones. In Proc. ITG Conference on Speech Communication, Erlangen, Germany, pages 1–4, Sep 2014
T. C. Lawin-Ore, S. Stenzel, J. Freudenberger, and S. Doclo
- Multichannel Signal Processing for Spatially Distributed Microphones. Shaker, 2014
S. Stenzel
- Time-frequency dependent multichannel voice activity detection. In Proceedings of the 11th ITG Symposium on Speech Communication, pages 1–4, Sep 2014
S. Stenzel and J. Freudenberger
- A phase reference for a multichannel Wiener filter by a delay and sum beamformer. In Jahrestagung für Akustik (DAGA), Nürnberg, pages 208–212, Mar 2015
S. Grimm and J. Freudenberger
- Phase reference for the generalized multichannel Wiener filter. EURASIP Journal on Advances in Signal Processing, 2016, 2016:78
S. Grimm, J. Freudenberger, T. C. Lawin-Ore, and S. Doclo
(See online at https://doi.org/10.1186/s13634-016-0375-6)