Analyzing the impact of speaker localization errors on speech separation for automatic speech recognition

Sivasankaran, Sunit
Vincent, Emmanuel
Fohr, Dominique

Publication date

November 2019

Publisher

HAL CCSD

Abstract

Submitted to ICASSP 2020. We investigate the effect of speaker localization on the performance of speech recognition systems in a multispeaker, multichannel environment. Given the speaker location information, speech separation is performed in three stages. In the first stage, a simple delay-and-sum (DS) beamformer is used to enhance the signal impinging from the speaker location, which is then used to estimate a time-frequency mask corresponding to the localized speaker using a neural network. This mask is used to compute the second-order statistics and to derive an adaptive beamformer in the third stage. We generated a multichannel, multispeaker, reverberated, noisy dataset inspired by the well-studied WSJ0-2mix and study the performance ...
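The first stage described in the abstract, delay-and-sum beamforming, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `delay_and_sum`, the frequency-domain alignment, and the assumption that per-microphone arrival delays have already been derived from the estimated speaker location are all choices made here for illustration. A localization error would show up as incorrect values in `delays`, so the channels would no longer add coherently.

```python
import numpy as np


def delay_and_sum(signals, delays, fs):
    """Hypothetical delay-and-sum beamformer (illustrative sketch only).

    signals: array of shape (n_mics, n_samples), one row per microphone.
    delays:  array of shape (n_mics,), arrival delay of the target at each
             microphone in seconds, assumed to come from the speaker location.
    fs:      sampling rate in Hz.
    """
    signals = np.asarray(signals, dtype=float)
    delays = np.asarray(delays, dtype=float)
    n_samples = signals.shape[1]

    # Work in the frequency domain so that fractional delays are possible.
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    spectra = np.fft.rfft(signals, axis=1)

    # Advance each channel by its arrival delay so the target is time-aligned
    # across microphones (the shift is circular because of the FFT).
    phase = np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
    aligned = spectra * phase

    # Average the aligned channels: the target adds coherently, while
    # signals from other directions generally do not.
    return np.fft.irfft(aligned.mean(axis=0), n=n_samples)
```

In the pipeline the abstract outlines, the output of such a beamformer would be the input to the mask-estimation network; the mask estimation and the adaptive beamformer of the third stage are not sketched here.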
