We present the NVIDIA NeMo team's multi-channel speech recognition system for the 7th CHiME Challenge Distant Automatic Speech Recognition (DASR) Task, focusing on the development of a multi-channel, multi-speaker speech recognition system tailored to transcribe speech from distributed microphones and microphone arrays. The system consists mainly of the following modules: the Speaker Diarization Module, the Multi-channel Audio Front-End Processing Module, and the ASR Module. Together, these components form a cascaded system that processes multi-channel, multi-speaker audio input. This paper also highlights the optimization process that significantly improved our system's performance. Our...
This paper introduces the MERL/SRI system designed for the 3rd CHiME speech separation and recogniti...
This paper proposes a neural network based system for multi-channel speech enhancement and dereverbe...
Accurate recognition of cocktail party speech containing overlapping speakers, noise and reverberati...
The CHiME challenge series aims to advance robust automatic speech recognition...
The CHiME challenge series aims to advance far field speech recognition techno...
This paper details our speaker diarization system designed for multi-domain, multi-microphone casual...
This paper presents the design and outcomes of the CHiME-3 challenge, the firs...
This paper describes the joint effort of Brno University of Technology (BUT), AGH University of Krak...
This paper presents the design and outcomes of the CHiME-3 challenge, the first open speech recognit...
Recognizing speech under noisy conditions is an ill-posed problem. The CHiME 3 challenge targets robu...
The paper describes a system for automatic speech recognition (ASR) that is benchmarked with data of...
Distant-microphone automatic speech recognition (ASR) remains a challenging go...
Multi-microphone signal processing techniques have the potential to greatly im...
Distant-microphone automatic speech recognition (ASR) remains a challenging go...
Submitted to ICASSP 2020. We consider the problem of robust automatic speech rec...