In voice-enabled domestic or meeting environments, distributed microphone arrays aim to process distant-speech interaction into text with high accuracy. However, under dynamically varying noise and reverberation, or when speakers move, there is no guarantee that any single microphone array (stream) remains informative at all times. In these cases, an appropriate strategy for dynamically fusing streams is necessary. The multi-stream paradigm in Automatic Speech Recognition (ASR) considers scenarios where parallel streams carry diverse or complementary task-related knowledge. Such streams could be defined as microphone arrays, frequency bands, various modalities, and so on. Hence, robust stream fusion is crucial to emphasize the more informative streams...
This paper presents a novel streaming automatic speech recognition (ASR) framework for multi-talker ...
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Comp...
In this thesis, a joint optimal method for clean speech estimation and ASR in a mismatched condition...
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer...
Multi-microphone signal processing techniques have the potential to greatly im...
A multi-stream framework with deep neural network (DNN) classifiers is applied to improve automatic ...
Analysis of data on human auditory processing suggests a machine recognition paradigm, in which parall...
Automatic Speech Recognition (ASR) functionality, the automatic translation of speech into text, is ...
When speech is captured with a distant microphone, it includes distortions caused by noise, reverber...
Despite sophisticated present-day automatic speech recognition (ASR) techniques, a single recognizer...
Multi-stream and multi-band methods can improve the accuracy of speech recognition systems without o...
This paper describes noisy speech recognition for an augmented reality headset that helps verbal com...
In this thesis, the framework of multi-stream combination has been explored to improve the noise rob...
Many speech technologies, such as automatic speech recognition and speaker identification, are conve...
This thesis takes the classical signal processing problem of separating the speech of a target speak...