In this work, we present our progress in multi-source far-field automatic speech-to-text transcription of lecture speech. In particular, we show how the best of several far-field channels can be selected using a signal-to-noise ratio criterion, and how the signals from multiple channels can be combined either at the waveform level, using blind channel combination, or at the hypothesis level, using confusion network techniques, to improve the accuracy of a far-field lecture transcription system. Using the techniques described here, we ran a series of experiments on the test set used by the US National Institute of Standards and Technology for the RT-05S evaluation. For the multiple distant microphones (MDM) task of RT-05S, our system achie...
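The channel selection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the signal-to-noise ratio is estimated, so the estimator below is an assumption: it treats the lowest-energy frames of a channel as noise and the highest-energy frames as speech. The function names (`frame_energies`, `estimate_snr_db`, `select_best_channel`) and the parameters `frame_len` and `fraction` are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence


def frame_energies(samples: Sequence[float], frame_len: int = 400) -> List[float]:
    """Mean energy of each non-overlapping frame (a trailing partial frame is dropped)."""
    n_frames = len(samples) // frame_len
    return [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def estimate_snr_db(samples: Sequence[float], frame_len: int = 400,
                    fraction: float = 0.1) -> float:
    """Rough SNR estimate in dB (assumed estimator, not taken from the paper).

    The quietest `fraction` of frames stands in for the noise floor and the
    loudest `fraction` of frames stands in for speech.
    """
    energies = sorted(frame_energies(samples, frame_len))
    if not energies:
        raise ValueError("signal is shorter than one frame")
    k = max(1, int(len(energies) * fraction))
    eps = 1e-12  # avoids division by zero on digitally silent frames
    noise = sum(energies[:k]) / k + eps
    speech = sum(energies[-k:]) / k + eps
    return 10.0 * math.log10(speech / noise)


def select_best_channel(channels: Sequence[Sequence[float]],
                        frame_len: int = 400) -> int:
    """Return the index of the channel with the highest estimated SNR."""
    snrs = [estimate_snr_db(ch, frame_len) for ch in channels]
    return max(range(len(snrs)), key=snrs.__getitem__)
```

Calling `select_best_channel(channels)` on a list of equally long sample sequences, one per distant microphone, returns the index of the channel that would be passed on to the recognizer. A real system would more likely use a voice activity detector or a more robust noise estimate than this percentile heuristic.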
Far-field microphone speech signals cause high error rates for automatic speech recognition systems,...
In this paper we propose a technique for combining hypotheses generated in a multi-microphone set...
This paper describes a new corpus of multi-channel audio data designed to study and develop distant...
Interest within the automatic speech recognition (ASR) research community has recently focused on th...
When speech is captured with a distant microphone, it includes distortions caused by noise, reverber...
In this paper, we describe our efforts to develop acoustic models and decoding setups suitable for a...
In a multi-microphone distant speech recognition task, the redundancy of information that results fr...
Automatic transcription of lectures is becoming an important task. Possible applications can be foun...
This paper presents an investigation of far-field speech recognition using beamforming and channel ...
The automatic transcription of talks, lectures, and presentations is becoming increasingly important...
Shifting from a single to a multi-microphone setting, distant speech recognition can benefit fr...
A multi-microphone hypothesis combination approach, suitable for the distant-talking scenario, is pr...
Distant-speech recognition represents a technology of fundamental importance for future development ...
Automatic speech recognition in a room with distant microphones is strongly affected by noise and re...
Human-machine interaction in meetings requires the localization and identification of the...