Both human and automatic processing of speech require recognition of more than just words. In this paper we provide a brief overview of research on structural metadata extraction in the DARPA EARS rich transcription program. Tasks include detection of sentence boundaries, filler words, and disfluencies. Modeling approaches combine lexical, prosodic, and syntactic information, using various modeling techniques for knowledge source integration. The performance of these methods is evaluated by task, by data source (broadcast news versus spontaneous telephone conversations) and by whether transcriptions come from humans or from an (errorful) automatic speech recognizer. A representative sample of results shows that combining multiple knowledge ...
Some of the major research issues in the field of speech recognition revolve around methods of incor...
After several decades of effort, speech recognition technologies have made significant progress and ...
This thesis introduces a general method for using information at the utterance level and across utte...
This project investigated the interaction between parsing and the detection of structural metadata i...
Structural metadata extraction (MDE) research aims to develop techniques for automatic conversion of...
Although speech recognition technology has significantly improved during the past few decades, curre...
We report on the success of a two-pass approach to annotating metadata, speech effects and syntactic...
124 p. Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 2004. Prosody (the melody and rhyth...
The regular occurrence of disfluencies is a distinguishing characteristic of spontaneous speech. Detec...
This thesis studies Sentence Unit Detection (SUD) that uses lexical information for Automatic Speech...
In this paper we apply speech recognition for automatic transcript generation for spoken document r...