This contribution presents the advances of the Fraunhofer IAIS AudioMining system for vocabulary independent spoken term detection since the last SIGIR workshop on searching spontaneous conversational speech in 2007. Based on feedback from archivists involved in the development of the prototype, a set of requirements for spoken term detection systems was established to guide the development of the overall system. After the automatic speech recognition (ASR) baseline was improved with data from the broadcast domain, the syllable error rate on a set of broadcast news and broadcast conversation shows was reduced by 45.6% relative, and the time required for analyzing the data was reduced by 90%. Based on the new ASR results, the F1 va...
The second workshop on Searching Spontaneous Conversational Speech (SSCS 2008) was held in Singapore...
As data storage capacities grow to nearly unlimited sizes thanks to ever ongoing hardware and softwa...
The components of the Frisian data collection are speech and language ...
Archive departments of large radio broadcasters stand to benefit greatly from speech recognition tec...
Searching for keywords in a collection of spoken documents is a challenging task. The use of Automat...
The spoken content of audio/visual collections such as TV or radio archives is an informat...
This paper investigates the detection of English spoken terms in a conversational multi-language sce...
This research has made contributions to the area of spoken term detection (STD), defined as the proc...
The Fraunhofer IAIS AudioMining system for vocabulary independent spoken term detection is able to p...
The application of automatic speech recognition in the broadcast news domain is well studied. Recogn...
A new type of spoken document retrieval (SDR) system is proposed that identifies target spoken docum...
Archivists, journalists and content hosters often face the problem of dealing with vast amounts of a...
In this paper we describe the large-scale German broadcast corpus (GER-TV1000h) containing more than...
Spoken term detection (STD) aims at retrieving data from a speech repository given a textual represe...