The CGN corpus (Corpus Gesproken Nederlands/Spoken Dutch Corpus; Oostdijk, 2000) is a large speech corpus of contemporary Dutch as spoken in Belgium (3.3 million words) and in the Netherlands (5.6 million words). Because of its size, manual phonemic annotation was limited to 10% of the data, and automatic systems were used to annotate the remainder. This paper describes the automatic generation of the phonemic annotations and the corresponding segmentations. First, we detail the processes used to generate possible pronunciations for each sentence and to select the most likely one. Next, we identify the remaining difficulties in handling the CGN data and explain how we solved them. We conclude with an evaluation of the quality of the resulting tr...
Each time a word is uttered, even by one and the same speaker, its pronunciation can diff...
In this paper we present 3 applications in the domain of Automatic Speech Recognition for Dutch, all...
This paper presents the steps needed to make a corpus of Dutch spontaneous dialogues accessible for ...
In the development of annotations for a spoken database, an important issue is whether the annotatio...
In spontaneous, conversational speech, words are often reduced compared to their citation forms, suc...
The paper discusses the syntactic annotation for the Spoken Dutch Corpus, a Dutch/Flemish cooperatio...
This paper reports on an experiment aimed at establishing how phonetic transcriptions for the large ...
This paper focuses on the specification of the orthographic transcription task in the Spoken Dutch C...
We describe a method for the automatic production of phonetic transcriptions in large speech corpora...