We describe the compilation of a large corpus of French-Dutch sentence pairs from official Belgian documents published between 1997 and 2006 and available in the online version of the Belgisch Staatsblad/Moniteur belge. After downloading the files in batch, we filtered out documents that have no translation in the other language, documents that contain several languages (detected by checking for discriminating words), and document pairs with a substantial difference in length. We segmented the documents into sentences and aligned the sentences, which resulted in 5 million sentence pairs (only one-to-one links were included in the parallel corpus), of which 2.4 million are unique. Sample-based evaluation of the...
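The document-level filters and the deduplication step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the word lists, the thresholds (`max_foreign_ratio`, `max_ratio`) and all function names are assumptions chosen for illustration, since the abstract does not specify them.

```python
# Hypothetical discriminating words; the abstract does not list the ones actually used.
FRENCH_WORDS = {"les", "des", "est", "dans", "pour", "avec"}
DUTCH_WORDS = {"het", "een", "van", "niet", "voor", "wordt"}


def language_counts(text):
    """Count occurrences of French and Dutch discriminating words in a text."""
    tokens = text.lower().split()
    fr = sum(t in FRENCH_WORDS for t in tokens)
    nl = sum(t in DUTCH_WORDS for t in tokens)
    return fr, nl


def is_monolingual(text, expected, max_foreign_ratio=0.1):
    """Return True if the text looks like it is written only in `expected` ('fr' or 'nl').

    The 0.1 threshold is an assumed value, not one taken from the paper.
    """
    fr, nl = language_counts(text)
    own, foreign = (fr, nl) if expected == "fr" else (nl, fr)
    if own == 0:
        return False
    return foreign / (own + foreign) <= max_foreign_ratio


def lengths_comparable(fr_text, nl_text, max_ratio=1.5):
    """Return True if the two documents do not differ substantially in character length.

    The 1.5 ratio is an assumed value, not one taken from the paper.
    """
    shorter, longer = sorted((len(fr_text), len(nl_text)))
    if shorter == 0:
        return False
    return longer / shorter <= max_ratio


def keep_pair(fr_text, nl_text):
    """Apply the language filter and the length filter to one document pair."""
    return (
        is_monolingual(fr_text, "fr")
        and is_monolingual(nl_text, "nl")
        and lengths_comparable(fr_text, nl_text)
    )


def unique_pairs(pairs):
    """Drop repeated sentence pairs while keeping the first occurrence of each."""
    seen = set()
    result = []
    for pair in pairs:
        if pair not in seen:
            seen.add(pair)
            result.append(pair)
    return result
```

In this sketch, `keep_pair` would be applied to each downloaded document pair before sentence segmentation and alignment, and `unique_pairs` to the aligned one-to-one sentence pairs afterwards. Sentence alignment itself is not shown.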
This paper describes an ongoing effort to build a large-scale monolingual treebank of parallel/ com...
This paper presents the Wablieft corpus, a two million words corpus of a Belgian easy-to-read newspa...
In the Netherlands and in Flanders a 10-million-words corpus of contemporary standard Dutch is now b...
This chapter introduces a new, updated version of the Dutch Parallel Corpus, a bidirectional paralle...
The Dutch Parallel Corpus (DPC) is a translation corpus containing Dutch, English and French text sa...
This paper presents the Dutch Parallel Corpus, a high-quality parallel corpus for Dutch, French and ...
After three years of work the Dutch Parallel Corpus (DPC) project has reached an end. The finalized ...
We investigate the extent to which the detection of phraseological (in)consistency in the translatio...
A corpus called DutchParl is created which aims to contain all digitally available parliamentary doc...
In this article, we present a new corpus spanning 163 years of written Dutch. This Dutch Corpus of C...
The importance of sentence-aligned parallel corpora has been widely acknowledged. Reference corpora ...
Trilingual parallel corpus of EUR-Lex Document Extracts that include terms with colour names (black,...
Proceedings of the Workshop on Annotation and Exploitation of Parallel Corpora AEPC 2010. Editors:...
We present LeConTra, a learner corpus consisting of English-to-Dutch news translations enriched with...