We describe the compilation of a large corpus of French-Dutch sentence pairs from official Belgian documents published between 1997 and 2006 and available in the online version of the Belgisch Staatsblad/Moniteur belge. After downloading the files in batch, we filtered out documents that have no translation in the other language, documents that contain several languages (detected by checking for discriminating words), and document pairs with a substantial difference in length. We then segmented the documents into sentences and aligned the sentences, which resulted in 5 million sentence pairs, of which 2.4 million are unique; only one-to-one links were included in the parallel corpus. Sample-based evaluation of the...
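The document-level filters and the reduction to unique pairs described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the discriminating word lists, the thresholds `max_other_share` and `max_ratio`, and all function names are hypothetical choices made for illustration, since the abstract does not specify them.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical discriminating words; the abstract does not list the ones actually used.
FR_WORDS: Set[str] = {"les", "une", "est", "pour", "dans", "avec"}
NL_WORDS: Set[str] = {"het", "een", "van", "voor", "niet", "wordt"}


def language_share(text: str, words: Set[str]) -> float:
    """Fraction of whitespace-separated tokens that belong to the given word list."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(token in words for token in tokens) / len(tokens)


def is_monolingual(text: str, own: Set[str], other: Set[str],
                   max_other_share: float = 0.02) -> bool:
    """True if the text shows words of its own language and almost none of the other.

    The 2% threshold is an assumed value, not one taken from the paper.
    """
    return (language_share(text, own) > 0
            and language_share(text, other) <= max_other_share)


def similar_length(fr: str, nl: str, max_ratio: float = 1.5) -> bool:
    """True if the character lengths differ by at most max_ratio (assumed value)."""
    shorter, longer = sorted((len(fr), len(nl)))
    return shorter > 0 and longer / shorter <= max_ratio


def keep_document_pair(fr: str, nl: str) -> bool:
    """Apply the mixed-language filter and the length filter to one document pair."""
    return (is_monolingual(fr, FR_WORDS, NL_WORDS)
            and is_monolingual(nl, NL_WORDS, FR_WORDS)
            and similar_length(fr, nl))


def unique_pairs(pairs: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Drop repeated sentence pairs while preserving first-seen order."""
    seen: Set[Tuple[str, str]] = set()
    result: List[Tuple[str, str]] = []
    for pair in pairs:
        if pair not in seen:
            seen.add(pair)
            result.append(pair)
    return result
```

In this sketch, a document pair is kept only if both sides look monolingual and their lengths are comparable; `unique_pairs` mirrors the step from all aligned sentence pairs to the set of unique pairs. Sentence segmentation and alignment themselves are outside the scope of the sketch.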
In this paper the ANNO Project ("Een Geannoteerde Publieke Gegevensbank voor het Geschreven Ned...
The construction of a large and richly annotated corpus of written Dutch was identified as one of th...
We present LeConTra, a learner corpus consisting of English-to-Dutch news translations enriched with...
The Dutch Parallel Corpus (DPC) is a translation corpus containing Dutch, English and French text sa...
This chapter introduces a new, updated version of the Dutch Parallel Corpus, a bidirectional paralle...
After three years of work the Dutch Parallel Corpus (DPC) project has reached an end. The finalized ...
This paper presents the Dutch Parallel Corpus, a high-quality parallel corpus for Dutch, French and ...
We investigate the extent to which the detection of phraseological (in)consistency in the translatio...
A corpus called DutchParl is created which aims to contain all digitally available parliamentary doc...
In this article, we present a new corpus spanning 163 years of written Dutch. This Dutch Corpus of C...
The importance of sentence-aligned parallel corpora has been widely acknowledged. Reference corpora ...
Proceedings of the Workshop on Annotation and Exploitation of Parallel Corpora AEPC 2010. Editors:...
Trilingual parallel corpus of EUR-Lex Document Extracts that include terms with colour names (black,...
This paper describes an ongoing effort to build a large-scale monolingual treebank of parallel/ com...