This article presents the status of the PEDANT project on parallel corpora at the Language Bank at Göteborg University. It describes the solutions for accessing the corpus data: access is provided over the internet, through standard applications and SGML-aware programming tools. The SGML format for encoding translation pairs is also outlined. The methods support everything from plain text to texts densely encoded with linguistic information.
Keywords: SGML, parallel corpora, morphosyntactic encoding, lemmatization, multiword units, compound words, internet access
Proceedings of the 16th Nordic Conference of Computational Linguistics NODALIDA-2007. Editors: Jo...
We discuss a previously proposed method for augmenting parallel corpora of limited size for the purp...
Parallel corpora are collections of source texts accompanied by one or more translations of that sou...
This paper investigates the role of parallel corpora as a linguistic resource. The appli...
There has recently been an increasing awareness of the importance of large collections of texts (cor...
We report on methods to create the largest publicly available parallel corpora by crawling the web, ...
Exchange between the translation studies and the computational linguistics communities has tradition...
This paper discusses the role played by parallel corpora in the design and implementation of fully a...
As empirical methods have come to the fore in multilingual language technology and translation studi...
This paper focuses on the description of the corpus «PEST-INTER» in five languages and the process o...
There are so many variables underlying translation that examining anything longer than a few paragra...
Abstract: 2006 saw the start of a project for compiling a multifunctional parallel corpus with Dutch...
This chapter gives an overview of parallel corpora, i.e. corpora containing source texts in a given ...