Several recent initiatives have created locally accessible, large-scale corpora from the contents of the Internet. In this paper we present a survey of several such corpora created for different languages. We compare their distinctive features and the amount of additional annotation provided by their developers.
In this paper we present a survey on natural language corpora, with particular focus on corpora of l...
As corpus building is an activity that takes time and costs money, readers may wish to use ready-ma...
The use of text corpora has increased considerably in the past few years, not only in the field of l...
This paper surveys the current state of word-net sense annotated corpora. We look at corpora in any...
We investigate the potential of using the web as a huge corpus for language studies. We test the hyp...
This article introduces ukWaC, deWaC and itWaC, three very large corpora of English, German, and Ita...
Paper presented at: EACL '06: Eleventh Conference of the European Chapter of the Association f...
Large textual resources are the basis for a variety of applications in the field of corpus linguisti...
We have built a corpus containing texts in 106 languages from texts available on the Internet and on...
The paper compares systematically the utility of specially-made text corpora and the textual resourc...
In this paper, I present the COW14 tool chain, which comprises a web corpus creation tool called tex...
In this paper we discuss the five requirements for building large publicly available corpora which g...
Over the last decade, methods of web corpus construction and the evaluation of web corpora have been...
The Web contains vast amounts of linguistic data. One key issue for linguists and language technolog...
Adopting the perspective of translators and translation teachers and learners, the paper expl...