Wikipedia Corpus is a bilingual (Spanish-English), single-label corpus composed of 3,019 documents about general topics written in English and 832 documents written in Spanish, classified under three semantically distant categories: Culture and the arts, Geography and places, and Mathematics and logic.
This archive contains a collection of language corpora. These are text files that contain samples of...
Wikipedia is a valuable resource whose usage goes beyond the encyclopedia itself. In this paper the ...
Spanish text corpus extracted from Wikipedia, using the platform described in Cadavid Rengifo, Hécto...
Wikipedia Human Medicine Corpus is a bilingual (Spanish-English), single-label corpus composed of 2,143...
This text corpus is composed of texts of English Wikipedia extracted from the Wikipedia dump of 26th...
This text corpus is composed of texts of English Wikipedia extracted from the Wikipedia dump of 2...
Wikipedia has long presented itself as “the biggest multilingual free-content encyclopedia on the In...
Wikipedia, the popular online encyclopedia, has in just six years grown from an adjunct to the now-d...
Wikipedia has long presented itself as “the biggest multilingual free-content encyclopedia on the In...
Producing large language corpora is not only highly work-intensive, but also increasingly a process ...
Producing large language corpora is not only highly work-intensive, but also increasingly a process ...
Producing large language corpora is not only highly work-intensive, but also increasingly a process ...
This is the stand-off GrAF version of Spanish portions of the Wikipedia (based on a 2006 dump). This...
Trilingual corpus (Catalan, Spanish, English) that contains large portions of the Wikipedia (based o...
This archive contains a collection of language corpora. These are text files that contain samples of...