Abstract: Everyone working on general language would like their corpus to be bigger, wider-coverage, cleaner, duplicate-free, and with richer metadata. As a response to that wish, Lexical Computing Ltd. has a programme to develop very large ‘TenTen’ web corpora. In this paper we introduce the Spanish corpus, esTenTen, of 8 billion words and 19 different national varieties of Spanish. We investigate the distance between the national varieties as represented in the corpus, and examine in detail the keywords of Peninsular Spanish vs. American Spanish, finding a wide range of linguistic, cultural and political contrasts.
Abstract: This paper considers various morphological processes involved in the creation of pleiosyllab...
This article reviews current approaches to the use of the web as a linguistic corpus and...
Historical corpora offer many potentialities for linguistic research. Thus, the present article prov...
This proposal requests Level 1 funding to develop a novel Spanish-language corpus, ACTIV-ES. This el...
Abstract: This paper outlines current work on the construction of a high-quality, richly-annotated and...
In recent years, transformer-based models have led to significant advances in language modellin...
Paper presented at: EACL '06: Eleventh Conference of the European Chapter of the Association f...
This paper seeks to describe the creation of a Spanish lexicon with semantic annotation in order to ...
The first annotated corpus of historical and modern Spanish -the 100,000,000 word Corpus del Español...
Iberia is a synchronic corpus of scientific Spanish designed mainly for terminological studies. In t...
Language varieties should be taken into account in order to enhance fluency and naturalness of trans...
ACTRES Project: English-Spanish Corpus-based CA Applications in Translation Step 1: Using semant...
We have built a corpus containing texts in 106 languages from texts available on the Internet and on...
The Web contains vast amounts of linguistic data. One key issue for linguists and language technolog...