In this article we describe two strategies for the automatic tagging of a Spanish diachronic corpus, both involving the adaptation of existing NLP tools developed for modern Spanish. In the initial approach we follow a state-of-the-art strategy, which consists of standardizing the spelling and the lexicon. This approach boosts POS-tagging accuracy to 90%, a raw improvement of over 20% with respect to the results obtained without any pre-processing. To make this new resource usable by non-experts in NLP, the corpus has been integrated into IAC (Corpora Interface Access). We discuss the shortcomings of the initial approach and propose a new one, which does not consist in adapting the source texts to the tagger...
Phraseology studies have been enhanced by Corpus Linguistics, which has become an interdisciplinary ...
This paper explores the multi-layer annotation of a written domain-restricted English-Spanish compar...
Corpus annotation practices for most major languages have moved the focus, within the NLP research c...
The impact-es diachronic corpus of historical Spanish compiles over one hundred books —containing ap...
This repository is part of the Annotated Corpora of Historical Catalan (HisCat). It contains the fir...
This data collection contains diachronic Word Usage Graphs (WUGs) for Spanish. Find a description of...
One of the most important prior tasks for robust part-of-speech tagging is the correct to...
In this paper, I discuss the theoretical and practical issues raised in the development of Tibidabo,...
Judeo-Spanish differs from late 15th-century Spanish and modern Spanish in several respects, such as...
This data collection contains discovered diachronic Word Usage Graphs (WUGs) for Spanish. Find a des...
In this paper we present a whole Natural Language Processing (NLP) system for Spanish. The core of t...
This paper seeks to describe the creation of a Spanish lexicon with semantic annotation in order to ...
This paper presents an algorithm for identifying noun phrase antecedents of third person personal pr...