Syntactically annotated corpora are highly important for enabling large-scale diachronic and diatopic language research. Such corpora have recently been developed for a variety of historical languages, or are still under development. One of those under development is the fully tagged and parsed Corpus of Historical Low German (CHLG), which is aimed at facilitating research into the highly under-researched diachronic syntax of Low German. The present paper reports on a crucial step in creating the corpus, viz. the creation of a part-of-speech tagger for Middle Low German (MLG). Having been transmitted in several non-standardised written varieties, MLG poses a challenge to standard POS taggers, which usually rely on normalised spelling. We ou...
Tagger accuracy deteriorates when applied to texts different from the training corpus, e.g. with res...
This paper presents work on manual and semi-automatic normalization of historical language data. We ...
In this paper we focus on automatic part-of-speech (POS) annotation, in the context of historical En...
We outline the issues and decisions involved in creating a Penn-style treebank of Middle Low German ...
This paper describes the application of a part-of-speech tagger to a particular configuration of his...
Our paper focuses on the one hand on the challenges posed by the structural variability, flexibility...
In this paper, we present experiments on POS tagging historical texts that contain spelling variatio...
We describe, evaluate, and improve the automatic annotation of diachronic corpora at the levels of w...
Spoken data from language-contact situations is extremely varied. This heterogeneity makes it diffic...
Corpora of Early Modern English have been collected and released for research for a number of years....