In this paper, we present experiments on POS tagging historical texts that contain spelling variation. The experiments are conducted in a low-resource scenario with a small amount of training data (here: 12,000 tokens). We investigate different ways of dealing with spelling variation in such a situation on different variants of historical German. Firstly, we add character n-grams as features to the tagger to enable it to learn spelling variation. Our tagging experiments show that this improves accuracy when there is enough variation in the data, but leads to a decrease in accuracy if the amount of variation is low. Secondly, we preprocess the data before training and applying the tagger, reducing spelling variation by normalization, rule-ba...
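The first strategy in the abstract above, adding character n-grams as tagger features, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `char_ngrams` and `token_features`, the boundary symbol `#`, the lowercasing step, and the n-gram range are all assumptions chosen for illustration.

```python
def char_ngrams(token, n_min=1, n_max=3, pad="#"):
    """Return character n-gram features for one token.

    Hypothetical helper: the padding symbol, lowercasing, and the
    n-gram range are illustrative assumptions, not taken from the paper.
    """
    # Pad the token so that prefixes and suffixes yield distinct n-grams.
    padded = f"{pad}{token.lower()}{pad}"
    feats = []
    for n in range(n_min, n_max + 1):
        for i in range(len(padded) - n + 1):
            feats.append(f"{n}gram={padded[i:i + n]}")
    return feats


def token_features(tokens, index):
    """Combine the word form with its character n-grams (hypothetical)."""
    token = tokens[index]
    feats = [f"word={token.lower()}"]
    feats.extend(char_ngrams(token, n_min=2, n_max=3))
    return feats


# Two spelling variants of the same word share many n-gram features
# even though their word-form features differ.
shared = set(char_ngrams("vnd", 2, 2)) & set(char_ngrams("und", 2, 2))
print(sorted(shared))
```

Under these assumptions, the variants "vnd" and "und" share the bigrams "nd" and "d#", which is the kind of overlap that could let a tagger generalize across spellings when training data is small.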
In this paper we focus on automatic part-of-speech (POS) annotation, in the context of historical En...
Tagger accuracy deteriorates when applied to texts different from the training corpus, e.g. with res...
To be able to use existing natural language processing tools for analysing historical text, an impor...
This paper describes the application of a part-of-speech tagger to a particular configuration of his...
In this article, we describe the respective approaches we have taken when addressing issues of spell...
We present a study of the adequacy of current methods that are used for POS-tagging historical Dutch...
Large quantities of spelling variation in corpora, such as that found in Early Modern English, can c...
This paper presents work on manual and semi-automatic normalization of historical language data. We ...
Natural language processing for historical text imposes a variety of challenges, such as to deal wit...
When applying corpus linguistic techniques to historical corpora, the corpus researcher should be ca...
Syntactically annotated corpora are highly important for enabling large-scale diachronic and diatopi...
Language technology tools can be very useful for making information concealed in historical documen...