This work presents Stagger, a new open-source part-of-speech tagger for Swedish based on the Averaged Perceptron. By using the SALDO morphological lexicon and semi-supervised learning in the form of Collobert and Weston embeddings, it reaches an accuracy of 96.4 % on the standard Stockholm-Umeå Corpus dataset, making it the best single part-of-speech tagging system reported for Swedish. Accuracy increases to 96.6 % on the latest version of the corpus, where the annotation has been revised to increase consistency. Stagger is also evaluated on a new corpus of Swedish blog posts to investigate its out-of-domain performance.
State-of-the-art statistical part-of-speech taggers mainly use information on tag bi- or trigrams, d...
Proceedings of the 16th Nordic Conference of Computational Linguistics NODALIDA-2007. Editors: Jo...
There is an increasing interest in the NLP community in developing tools for annotating historical d...
The field of Part of Speech (POS) tagging has made slow but steady progress during the last decade, ...
HunPoS, a freely available open source part-of-speech tagger, a reimplementation of one of the best ...
In this paper a data-driven method for Part-of-Speech tagging not using any n-grams of tags is prese...
In this paper, we experiment with using Stagger, an open-source implementation of an Averaged Percep...
This thesis describes the work of providing separate morphological processing and part-of-speech tag...
Despite many years of research on Swedish language technology, there is still no well-documented sta...
This paper reports the ongoing work of producing a state of the art part of speech tagger for unedit...
The Talko corpus of Swedish spoken in Finland is a new research tool consisting of audio files li...
We present SWEGRAM, a web-based tool for the automatic linguistic annotation and quantitative analys...
This paper reports on two experiments with a probabilistic part-of-speech tagger, trained on a tagge...