Computer Science, Thesis. In this thesis, four attempts to improve the tagging accuracy for Icelandic text are presented. All of them were tested on IceTagger, a linguistic rule-based tagger with a tagging accuracy of 91.59%, and TnT, a data-driven tagger with a tagging accuracy of 90.45% for Icelandic. The first attempt was to reduce the number of tags in the Icelandic tagset. Several reductions were tested. The reduced tagset that gave the best result improved the tagging accuracy for IceTagger by 1.19% and for TnT by 1.45%. The second attempt was to use a larger dictionary, which improved tagging accuracy by 0.56% for IceTagger and 0.69% for TnT. The third attempt was to improve tagging accuracy by integrating a lemmatizer for Icelandic into IceTagger ...
This paper reports on a work in progress. Auður Þórunn Rögnvaldsdóttir, Eiríkur Rögnvaldsson, Kristí...
In this paper, we describe the development of a new tagged corpus of Icelandic, consisting of about ...
The Icelandic language is a morphologically complex language, for which a large tagset has been crea...
Proceedings of the 17th Nordic Conference of Computational Linguistics NODALIDA 2009. Editors: Kri...
Natural language processing (NLP) is a very young discipline in Iceland. Therefore, there is a lack of...
This thesis describes the development of an accurate part-of-speech tagger for Faroese. To achieve this, the Icelandic ...
Data-driven POS tagging has achieved good performance for English, but can still lag behind linguist...
In this paper, we experiment with using Stagger, an open-source implementation of an Averaged Percep...
We experiment with extending the dictionaries used by three open-source part-of-speech taggers, by ...
This paper explores the impact of inconsistencies stemming from human mistakes on the accuracy of pa...
The topic of this thesis is the post-correction of Icelandic OCR (optical character recognition) text...
There is an increasing interest in the NLP community in developing tools for annotating historical d...
Language is a fundamental part of human communication and as technology advances, people expect to b...