We present an overview of an ongoing project that aims to develop methods for building a treebank of Icelandic. The treebank will contain both written and spoken language and will also have a diachronic dimension. Because Icelandic is a less-resourced language in computational linguistics and language technology, it is essential to use the limited resources available as economically and efficiently as possible. We emphasize the importance of open source software and the interplay between linguistic knowledge and technological skills. We describe the workflow in the construction of the treebank and show how the different software tools work together towards the final representa...
We introduce an Icelandic corpus of more than 250 million running words and describe the methodolog...
Natural language processing (NLP) is a very young discipline in Iceland. Therefore, there is a lack of...
Data-driven parsing techniques have a number of advantages over rule-based parsing techniques, such ...
We describe the current status of Icelandic language technology with respect to available language r...
In this paper we outline the Icelandic research plans in the Scandinavian Dialect Syntax project and...
We present IceMorph, a semi-supervised morphosyntactic analyzer of Old Icelandic. In addition to mac...
We describe the Corpus of Spoken Icelandic (ÍS-TAL) which is made up of 15 hours of spontaneous natu...
We give an overview of Icelandic language technology since its inception ten years ago and describe ...
Proceedings of the 16th Nordic Conference of Computational Linguistics NODALIDA-2007. Editors: Jo...
We present the case for an extensive scientific effort to build up large treebanks for the Nordic an...
We describe the background for and building of IcePaHC, a one million word parsed historical corpus ...
The Icelandic language is a morphologically complex language, for which a large tagset has been crea...
The new POS-tagged Icelandic corpus of the Leipzig Corpora Collection is an extensive resource for t...
dependency treebank on top of the morphologically tagged Danish PAROLE corpus (291,000 words). This ...