This paper provides a first sense-annotated, encyclopedia-based Finnish glossary for evaluating disambiguation tasks, together with a simple, transparent, application-ready named entity disambiguation (NED) solution for Finnish NLP pipelines. We aim to develop a Finnish word sense disambiguation (WSD) solution that runs on top of an existing NLP pipeline and improves text indexing results by disambiguating similarly named entities. To this end, we gather a data dump of Finnish Wikipedia and process it with TurkuNLP's public models to obtain lemmatization, segmentation, and POS and NER tags. We use Wikipedia's disambiguation pages to link ambiguous entities to their unambiguous counterparts and create a glossary with articles as the sense definitions. We use the glossary to generate disambiguation tasks and perform a ...
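The glossary construction described above can be pictured with a minimal sketch. This is not the paper's implementation: the function names `build_glossary` and `make_tasks`, the dictionary-based input format, and the toy entries are all assumptions made for illustration, and the real pipeline works on a full Wikipedia dump processed with TurkuNLP's models.

```python
from collections import defaultdict


def build_glossary(disambiguation_pages, articles):
    """Map each ambiguous term to its candidate senses.

    disambiguation_pages: {ambiguous term: [linked article titles]}
    articles: {article title: article text used as the sense definition}
    Both input shapes are hypothetical, chosen only for this sketch.
    """
    glossary = defaultdict(dict)
    for term, titles in disambiguation_pages.items():
        for title in titles:
            text = articles.get(title)
            if text is not None:  # skip links to articles that do not exist
                glossary[term][title] = text
    return dict(glossary)


def make_tasks(glossary):
    """Create one task per sense: (term, correct title, all candidate titles)."""
    tasks = []
    for term, senses in glossary.items():
        if len(senses) < 2:  # a single sense needs no disambiguation
            continue
        candidates = sorted(senses)
        for title in candidates:
            tasks.append((term, title, candidates))
    return tasks


# Invented toy data, not taken from the paper or from a real dump.
pages = {"Kuusi": ["Kuusi (puu)", "Kuusi (luku)"]}
texts = {
    "Kuusi (puu)": "Kuusi on havupuu.",
    "Kuusi (luku)": "Kuusi on luonnollinen luku.",
}
tasks = make_tasks(build_glossary(pages, texts))
```

Each task pairs an ambiguous term with one correct article and the full candidate list, so a disambiguation method can be scored on how often it picks the correct candidate.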
The language model is one of the key components of a large vocabulary continuous speech recognition ...
The importance of vocabulary knowledge in learning and mastering a foreign language is nowadays self-evident....
The study deals with semantic theory, in particular polysemy, and with the field of psycholinguistics...
The field of natural language processing (NLP) has developed enormously during the last decades. The...
The task of automatically determining the correct sense of a word in a natural-language expression...
The retrieval of documents from a dataset based on a search query is an important feature of user-fa...
In this doctoral dissertation we propose new methods and frameworks for computer-assisted learning b...
Building WordNets from comparable corpora is a task that is explored, but especially using Wikipedi...
Keywords are used in many document databases to improve search. The process of assigning keywords fr...
In this paper we present an automatic multilingual annotation of the Wikipedia dumps in two language...
There are comprehensive requirements in Finland for procurement by any government organization to go...
This paper presents a simple method for finding new synonym candidates to a bilingual wordnet by usi...
This paper introduces a work in progress for implementing a free full text semantic tagger for Finni...
Various language technology applications have been in everyday use for decades. For example...
This paper describes a new Word Sense Disambiguation (WSD) algorithm which extends two well-known va...