Abstract: Lemmatisation is the process of finding the normalised forms of words appearing in text. It is a useful preprocessing step for a number of language engineering and text mining tasks, and especially important for languages with rich inflectional morphology. This paper presents a new lemmatisation system, LemmaGen, which was trained to generate accurate and efficient lemmatisers for twelve different languages. Its evaluation on the corresponding lexicons shows that LemmaGen outperforms the lemmatisers generated by two alternative approaches, RDR and CST, both in terms of accuracy and efficiency. To our knowledge, LemmaGen is the most efficient publicly available lemmatiser trained on large lexicons of multiple languages, whose learn...
We present an approach to lemmatization based on exhaustive morphological analysis and use of extern...
The task of corpus-dictionary linkage (CDL) is to annotate each word in a corpus with a link to an a...
Although lemmatization is a very common subtask in many natural language processing tasks, there is ...
Lemmatisation is the process of finding the normalised forms of words appearing in text. It is a use...
Lemmatization is the process of finding the normalized form of a word. It is the same as looking for...
Lemmatization is the process of finding the normalized form of words from surface word-forms as they...
In this paper we focus our attention on the comparison of various lemmatization and stemming algorit...
We present LEMMING, a modular log-linear model that jointly models lemmatization and tagging and su...
Lemmatization for languages with rich inflectional morphology is one of the basic, indispensable ste...
We present GATE DictLemmatizer, a multilingual open source lemmatizer for the GATE NLP framework tha...
1) Fully automatic rule based lemmatization of inflected languages 2) Fully automatic training of le...
We present lemmatization experiments on the unstandardized low-resourced languages Low Saxon and Occ...
Lemmatisation is a crucial part of the compilation of a computational lexicon; it is the process whi...
Sense-marked corpora are essential for supervised word sense disambiguation (WSD). The marked sense ...
Lemmatization is a central task in many NLP applications. Despite this importance, the number of (fr...