In this paper, we compare the performance of different lemmatization approaches for information retrieval over a Turkish text collection. A lemma is simply the "dictionary form" of a word, and lemmatization is the process of determining the lemma for a given word, so that different inflected forms of a word can be analyzed as a single item. We compared three lemmatizers and one fixed-length truncation approach over a Turkish text collection. The first is based on a morphological analyzer for Turkish that uses finite-state language processing technology; the second is the Dictionary-based Turkish Lemmatizer (DTL), which uses a radix-trie data structure; the third is a simple dictionary-based top-down parser; and the last is truncatio...
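To make two of the approaches named above more concrete, the following is a minimal sketch, not the implementation evaluated in the paper. It contrasts fixed-length truncation with a dictionary lookup that returns the longest dictionary entry that is a prefix of the word. Several things here are assumptions made for illustration: the truncation length of 5, the three-word toy dictionary, the names `truncate`, `PrefixTrie`, and `lemmatize`, and the use of a plain character trie in place of a compressed radix trie.

```python
from typing import Dict, Iterable, Optional


def truncate(word: str, length: int = 5) -> str:
    """Fixed-length truncation: keep at most the first `length` characters.

    The default of 5 is an illustrative assumption, not a value from the paper.
    """
    return word[:length]


class PrefixTrie:
    """Plain character trie over a lemma dictionary (hypothetical helper).

    A radix trie would merge single-child chains into one edge; this
    uncompressed version answers the same longest-prefix query.
    """

    def __init__(self, lemmas: Iterable[str]) -> None:
        self.root: Dict = {}
        for lemma in lemmas:
            node = self.root
            for ch in lemma:
                node = node.setdefault(ch, {})
            node[None] = True  # the None key marks the end of a lemma

    def longest_prefix(self, word: str) -> Optional[str]:
        """Return the longest dictionary lemma that is a prefix of `word`."""
        node = self.root
        best = None
        for i, ch in enumerate(word):
            if ch not in node:
                break
            node = node[ch]
            if None in node:
                best = word[: i + 1]
        return best


def lemmatize(word: str, trie: PrefixTrie) -> str:
    """Fall back to the surface form when no dictionary entry matches."""
    return trie.longest_prefix(word) or word


# Toy dictionary, assumed for illustration only.
trie = PrefixTrie(["kitap", "ev", "göz"])
print(truncate("kitaplar"))            # first five characters
print(lemmatize("evlerimizde", trie))  # longest matching dictionary entry
```

A pure prefix match like this one cannot handle stem changes such as consonant mutation (for example "kitabı" from "kitap"), which is one reason a morphological analyzer or a richer dictionary lemmatizer may behave differently from it.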
One of the main problems involved in the use of free text for indexing and retrieval is the variatio...
This research is aimed at identifying the parts of speech for the Kazakh and Turkish languages in an...
We present the results of the first large-scale Turkish information retrieval experiments performed ...
In this study, we investigate information retrieval (IR) on Turkish texts using a large-scale test c...
Bitdefender;Department of Computers and Information Technology of the Faculty of Automation, Compute...
Stemming, truncating, suffix stripping and decompounding algorithms used in information retrieval (I...
We used Lemur Toolkit, an open source toolkit designed for Information Retrieval research, for our a...
The task of corpus-dictionary linkage (CDL) is to annotate each word in a corpus with a link to an a...
Abstract: Lemmatisation is the process of finding the normalised forms of words appearing in text. I...
Purpose: The article discusses, on a general methodological level, different methods that have been us...
This paper introduces a set of freely available, open-source tools for Turkish that are built around...
This paper primarily discusses how to model Turkish morphotactics using flag diacritics. We present ...
Title: Processing of Turkic Languages Author: Sibel Ciddi Department: Institute of Formal and Applie...
The current study proposes to compare document retrieval precision performances based on language mo...