When processing Chinese documents and queries for information retrieval (IR), one has to identify the units to be used as index terms. Words and n-grams have been used as indexes in several previous studies, which showed that both kinds of indexes lead to comparable IR performance. In this study, we carried out further experiments to find a better way to index Chinese texts. First, we investigated the impact of word segmentation accuracy on IR performance. Second, we examined in detail fifteen different groups of indexing units, formed from the possible combinations of words and character n-grams. Experiments showed that better segmentation results in better IR performance, and a combination of words with uni-grams is the bett...
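As a rough illustration of what combining words with character uni-grams as indexing units could look like, the sketch below builds such a unit list for a short string. It is not the segmenter or indexing pipeline used in the study: the greedy forward maximum-matching segmenter, the toy lexicon, and the function names `char_ngrams`, `max_match`, and `index_units` are all assumptions made for this example.

```python
def char_ngrams(text, n):
    """Return the overlapping character n-grams of text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def max_match(text, lexicon, max_len=4):
    """Greedy forward maximum matching (an assumed, simplistic segmenter).

    At each position the longest lexicon entry is taken; a character that
    starts no lexicon entry is emitted as a single-character word.
    """
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


def index_units(text, lexicon):
    """Combine segmented words with character uni-grams.

    Duplicates are dropped while first-seen order is preserved.
    """
    units = max_match(text, lexicon) + char_ngrams(text, 1)
    return list(dict.fromkeys(units))


# Hypothetical toy lexicon, for illustration only.
lexicon = {"信息", "检索", "系统"}
units = index_units("信息检索系统", lexicon)
```

With a real lexicon and segmenter, the same `index_units` step would simply receive better word boundaries, which is the factor the abstract reports as affecting IR performance.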
This paper presents the results of experiments in which the authors tested different types of featur...
The majority of recent Cross-Language Information Retrieval (CLIR) research has focused on European ...
Term segmentation plays a vital role in building effective information retrieval systems. In particu...
Chinese word segmentation is a prerequisite process in Chinese information retrieval (IR) to divide ...
In this paper we present results of experiments with Chinese word segmentation and information retri...
A distinctive feature of Chinese text is that a Chinese document is a sequence of Chinese characters with no sp...
In this fast-growing information age, information retrieval (IR) systems and their related fields h...
With the spread of the Internet, great research interest is being shown in Chinese language i...
With the advent of the Internet and intranets, substantial interest is being shown in Asian language...
Due to the lack of explicit word boundaries in Chinese and Japanese, and to some extent in Korean, ...
Words and n-grams are commonly used units for representing Chinese text and have proved to be good featur...
We investigate the effects of lexicon size and stopwords on Chinese information retrieval using our ...
Words and n-grams are commonly used units for representing Chinese text and have proved to be good featur...
Query expansion has long been suggested as a technique for dealing with the word mismatch problem in inf...