With the advent of the Internet and intranets, substantial interest is being shown in Asian language information retrieval; especially in Chinese, which is a good example of an Asian ideographic language (other examples include Japanese and Korean). Since, in this type of language, spaces do not delimit words, an important issue is which index terms should be extracted from documents. This issue also has wider implications for indexing other languages such as agglutinating languages (e.g., Finnish and Turkish), archaic ideographic languages like Egyptian hieroglyphs, and other types of information such as data stored in genomic databases. Although comparisons of indexing strategies for Chinese documents have been made, almost all of them ar...