Text written in Chinese now represents a huge knowledge repository. The first step in managing and processing written Chinese text is segmentation. This thesis investigates three main issues in Chinese text segmentation: word frequency estimation, ambiguity resolution, and unknown word identification. The latter two are addressed in the same segmentation process. Defining what constitutes a Chinese word is very difficult, which in turn makes estimating correct word frequencies challenging. A main source of word frequencies is a constructed Chinese corpus. Many manually segmented Chinese corpora have been produced by different organisations and institutes. The word frequencies obtained from the differ...
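As an illustration of the word frequency estimation mentioned above, the following is a minimal sketch of how relative word frequencies might be read off a manually segmented corpus. It assumes, purely for illustration, that the corpus stores one sentence per line with words separated by whitespace; the function name `word_frequencies` and the toy sentences are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def word_frequencies(segmented_lines):
    """Return the relative frequency of each word in a segmented corpus.

    Assumption: each line is one sentence whose words are already
    separated by whitespace (the manual segmentation).
    """
    counts = Counter()
    for line in segmented_lines:
        # str.split() with no argument splits on any Unicode whitespace.
        counts.update(line.split())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts.items()}


# Toy corpus (hypothetical): two manually segmented sentences.
corpus = ["我 喜欢 中文", "我 学习 中文 分词"]
freqs = word_frequencies(corpus)
```

Because the counts depend entirely on where the annotators placed word boundaries, two corpora segmented under different guidelines can yield different frequencies for the same raw text.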
This paper proposes an approach to Chinese word segmentation based on statistics and rules. The appro...
The Chinese language, unlike English, is written without marked word boundaries, and Chinese word se...
This paper addresses two remaining challenges in Chinese word segmentation. The challenge in HLT is ...
Currently, most state-of-the-art methods for Chinese word segmentation (CWS) are based on supervis...
Chinese word segmentation is the first step for Chinese text processing. The accuracy of Chinese wor...
This paper presents a Chinese word segmentation system that uses improved source-channel models of C...
A Chinese sentence is typically written as a sequence of characters. However, a word, a logical sema...
The Chinese language is written without using spaces or other word delimiters. Although a text may b...
Chinese word segmentation is an essential step in the processing of Chinese natural langua...
This paper describes a hybrid Chinese word segmenter that is being developed as part of a larger Chi...
Chinese word segmentation in a Chinese sentence is an essential step in the processing o...
This document describes the segmentation guidelines for the Penn Chinese Treebank Project. The goal ...