In this article, we classify Chinese n-gram sequences into types according to statistical properties such as frequency, mutual information, and left/right border entropy. We call these sequence types "Radixes" and define combination rules between them. Based on the classified radixes and their combination rules, we design a new dictionary-free Chinese segmentation algorithm based on dynamic programming, and we study the automatic extraction of Chinese words consisting of 2 to 4 characters. The approach achieves good performance in some respects.
http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000270587500121&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomer...
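The three statistics named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact definitions used in the paper, so the formulas here are assumptions: mutual information is taken as pointwise mutual information between the n-gram and its characters treated as independent, and border entropy as the Shannon entropy of the characters seen immediately to the left or right of the n-gram. The function name `ngram_statistics` is hypothetical.

```python
import math
from collections import Counter, defaultdict


def ngram_statistics(text, n=2):
    """Return {ngram: (frequency, pmi, left_entropy, right_entropy)}.

    Assumed definitions (not taken from the paper):
      pmi     = log2( P(ngram) / product of P(char) over its characters )
      entropy = Shannon entropy (bits) of the neighbouring-character distribution
    """
    char_counts = Counter(text)
    total_chars = len(text)

    gram_counts = Counter()
    left_neighbours = defaultdict(Counter)
    right_neighbours = defaultdict(Counter)

    for i in range(len(text) - n + 1):
        gram = text[i:i + n]
        gram_counts[gram] += 1
        if i > 0:  # a character exists to the left
            left_neighbours[gram][text[i - 1]] += 1
        if i + n < len(text):  # a character exists to the right
            right_neighbours[gram][text[i + n]] += 1

    total_grams = sum(gram_counts.values())

    def entropy(counter):
        total = sum(counter.values())
        if total == 0:
            return 0.0
        return -sum(c / total * math.log2(c / total) for c in counter.values())

    stats = {}
    for gram, freq in gram_counts.items():
        p_gram = freq / total_grams
        p_independent = math.prod(char_counts[ch] / total_chars for ch in gram)
        pmi = math.log2(p_gram / p_independent)
        stats[gram] = (
            freq,
            pmi,
            entropy(left_neighbours[gram]),
            entropy(right_neighbours[gram]),
        )
    return stats


if __name__ == "__main__":
    sample = "北京大学的学生在北京大学学习"
    for gram, values in sorted(ngram_statistics(sample).items(), key=lambda kv: -kv[1][0]):
        print(gram, values)
```

Under these assumptions, a sequence with high frequency, high mutual information, and high entropy on both borders behaves like a free-standing word, while low entropy on one side suggests it is a fragment of a longer unit. How the paper maps such values to radix types and combination rules is not stated in the abstract.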
(PKU). Based on a maximum entropy approach, our word segmenter achieved the highest F measure for AS...
Among the language texts in natural language, Chinese texts are written in a continuous way...
This thesis proposes an approach to generating n-gram features for Conditional Random Fields (CRFs) ...
Textual information written in Chinese now represents a huge knowledge repository. The first step of...
In this paper, we propose an unsupervised segmentation approach, named "n-gram mutual information", ...
Automatic Chinese Word Segmentation is one of the basic research issues on text categorization, auto...
This paper describes our participation in the Chinese word segmentation task of CIPS-SIGHAN 2010. We...
Parsing, the task of identifying syntactic components, e.g., noun and verb phrases, in a sentence, i...
A Chinese sentence is typically written as a sequence of characters. However, a word, a logical sema...
Proposed an approach to Chinese word segmentation based on statistics and rules. The appro...
In this paper, we propose a joint model for unsupervised Chinese word segmentation (CWS). Inspired b...
In this paper, we present an unsupervised segmentation system tested on Mandar...