Chinese word segmentation is a difficult, important and widely-studied sequence modeling problem. This paper demonstrates the ability of linear-chain conditional random fields (CRFs) to perform robust and accurate Chinese word segmentation by providing a principled framework that easily supports the integration of domain knowledge in the form of multiple lexicons of characters and words. We also present a probabilistic new word detection method, which further improves performance. Our system is evaluated on four datasets used in a recent comprehensive Chinese word segmentation competition. State-of-the-art performance is obtained.
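As a rough illustration of the sequence-labeling framing described in the abstract, segmentation can be cast as tagging each character as word-initial or not, with lexicon membership exposed as features. The sketch below is not the authors' system: the tag names `B`/`I`, the helper functions `words_to_tags`, `tags_to_words`, and `lexicon_features`, and the toy lexicon are all assumptions made for illustration. It shows only the label encoding and one kind of lexicon feature that a linear-chain CRF could consume, not the CRF itself.

```python
def words_to_tags(words):
    """Encode a segmented sentence as one tag per character.

    Assumed tag scheme: "B" marks the first character of a word,
    "I" marks every other character.
    """
    tags = []
    for word in words:
        if not word:
            continue  # skip empty strings so they do not produce a stray "B"
        tags.extend(["B"] + ["I"] * (len(word) - 1))
    return tags


def tags_to_words(chars, tags):
    """Decode a character sequence and its tags back into words."""
    words = []
    for ch, tag in zip(chars, tags):
        if tag == "B" or not words:
            words.append(ch)   # start a new word
        else:
            words[-1] += ch    # extend the current word
    return words


def lexicon_features(chars, i, lexicon, max_len=4):
    """Hypothetical feature function for position i.

    Emits the character identity plus one indicator for each length n
    such that the n characters starting at i form a lexicon entry.
    """
    feats = {"char=" + chars[i]: 1.0}
    for n in range(1, max_len + 1):
        if i + n <= len(chars) and "".join(chars[i:i + n]) in lexicon:
            feats["lex_start_len=%d" % n] = 1.0
    return feats


if __name__ == "__main__":
    sentence = ["我们", "喜欢", "北京"]
    chars = list("".join(sentence))
    tags = words_to_tags(sentence)
    print(tags)
    print(tags_to_words(chars, tags))
    print(lexicon_features(chars, 4, {"北京"}))
```

In a full system, per-character feature dictionaries like these would be fed to a CRF trainer, and predicted tags would be decoded back into words with a function such as `tags_to_words`.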
There is rich knowledge encoded in on-line web data. For example, punctuation and entity tags in Wi...
Myanmar texts are different from English texts in that they have no spaces to mark the boundaries of w...
In this paper, we propose a joint model for unsupervised Chinese word segmentation (CWS). Inspired b...
This paper presents a novel approach to Chinese word segmentation (CWS) that attempts to utilize glo...
This thesis proposes an approach to generating n-gram features for Conditional Random Fields (CRFs) ...
In this paper, we describe a Chinese word segmentation system that we developed for the Third SIGHA...
This paper describes the Chinese Word Segmenter for the fourth International Chinese Language Proces...
A Chinese sentence is typically written as a sequence of characters. However, a word, a logical sema...
Almost all Chinese language processing tasks involve word segmentation of the language input as thei...
This paper presents a Chinese word segmentation system submitted to the closed training evaluations ...
In this paper, we proposed a Chinese word segmentation model for micro-blog text. Although Conditio...