There is rich knowledge encoded in online web data. For example, punctuation and entity tags in Wikipedia data define some word boundaries in a sentence. In this paper we adopt partial-label learning with conditional random fields to make use of this valuable knowledge for semi-supervised Chinese word segmentation. The basic idea of partial-label learning is to optimize a cost function that marginalizes the probability mass in the constrained space that encodes this knowledge. By integrating some domain adaptation techniques, such as EasyAdapt, our result reaches an F-measure of 95.98% on the CTB-6 corpus, a significant improvement over both the supervised baseline and a previously proposed approach, namely constrained decode.
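The marginalization idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear-chain model with a two-tag scheme (B = word-begin, I = word-internal), precomputed emission and transition scores, and hypothetical function names. The partial-label log-likelihood is the log of the probability mass of all tag sequences consistent with the known boundaries, computed as the difference between a constrained and an unconstrained forward pass.

```python
import math

# Assumed 2-tag scheme for illustration: index 0 = B (word-begin), 1 = I (word-internal).
NEG_INF = float("-inf")


def logsumexp(xs):
    """Numerically stable log(sum(exp(x))) that tolerates -inf entries."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def log_partition(emit, trans, allowed):
    """Forward algorithm restricted to the tags permitted at each position.

    emit[t][y]   : score of tag y at position t
    trans[p][y]  : score of moving from tag p to tag y
    allowed[t]   : set of tag indices permitted at position t
    """
    n_tags = len(trans)
    alpha = [emit[0][y] if y in allowed[0] else NEG_INF for y in range(n_tags)]
    for t in range(1, len(emit)):
        alpha = [
            logsumexp([alpha[p] + trans[p][y] for p in range(n_tags)]) + emit[t][y]
            if y in allowed[t]
            else NEG_INF
            for y in range(n_tags)
        ]
    return logsumexp(alpha)


def partial_label_log_likelihood(emit, trans, constraints):
    """log P(constrained space | x) = log Z_constrained - log Z_all."""
    all_tags = set(range(len(trans)))
    unconstrained = [all_tags] * len(emit)
    return log_partition(emit, trans, constraints) - log_partition(
        emit, trans, unconstrained
    )


# Toy usage: 3 characters, uniform scores; only the first character is known
# to start a word (for example, because it follows punctuation).
emit = [[0.0, 0.0] for _ in range(3)]
trans = [[0.0, 0.0], [0.0, 0.0]]
constraints = [{0}, {0, 1}, {0, 1}]
ll = partial_label_log_likelihood(emit, trans, constraints)
```

With uniform scores, 4 of the 8 possible tag sequences satisfy the constraint, so `ll` equals `log(0.5)`. A fully labeled sentence is the special case in which every position allows exactly one tag, which recovers the usual supervised CRF log-likelihood.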
Almost all Chinese language processing tasks involve word segmentation of the language input as thei...
In this paper we report an empirical study on semi-supervised Chinese word segmentation using co-tr...
Currently, the best performing models for Chinese word segmentation (CWS) are extremely resource in...
This paper presents a novel approach to Chinese word segmentation (CWS) that attempts to utilize glo...
This paper presents a novel approach to Chinese word segmentation (CWS) that attempts to utilize glo...
Nowadays supervised sequence labeling models can reach competitive performance on the task of Chines...
This paper presents a Chinese word segmentation system submitted to the closed training evaluations ...
Nowadays supervised sequence labeling models can reach competitive performance on the task of Chines...
Chinese word segmentation is a difficult, important and widely-studied sequence modeling problem. Th...
Chinese word segmentation is a difficult, important and widely-studied sequence modeling problem. T...
In this article, we focus on Chinese word segmentation by systematically incorporating non-local inf...
This paper presents a bilingual semi-supervised Chinese word segmentation (CWS) method that leverage...
In this paper, we describe a Chinese word segmentation system that we developed for the Third SIGHA...
Chinese word segmentation is a difficult, important and widely-studied sequence modeling problem. Th...
This thesis proposes an approach to generating n-gram features for Conditional Random Fields (CRFs) ...