Though Cantonese is the most influential variety of Chinese other than Mandarin, there are only a limited number of Cantonese corpora available for linguistic studies. Among the essential steps of building a corpus, word segmentation is a necessary but highly challenging task due to the lack of clear word boundaries in Cantonese. This paper reports the construction and evaluation of an open-source automatic word segmenter developed for Cantonese. The tool is a component of the multilingual SPPAS program designed to be used directly by linguists. It is free software distributed under a GPL license. The effectiveness of the tool was evaluated by comparing the result of segmenting some samples of a spoken Canton...
Chinese word segmentation is the first step for Chinese text processing. The accuracy of Chinese wor...
This paper describes the development of CU2C, a dual-condition Cantonese speech database for speaker...
In this paper, we present an integrated method to machine translation from Cantonese to English text...
A new software for annotation and segmentation in Cantonese. Thanks to the Variamu project and a coll...
This thesis presents a colloquial language modeling technique for spontaneous Cantonese speech recog...
It is hard to collect corpora used to train good language models for many minority languages. Canton...
In this paper, we will present the up-to-date status for the development of several large-scale Cant...
This paper describes our recent work on developing a large vocabulary speech database for Cantonese....
viii, 91 leaves : ill. ; 30 cm. PolyU Library Call No.: [THS] LG51 .H577M COMP 2001 Li. As computer tec...
This paper introduces Cancorp, a one million character child language corpus constructed to study th...
We consider the possibility of Cantonese and English reciprocally influencing vowel space in Toronto...
This paper describes our recent work on automatic recognition of Cantonese. Cantonese is one of the ...
Tsik Chung Wai Benjamin. Thesis (M.Phil.)--Chinese University of Hong Kong, 1994. Includes bibliograph...
This paper introduces the development of ShefCE: a Cantonese-English bilingual speech corpus from L2...