The evolution of language follows the rule of gradual change. Grammar, vocabulary, and lexical semantics shift over time, resulting in a diachronic linguistic gap. As such, a considerable number of texts are written in the languages of different eras, which creates obstacles for natural language processing tasks such as word segmentation and machine translation. Although the Chinese language has a long history, previous Chinese natural language processing research has primarily focused on tasks within a specific era. Therefore, we propose a cross-era learning framework for Chinese word segmentation (CWS), CROSSWISE, which uses the Switch-memory (SM) module to incorporate era-specific linguistic knowledge. Experiments on four corpora...
Shen Ruiqing. The monosyllabicization of Old Chinese and the birth of Chinese Writing: A hypothesi...
In translation, considering the document as a whole can help to resolve ambiguities and inconsistenc...
This work is part of a broader project which requires adapting information ext...
Languages change over time and ancient languages have been studied in linguistics and other related ...
Almost all Chinese language processing tasks involve word segmentation of the language input as thei...
To investigate the role of linguistic knowledge in data augmentation (DA) for Natural Language Proce...
The Chinese language, unlike English, is written without marked word boundaries, and Chinese word se...
Supervised Chinese word segmentation has entered the deep learning era which reduces the hassle of f...
The linguist James Huang categorized languages into “cool” languages (i.e., languages that rely more...
This paper presents a bilingual semi-supervised Chinese word segmentation (CWS) method that leverage...
By comparing the languages of the world, we gain invaluable insights into human prehistory, predatin...
Machine translation (MT) models usually translate a text by considering isolated sentences based on ...