Abstract. An HMM-based Single Character Recovery (SCR) model is proposed in this paper to extract a large set of “atomic abbreviation pairs” from a text corpus. An “atomic abbreviation pair” is an abbreviated word paired with its root word (i.e., its unabbreviated form), where the abbreviation is a single Chinese character. This task is important because Chinese abbreviations cannot be enumerated exhaustively, but the abbreviation process for Chinese compound words seems to be “compositional”: one can often decode an abbreviated word, such as “台大” (Taiwan University), character by character back to its root form, “台灣” plus “大學”. With a large “atomic abbreviation dictionary,” one may be able to handle problems associated with multiple-character abbr...
Correctly predicting abbreviations given the full forms is important in many natural language proce...
Identifying full names/abbreviations for entities is a challenging problem in ...
Unknown term translation is important to CLIR and MT systems, but it is still an unsolved problem. R...
An HMM-based single character recovery (SCR) model is proposed in this paper to extract ...
This paper presents a hybrid approach to Chinese abbreviation expansion. In this study, each short-f...
As a special form of unknown words, Chinese abbreviations represent significant problems for Chinese...
In Chinese, phrases and named entities play a central role in information retrieval. Abbreviations, ...
Abbreviation is a common linguistic phenomenon with wide popularity and high rate of growth. Correct...
Abstract. Chinese abbreviations are frequently used without being defined, which has brought much di...
We propose a new Chinese abbreviation prediction method which can incorporate rich local information...
This paper presents an n-gram based approach to Chinese abbreviation expansion. In this study, we di...
Chinese abbreviations are widely used in modern Chinese texts. Compared with English abbreviations (...
Correctly predicting abbreviations given the full forms is important in many natural language proces...
Chinese abbreviations are frequently used without being defined, which has brought much difficulty i...