In natural language processing, information about a person’s geographical origin is an important feature for named entity transliteration and question answering. We propose a language-independent name origin clustering and classification framework. Provided with a small number of bilingual name translation pairs with labeled origins, we measure origin similarities based on the perplexities of name character language and translation models. We group similar origins into clusters, then train a Bayesian classifier with different features. The classifier achieves 84% classification accuracy with source names only, and 91% with both source and target name pairs. We apply the origin clustering and classification technique to a name transliteration task. The...
This paper presents a two-step approach to determining whether a transliterated personal name from d...
Identification of transliterated names is a particularly difficult task of Named Entity Recognition ...
Transliteration of named entities in user queries is a vital step in any Cross-Language Information ...
Existing named entity (NE) transliteration approaches often exploit a general model to transliterate...
Existing named entity (NE) transliteration approaches often exploit a general model to transliterate...
We present an exploratory tool that extracts person names from multilingual news collections, matche...
Existing named entity (NE) transliteration approaches often exploit a general model to transliterate...
Transliteration is the process of expressing a proper name from a source language in the characters ...
In a global setting, texts contain transliterated names from many cultural origins. Correct translit...
Foreign name transliterations typically include multiple spelling variants. These variants c...
The exploration of name origins holds immense value in understanding the rich cultural and h...
This paper describes the development of a ground truth dataset of culturally diverse Romanized names...
In several author name disambiguation studies, some ethnic name groups such as East Asian names are ...