Abstract: Foreign name transliterations typically include multiple spelling variants. These variants cause data sparseness and inconsistency problems, increase the Out-of-Vocabulary (OOV) rate, and present challenges for Machine Translation, Information Extraction and other natural language processing (NLP) tasks. This work aims to identify and cluster name spelling variants using a Statistical Machine Translation method: word alignment. The variants are identified by being aligned to the same “pivot” name in another language (the source language in Machine Translation settings). Based on word-to-word translation and transliteration probabilities, as well as the string edit distance metric, names with similar spellings in the target language ...
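The pivot-based grouping described in the abstract can be illustrated with a minimal sketch. It assumes the word alignment step has already produced (pivot, target name) pairs, and it uses only a normalized edit distance to decide whether two names are variants; the translation and transliteration probabilities mentioned in the abstract are not modeled here. The function names, the input format, and the `max_norm_dist` threshold are hypothetical choices for illustration, not the authors' method.

```python
from collections import defaultdict


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,         # deletion
                            curr[j - 1] + 1,     # insertion
                            prev[j - 1] + cost)) # substitution or match
        prev = curr
    return prev[-1]


def cluster_variants(alignments, max_norm_dist=0.34):
    """Group target-language names that share a pivot and are spelled alike.

    alignments: iterable of (pivot, target_name) pairs (assumed input format).
    max_norm_dist: assumed threshold on edit distance divided by the longer
    name's length; it is an illustrative value, not one taken from the paper.
    """
    by_pivot = defaultdict(set)
    for pivot, name in alignments:
        by_pivot[pivot].add(name.lower())

    clusters = {}
    for pivot, names in by_pivot.items():
        groups = []
        for name in sorted(names):
            # Greedy assignment: compare against each group's first member.
            for group in groups:
                rep = group[0]
                dist = edit_distance(name, rep) / max(len(name), len(rep))
                if dist <= max_norm_dist:
                    group.append(name)
                    break
            else:
                groups.append([name])
        clusters[pivot] = groups
    return clusters


if __name__ == "__main__":
    pairs = [("p1", "Qaddafi"), ("p1", "Gaddafi"),
             ("p1", "Kadhafi"), ("p1", "leader")]
    print(cluster_variants(pairs))
```

The greedy comparison against a single representative keeps the sketch short; a fuller treatment would probably combine the distance with alignment probabilities before merging names, as the abstract suggests.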
Automatic transliteration and back-transliteration across languages with different phonemes and alph...
Any cross-language processing application has to first tackle the problem of transliteration when fa...
This paper presents a method to improve a word alignment model in a phrase-based Statistical Machine...
Foreign name translations typically include multiple spelling variants. These variants cause data sp...
Transcribing named entities from one language into another is called transliteration. This thesis pr...
We propose a language-independent method for the automatic extraction of transliteration pairs from ...
This paper describes a framework for modeling the machine transliteration problem. The p...
Existing named entity (NE) transliteration approaches often exploit a general model to transliterate...
This paper studies transliteration alignment, its evaluation metrics and applications. We propose ...
In a global setting, texts contain transliterated names from many cultural origins. Correct translit...
This paper presents a framework for extracting English and Chinese transliterated word pairs from pa...
Transliteration of named entities in user queries is a vital step in any Cross-Language Information ...