As developers of a highly multilingual named entity recognition (NER) system, we face an evaluation resource bottleneck problem: we need evaluation data in many languages, the annotation should not be too time-consuming, and the evaluation results across languages should be comparable. We solve the problem by automatically annotating the English version of a multi-parallel corpus and by projecting the annotations into all the other language versions. For the translation of English entities, we use a phrase-based statistical machine translation system as well as a lookup of known names from a multilingual name database. For the projection, we incrementally apply different methods: perfect string matching, perfect consonant signature matching...
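The first two projection stages named above (perfect string matching, then perfect consonant signature matching) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names `consonant_signature` and `project_entity`, the vowel set, the diacritic stripping, and the fixed-length token window are all hypothetical choices, and any later stages of the original method are left out because the abstract is truncated.

```python
import re
import unicodedata

# Assumption: "y" is treated as a vowel; the source does not define the vowel set.
VOWELS = set("aeiouy")


def tokenize(text):
    """Split text into word tokens, dropping punctuation."""
    return re.findall(r"\w+", text)


def consonant_signature(text):
    """Lowercase, strip diacritics, and keep only consonant letters."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(
        ch
        for ch in decomposed
        if ch.isalpha() and not unicodedata.combining(ch) and ch not in VOWELS
    )


def project_entity(candidates, target_sentence):
    """Try to locate one of the candidate translations in the target sentence.

    Stages are applied incrementally: exact string matching is tried for
    every candidate first, and consonant signature matching only if no
    exact match was found. Returns (matched_span, method) or None.
    """
    tokens = tokenize(target_sentence)

    def windows(n):
        # All spans of n consecutive tokens (a simplifying assumption:
        # the match must have as many tokens as the candidate).
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])

    # Stage 1: perfect string match.
    for candidate in candidates:
        cand_tokens = tokenize(candidate)
        for span in windows(len(cand_tokens)):
            if span == " ".join(cand_tokens):
                return span, "exact"

    # Stage 2: perfect consonant signature match.
    for candidate in candidates:
        signature = consonant_signature(candidate)
        if not signature:
            continue  # nothing to compare for vowel-only candidates
        for span in windows(len(tokenize(candidate))):
            if consonant_signature(span) == signature:
                return span, "consonant_signature"

    return None
```

For example, `project_entity(["Gerhard Schroder"], "Bundeskanzler Gerhard Schröder sagte")` finds no exact match but succeeds at the second stage, because both spellings reduce to the same consonant signature.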
We are presenting a mature text analysis application that relies heavily on multilingual Named Entit...
To advance information extraction and question answering technologies toward a more realistic path, ...
In this paper we illustrate and evaluate an approach to the creation of high quality linguistically ...
Proceedings of the Workshop on Annotation and Exploitation of Parallel Corpora AEPC 2010. Editors:...
Parallel corpora, often exploited for Machine Translation, have recently been used for monolingual...
The lack of hand curated data is a major impediment to developing statistical semantic processors f...
The increasing diversity of languages used on the web introduces a new level of complexity to Inform...
We present a named-entity recognition (NER) system for parallel multilingual text. Our sys...
Translation of named entities (NE), including proper names, temporal and numerical expressions, is v...
Named entity recognition is a challenging task in the field of NLP. As other machine learning proble...
This work presents parallel corpora automatically annotated with several NLP tools, including lemma ...
This paper presents a multilingual system designed to recognize named entities...
In this paper, we describe a technique to improve named entity recognition in a resource-poor langua...
In this paper, we study direct transfer methods for multilingual named entity recognition. Specifica...
We present an effort for the development of multilingual named entity grammars in a unification-base...