In this paper, we present several parallel corpora for English↔Hindi and describe their nature and domains. We also briefly discuss a few previous attempts at MT from English to Hindi. The lack of uniformly annotated data makes it difficult to compare these attempts and to analyze their strengths and shortcomings precisely. With this in mind, we propose a standard pipeline that provides uniform linguistic annotations for these resources using state-of-the-art NLP technologies. We conclude by presenting evaluation scores of different statistical MT systems for English→Hindi on the corpora detailed in this paper, and by outlining plans for future work. We hope that both these annotated parallel corpora resources and MT...
In this paper, we report our work on incorporating syntactic and morphological information for Eng...
A corpus is a large collection of homogeneous and authentic written texts (or speech) of a particular ...
Hindi and Urdu share a common phonology, morphology and grammar but are written in different script...
Statistical machine translation to morphologically richer languages is a challenging task and more s...
In this paper, we describe our English-Hindi and Hindi-English statistical systems submitted...
Parallel corpora are often injected with bilingual dictionaries for improved Indian language machine...
Recent work has established the efficacy of Amazon’s Mechanical Turk for constructing parallel corpo...
Even though a lot of Statistical Machine Translation (SMT) research work is happening for English-Hindi...
We present HindEnCorp, a parallel corpus of Hindi and English, and HindMonoCorp, a monolingual corpu...
In recent years, multilingual content on the internet has grown exponentially together with th...
The importance of translation was recognized long ago, but translation was mostly done manually. Tra...