EnTam is a sentence-aligned English-Tamil bilingual corpus that we collected from publicly available websites for NLP research involving Tamil. A standard set of processing steps was applied to the raw web data before it was released as a sentence-aligned English-Tamil parallel corpus suitable for various NLP tasks. The parallel corpus includes texts from the Bible, cinema, and news domains.
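As a rough illustration of what "sentence-aligned" means in practice, the sketch below pairs English and Tamil sentences by position. It assumes, hypothetically, that each side of the corpus is available as a sequence of lines with one sentence per line and the same number of lines on both sides; the actual file layout of the EnTam release is not described here. The function name `pair_sentences` is illustrative and not part of the corpus distribution.

```python
from typing import Iterable, Iterator, Tuple


def pair_sentences(
    en_lines: Iterable[str], ta_lines: Iterable[str]
) -> Iterator[Tuple[str, str]]:
    """Yield (English, Tamil) sentence pairs from two line-aligned sequences.

    Assumption (not confirmed by the corpus description): line i on the
    English side is the translation of line i on the Tamil side.
    """
    # Strip trailing newlines and surrounding whitespace on both sides.
    en_list = [line.strip() for line in en_lines]
    ta_list = [line.strip() for line in ta_lines]

    # A line-count mismatch means the positional alignment cannot be trusted.
    if len(en_list) != len(ta_list):
        raise ValueError(
            f"line count mismatch: {len(en_list)} English vs {len(ta_list)} Tamil"
        )

    for en, ta in zip(en_list, ta_list):
        # Skip pairs where either side is empty after stripping.
        if en and ta:
            yield en, ta
```

With files on disk, the two arguments could be file objects opened with `encoding="utf-8"`, since Tamil text requires a Unicode-aware encoding. The length check is a deliberate design choice: silently truncating to the shorter side, as a plain `zip` would, could hide an alignment error.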
Various experiments from literature suggest that in statistical machine translation (SMT), applying ...
The corpus based techniques in Machine Translation involves parallel corpora, but it is not applicab...
Parallel corpus is a critical resource in machine learning based translation. The task of collecting...
We have collected English-Tamil bilingual data from some of the publicly available websites for NLP ...
English-Hindi parallel corpus collected from several sources. Tokenized and sentence-aligned. A part...
A sentence aligned parallel corpus is an important prerequisite in statistical machine translation. ...
A parallel corpus aligned at both sentence and word level is an important prerequisite in statistica...
Corpus is a large collection of homogeneous and authentic written texts (or speech) of a particular ...
In this paper we present several parallel corpora for English↔Hindi and talk about their natures and...
This paper describes our submission for the English-Tamil news translation task of WMT-2020. The var...
HindEnCorp parallel texts (sentence-aligned) come from the following sources: Tides, which contains...
Word alignment in bilingual corpora has been a very active research topic in the Machine Translation...
TamilTB is a first published syntactically annotated corpus of Tamil. TamilTB will allow a more rapi...