A growing body of machine translation research aims to exploit lexical patterns (e.g., n-grams and phrase pairs) with gaps (Simard et al., 2005; Chiang, 2005; Xiong et al., 2011). Typically, these “gappy patterns” are discovered using heuristics based on word alignments or local statistics such as mutual information. In this paper, we develop generative models of monolingual and parallel text that build sentences using gappy patterns of arbitrary length and with arbitrarily many gaps. We exploit Bayesian nonparametrics and collapsed Gibbs sampling to discover salient patterns in a corpus. We evaluate the patterns qualitatively and also add them as features to an MT system, reporting promising preliminary results.
The goal of a machine translation (MT) system is to automatically translate a document written in so...
We present an unsupervised word segmentation model for machine translation. The model uses existing ...
The translation of a text can be viewed as a detailed annotation of the text's meaning. From this...
A growing body of machine translation research aims to exploit lexical patterns (e.g., n-grams and ...
The development of broad domain statistical machine translation systems is gated by the availability...
Thesis (Master's)--University of Washington, 2018. In an emergency, machine translation systems can be...
The subject of investigation of this thesis is the building blocks of translation in Statistical Mac...
Building models of language is a central task in natural language processing. Traditionally, languag...
We investigate why weights from generative models underperform heuristic estimates in phrase-based ...
We investigate the task of unsupervised constituency parsing from bilingual parallel corpora. Our go...
In this paper we present research results with gApp, a text-preprocessing system designed for automa...
In this paper we discuss sentence generation strategy for pattern-based machine translation and thei...
We discuss a probabilistic graphical model for recognizing patterns in texts. It is derived from th...
We give a probabilistic analysis of parameters related to $\alpha$-gapped repe...
Grammars for machine translation can be materialized on demand by finding source phrases in an index...