Cross-language learning allows one to use training data from one language to build models for a different language. Many approaches to bilingual learning require that we have word-level alignment of sentences from parallel corpora. In this work we explore the use of autoencoder-based methods for cross-language learning of vectorial word representations that are coherent between two languages, while not relying on word-level alignments. We show that by simply learning to reconstruct the bag-of-words representations of aligned sentences, within and between languages, we can in fact learn high-quality representations and do without word alignments. We empirically investigate the success of our approach on ...
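The core mechanism the abstract describes, training an autoencoder to reconstruct the bag-of-words vectors of aligned sentences both within and across languages, can be illustrated with a short sketch. The code below is a toy under stated assumptions, not the authors' implementation: the vocabulary sizes, the sigmoid encoder, the squared-error loss, and the three-sentence corpus are all illustrative choices.

# A minimal sketch of cross-language bag-of-words reconstruction with an
# autoencoder. Per-language weight matrices share one hidden space; no
# word-level alignment is used, only sentence-aligned pairs.
import numpy as np

rng = np.random.default_rng(0)

V_EN, V_FR, DIM = 6, 6, 4          # toy vocabulary sizes and embedding dimension (assumed)
LR, EPOCHS = 0.1, 500

# Per-language encoder/decoder weights; rows of W[lang] act as word embeddings.
W = {"en": rng.normal(0, 0.1, (V_EN, DIM)), "fr": rng.normal(0, 0.1, (V_FR, DIM))}
C = {"en": rng.normal(0, 0.1, (DIM, V_EN)), "fr": rng.normal(0, 0.1, (DIM, V_FR))}

def encode(bow, lang):
    """Hidden representation of a bag-of-words vector (sigmoid nonlinearity)."""
    return 1.0 / (1.0 + np.exp(-bow @ W[lang]))

def step(bow_in, lang_in, bow_out, lang_out):
    """One gradient step: reconstruct bow_out (lang_out) from bow_in (lang_in)."""
    h = encode(bow_in, lang_in)
    recon = h @ C[lang_out]                      # linear reconstruction
    err = recon - bow_out                        # gradient of squared error
    dC = np.outer(h, err)
    dh = (err @ C[lang_out].T) * h * (1 - h)     # backprop through the sigmoid
    dW = np.outer(bow_in, dh)
    C[lang_out] -= LR * dC
    W[lang_in] -= LR * dW

def bow(idxs, size):
    """Binary bag-of-words vector for a list of word indices."""
    v = np.zeros(size)
    v[idxs] = 1.0
    return v

# Toy sentence-aligned corpus: word indices into each language's vocabulary.
pairs = [([0, 1], [0, 2]), ([2, 3], [1, 3]), ([4, 5], [4, 5])]

for _ in range(EPOCHS):
    for en_idx, fr_idx in pairs:
        x, y = bow(en_idx, V_EN), bow(fr_idx, V_FR)
        step(x, "en", x, "en")   # within-language reconstruction
        step(y, "fr", y, "fr")
        step(x, "en", y, "fr")   # cross-language reconstruction of the
        step(y, "fr", x, "en")   # aligned sentence, no word alignment needed

# After training, rows of W["en"] and W["fr"] live in a shared space and can be
# compared with cosine similarity, e.g. for cross-language retrieval.

Because every loss term only asks the model to reproduce whole-sentence word counts, the word-alignment step that many earlier bilingual methods require simply never appears in the training loop.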
The ability to accurately align concepts between languages can provide significant benefits in many ...
We introduce bilingual word embeddings: semantic embeddings associated across two languages in the c...
Building bilingual lexica from non-parallel data is a long-standing natural language processing rese...
Recent work on learning multilingual word representations usually relies on the use of word-level al...
Cross-lingual word embeddings aim to bridge the gap between high-resource and low-resource languages...
Cross-lingual embeddings are vector space representations where word translations tend to be co-loca...
Thesis (Master's)--University of Washington, 2020. This work presents methods for learning cross-lingu...
One of the notable developments in current natural language processing is the practical efficacy of ...
We propose a new model for learning bilingual word representations from non-parallel document-aligne...
Traditional approaches to supervised learning require a generous amount of labeled data for good gen...
Recent work in learning bilingual representations tends to tailor towards achieving good performanc...
Some Transformer-based models can perform crosslingual transfer learning: thos...