We propose the SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation learning framework. Unlike previous works on speech representation learning, which learns multilingual contextual speech embedding at the resolution of an acoustic frame (10-20ms), this work focuses on learning multimodal (speech-text) multilingual speech embedding at the resolution of a sentence (5-10s) such that the embedding vector space is semantically aligned across different languages. We combine state-of-the-art multilingual acoustic frame-level speech representation learning model XLS-R with the Language Agnostic BERT Sentence Embedding (LaBSE) model to create an utterance-level multimodal multilingual speech encoder SAMU-XL...
Recent breakthroughs in deep learning often rely on representation learning and knowledge transfer. ...
One of the notable developments in current natural language processing is the practical efficacy of ...
We present Emu, a system that semantically enhances multilingual sentence embeddings. Our framework ...
Over the past few years, self-supervised learned speech representations have emerged as fruitful rep...
Multilingual sentence embeddings capture rich semantic information not only for measuring similarity...
Word embeddings - dense vector representations of a word’s distributional semantics - are an indespe...
Abstract Meaning Representation (AMR) is a popular formalism of natural language that represents the...
Word embeddings - dense vector representations of a word’s distributional semantics - are an indespe...
Word embeddings - dense vector representations of a word’s distributional semantics - are an indespe...
The scarcity of labeled training data across many languages is a significant roadblock for multiling...
Current multilingual vision-language models either require a large number of additional parameters ...
In this thesis, we present a transformers-based multi-lingual embedding model to represent sentences...
Being able to learn generic representations of objects such as images, words or sentences is essenti...
Word embeddings represent words in a numeric space so that semantic relations between words are repr...
Recent work on learning multilingual word representations usually relies on the use of word-level al...
Recent breakthroughs in deep learning often rely on representation learning and knowledge transfer. ...
One of the notable developments in current natural language processing is the practical efficacy of ...
We present Emu, a system that semantically enhances multilingual sentence embeddings. Our framework ...
Over the past few years, self-supervised learned speech representations have emerged as fruitful rep...
Multilingual sentence embeddings capture rich semantic information not only for measuring similarity...
Word embeddings - dense vector representations of a word’s distributional semantics - are an indespe...
Abstract Meaning Representation (AMR) is a popular formalism of natural language that represents the...
Word embeddings - dense vector representations of a word’s distributional semantics - are an indespe...
Word embeddings - dense vector representations of a word’s distributional semantics - are an indespe...
The scarcity of labeled training data across many languages is a significant roadblock for multiling...
Current multilingual vision-language models either require a large number of additional parameters ...
In this thesis, we present a transformers-based multi-lingual embedding model to represent sentences...
Being able to learn generic representations of objects such as images, words or sentences is essenti...
Word embeddings represent words in a numeric space so that semantic relations between words are repr...
Recent work on learning multilingual word representations usually relies on the use of word-level al...
Recent breakthroughs in deep learning often rely on representation learning and knowledge transfer. ...
One of the notable developments in current natural language processing is the practical efficacy of ...
We present Emu, a system that semantically enhances multilingual sentence embeddings. Our framework ...