In this project we study universal, language-agnostic sentence embeddings: internal neural-network sentence representations that are independent of both the task and the language. More precisely, we focus on how combining the sentence embeddings of different models can improve benchmark results on well-known tasks. We also validate our results by applying the methods to two self-created tasks involving a minority language, Occitan. We used a total of four different architectures that produce four different encodings - each with its own characteristics and dimensions - and explored their behaviour when ensembled via concatenation or addition. This methodology is a simple approach that shows remarkable im...
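The two ensembling operations mentioned above can be sketched as follows. This is a minimal illustration, not the authors' exact pipeline: the function name, the L2-normalisation step, and the toy dimensions are assumptions added for clarity.

```python
import numpy as np

def combine_embeddings(embeddings, mode="concat"):
    """Combine sentence embeddings produced by different encoders.

    embeddings: list of 1-D numpy arrays, one per encoder
    mode: "concat" joins them along the feature axis (dimensions may differ);
          "add" sums them element-wise (requires equal dimensions).
    """
    # L2-normalise each embedding so no encoder dominates by scale
    # (an illustrative choice, not necessarily the authors' setup)
    normed = [e / np.linalg.norm(e) for e in embeddings]
    if mode == "concat":
        return np.concatenate(normed)
    if mode == "add":
        return np.sum(normed, axis=0)
    raise ValueError(f"unknown mode: {mode}")

# Toy example: two encoders with different output dimensions
a = np.random.rand(4)  # hypothetical 4-dim encoder output
b = np.random.rand(6)  # hypothetical 6-dim encoder output
combined = combine_embeddings([a, b], mode="concat")  # 10-dim vector
```

Concatenation preserves each encoder's information at the cost of a larger vector, while addition keeps the dimensionality fixed but requires the encoders to share an output size.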
Computational models for automatic text understanding have gained a lot of interest due to unusual p...
Sentence-level representations are necessary for various NLP tasks. Recurrent neural networks have p...
There has been an exponential surge of text data in recent years. As a consequence, unsupervised...
Being able to learn generic representations of objects such as images, words or sentences is essenti...
We propose a new neural model for word embeddings, which uses Unitary Matrices as the primary device...
Many modern NLP systems rely on word embeddings, previously trained in an unsupervised manner on lar...
Paper presented at: 56th Annual Meeting of the Association for Computational Linguistics celeb...
Historically, models of human language have assumed that sentences have a symbolic structure and that this...
The scarcity of labeled training data across many languages is a significant roadblock for multiling...
The field of Natural Language Processing (NLP) has progressed rapidly in recent years due to the evo...
Although much effort has recently been devoted to training high-quality senten...
We present a Recurrent Neural Network (RNN) that performs thematic role assignment and can be used f...
Distributional semantic models trained using neural network techniques yield ...
This work was supported in part by the Leibniz Gemeinschaft via the SAW-2016-ZPID-2 project a...
Multilingual sentence embeddings capture rich semantic information not only for measuring similarity...