Recently, information retrieval has seen the emergence of dense retrievers, based on neural networks, as an alternative to classical sparse methods based on term-frequency. These models have obtained state-of-the-art results on datasets and tasks where large training sets are available. However, they do not transfer well to new applications with no training data, and are outperformed by unsupervised term-frequency methods such as BM25. In this work, we explore the limits of contrastive learning as a way to train unsupervised dense retrievers and show that it leads to strong performance in various retrieval settings. On the BEIR benchmark our unsupervised model outperforms BM25 on 11 out of 15 datasets for the Recall@100 metric. When used as...
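The excerpt above names contrastive learning as the training signal for an unsupervised dense retriever but cuts off before any detail. A minimal sketch of the standard InfoNCE objective with in-batch negatives, the usual formulation for this kind of bi-encoder (the temperature value and the random embeddings standing in for encoder outputs are illustrative, not taken from the source):

```python
import torch
import torch.nn.functional as F

def info_nce_loss(q_emb: torch.Tensor, d_emb: torch.Tensor,
                  temperature: float = 0.05) -> torch.Tensor:
    """InfoNCE loss with in-batch negatives.

    q_emb: (B, H) query embeddings; d_emb: (B, H) embeddings of the
    matching (positive) documents. Every other document in the batch
    serves as a negative for a given query.
    """
    # Similarity matrix: entry (i, j) scores query i against document j.
    scores = q_emb @ d_emb.T / temperature                    # (B, B)
    # The positive for query i is document i, so the target is the diagonal.
    targets = torch.arange(q_emb.size(0), device=q_emb.device)
    return F.cross_entropy(scores, targets)

# Toy usage with random unit vectors standing in for encoder outputs.
q = F.normalize(torch.randn(8, 768), dim=-1)
d = F.normalize(torch.randn(8, 768), dim=-1)
print(info_nce_loss(q, d).item())
```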
In settings where Semi-Supervised Learning (SSL) is available to exploit unlabeled data, this...
Dense retrieval uses a contrastive learning framework to learn dense representations of queries and ...
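To make the interface this line refers to concrete: dense retrieval scores a query against documents by an inner product between fixed-size embeddings. A hedged sketch in which a trained encoder is mocked by mean-pooled random token vectors (the placeholder encoder, dimension, and example texts are assumptions for illustration, not anything from the cited work):

```python
import numpy as np

_DIM = 128
_token_vecs = {}                       # shared token table so queries and docs align
_rng = np.random.default_rng(0)

def encode(texts):
    """Stand-in encoder: maps each token to a fixed random vector and mean-pools.

    A trained bi-encoder would replace this; only the vector-space
    interface (text -> fixed-size embedding) matters here.
    """
    mat = np.zeros((len(texts), _DIM))
    for i, text in enumerate(texts):
        vecs = [_token_vecs.setdefault(t, _rng.standard_normal(_DIM))
                for t in text.lower().split()]
        mat[i] = np.mean(vecs, axis=0)
    # L2-normalise so the inner product behaves like cosine similarity.
    return mat / np.linalg.norm(mat, axis=1, keepdims=True)

docs = ["bm25 is a sparse retrieval baseline",
        "dense retrieval encodes queries and documents as vectors",
        "contrastive learning trains encoders without labels"]
scores = encode(["dense contrastive retrieval"]) @ encode(docs).T   # (1, 3)
print(scores.argsort()[:, ::-1])     # document indices ranked by similarity
```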
Pre-training on larger datasets with ever-increasing model size is now a proven recipe for increased...
Recent research demonstrates the effectiveness of using pretrained language models (PLM) to improve ...
We propose a fully unsupervised framework for ad-hoc cross-lingual information retrieval (CLIR) whic...
Dense retrieval models have predominantly been studied for English, where models have shown great su...
With the rapid growth of world-wide information accessibility, cross-language information retriev...
In this work, we evaluate contrastive models for the task of image retrieval. We hypothesise that mod...
Neural language models do not scale well when the vocabulary is large. Noise contrastive estimation ...
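The snippet breaks off, but the named technique is standard: noise contrastive estimation sidesteps the full-vocabulary softmax by training the model to distinguish observed words from samples drawn from a noise distribution. A hedged sketch, assuming a bilinear scorer, k = 5 noise samples per example, and a uniform noise distribution (a unigram distribution is the more common choice; all of these are illustrative assumptions):

```python
import torch
import torch.nn.functional as F

def nce_loss(scores_fn, context, targets, noise_dist, k=5):
    """Binary NCE loss for a batch of (context, target) word pairs.

    scores_fn(context, words) -> unnormalised scores s(w, c).
    noise_dist: (V,) tensor of noise probabilities q(w).
    """
    B = targets.size(0)
    # Draw k noise words per example from the noise distribution.
    noise = torch.multinomial(noise_dist, B * k, replacement=True).view(B, k)

    s_pos = scores_fn(context, targets.unsqueeze(1)).squeeze(1)   # (B,)
    s_neg = scores_fn(context, noise)                             # (B, k)
    log_kq_pos = torch.log(k * noise_dist[targets])               # (B,)
    log_kq_neg = torch.log(k * noise_dist[noise])                 # (B, k)

    # Logistic classification: data words should win, noise words should lose.
    loss_pos = F.logsigmoid(s_pos - log_kq_pos)
    loss_neg = F.logsigmoid(-(s_neg - log_kq_neg)).sum(dim=1)
    return -(loss_pos + loss_neg).mean()

# Toy setup: bilinear scorer s(w, c) = <h_c, e_w> over a 10k vocabulary.
V, H, B, k = 10_000, 64, 32, 5
emb_in = torch.nn.Embedding(V, H)    # context-word embeddings
emb_out = torch.nn.Embedding(V, H)   # output-word embeddings

def scores_fn(context, words):
    h = emb_in(context).unsqueeze(1)   # (B, 1, H)
    e = emb_out(words)                 # (B, k, H)
    return (h * e).sum(-1)             # (B, k)

noise_dist = torch.ones(V) / V         # uniform noise, illustrative only
ctx = torch.randint(0, V, (B,))
tgt = torch.randint(0, V, (B,))
print(nce_loss(scores_fn, ctx, tgt, noise_dist, k).item())
```

The payoff is that each training step touches only k + 1 vocabulary entries instead of all V, which is exactly why NCE helps when the vocabulary is large.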
Dense retrievers for open-domain question answering (ODQA) have been shown to achieve impressive per...
In the last decade, deep neural networks (DNNs) have been proven to outperform...
Despite their recent popularity and well-known advantages, dense retrievers still lag behind sparse ...
Cross-lingual information retrieval is a difficult task typically involving query translation into m...
The advent of contextualised language models has brought gains in search effectiveness, not just whe...