Despite their recent popularity and well-known advantages, dense retrievers still lag behind sparse methods such as BM25 in their ability to reliably match salient phrases and rare entities in the query and to generalize to out-of-domain data. It has been argued that this is an inherent limitation of dense models. We rebut this claim by introducing the Salient Phrase Aware Retriever (SPAR), a dense retriever with the lexical matching capacity of a sparse model. We show that a dense Lexical Model Λ can be trained to imitate a sparse one, and SPAR is built by augmenting a standard dense retriever with Λ. Empirically, SPAR shows superior performance on a range of tasks including five question answering datasets, MS MARCO passag...
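Since the abstract describes building SPAR by augmenting the base dense retriever with the lexical model Λ, a minimal sketch of embedding concatenation follows; the encoders are random stand-ins and the concatenation weight is an illustrative value, not taken from the paper.

```python
import numpy as np

# Hypothetical stand-ins: real SPAR pairs a trained dense retriever with
# the dense Lexical Model (Lambda) trained to imitate a sparse retriever.
rng = np.random.default_rng(0)
dense_encoder = lambda texts: rng.normal(size=(len(texts), 4))
lexical_encoder = lambda texts: rng.normal(size=(len(texts), 4))

def spar_embed(texts, is_query=False, concat_weight=0.7):
    """Concatenate base dense and lexical embeddings. The inner product of
    two concatenated vectors decomposes as dense_score + w * lexical_score,
    so weighting the query-side lexical half tunes the lexical contribution
    without rebuilding the passage index."""
    dense = dense_encoder(texts)
    lex = lexical_encoder(texts)
    if is_query:
        lex = concat_weight * lex
    return np.concatenate([dense, lex], axis=1)

q = spar_embed(["who wrote hamlet"], is_query=True)
p = spar_embed(["Hamlet was written by William Shakespeare."])
print(q @ p.T)  # combined retrieval score
```

The decomposition ⟨[q_d; w·q_λ], [p_d; p_λ]⟩ = ⟨q_d, p_d⟩ + w·⟨q_λ, p_λ⟩ is why a single concatenated index recovers a weighted sum of dense and lexical scores.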
Dense retrievers for open-domain question answering (ODQA) have been shown to achieve impressive per...
Recent advances in dense retrieval techniques have offered the promise of being able not just to re-...
Recent work has shown that small distilled language models are strong competitors to models that are...
In this work, we propose a simple method that applies a large language model (LLM) to large-scale re...
Pseudo Relevance Feedback (PRF) is known to improve the effectiveness of bag-of-words retrievers. At...
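As a concrete illustration of PRF in the bag-of-words setting, the classical Rocchio update below expands the query with terms from the top-ranked (pseudo-relevant) documents; the function name and the alpha/beta weights are illustrative defaults, not from this abstract.

```python
from collections import Counter

def rocchio_expand(query_terms, feedback_docs, alpha=1.0, beta=0.75):
    """Rocchio-style PRF: shift the query term-weight vector toward the
    centroid of the top-k documents returned by a first-pass retriever."""
    expanded = Counter({t: alpha for t in query_terms})
    for doc_terms in feedback_docs:
        for term, tf in Counter(doc_terms).items():
            expanded[term] += beta * tf / len(feedback_docs)
    return expanded

# Usage: feed the top-k documents of an initial BM25 run as feedback.
print(rocchio_expand(["dense", "retrieval"],
                     [["dense", "embedding", "retrieval"],
                      ["sparse", "retrieval", "bm25"]]))
```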
Text retrieval is a long-standing research topic in information seeking, where a system is required ...
Language model fusion helps smart assistants recognize words which are rare in acoustic data but abu...
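The most common variant, shallow fusion, rescores recognizer hypotheses by interpolating the ASR model's log-probability with an external language model; a standard formulation (the weight λ is a tuned hyperparameter, and naming this variant is an inference, since the abstract only says "language model fusion") is

$$\hat{y} \;=\; \arg\max_{y}\,\bigl[\log p_{\text{ASR}}(y \mid x) \;+\; \lambda \log p_{\text{LM}}(y)\bigr].$$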
We introduce the sparse modern Hopfield model as a sparse extension of the modern Hopfield model. Li...
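For context, the (dense) modern Hopfield model retrieves a stored pattern with a softmax update, and the sparse extension replaces the softmax with a sparse normalizing map such as sparsemax; writing stored patterns as the columns of X (a notational convention assumed here, not quoted from the paper):

$$\xi^{\text{new}} = X\,\operatorname{softmax}\bigl(\beta X^{\top}\xi\bigr) \quad\longrightarrow\quad \xi^{\text{new}} = X\,\operatorname{sparsemax}\bigl(\beta X^{\top}\xi\bigr).$$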
Conversational search (CS) needs a holistic understanding of conversational inputs to retrieve relev...
Dense retrieval (DR) converts queries and documents into dense embeddings and measures the similarit...
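A minimal sketch of this embed-then-score pipeline (the embeddings below are random placeholders; a real system would produce them with a trained encoder):

```python
import numpy as np

def retrieve(query_vec, doc_matrix, k=5):
    """Score one query embedding against all document embeddings by
    inner product and return the indices of the top-k documents."""
    scores = doc_matrix @ query_vec          # shape: (num_docs,)
    return np.argsort(-scores)[:k]

rng = np.random.default_rng(0)
doc_embs = rng.normal(size=(1000, 128))     # precomputed passage embeddings
query_emb = rng.normal(size=128)            # embedding of the incoming query
print(retrieve(query_emb, doc_embs))
```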
The advent of contextualised language models has brought gains in search effectiveness, not just whe...
Recently, information retrieval has seen the emergence of dense retrievers, based on neural networks...
Pseudo-relevance feedback (PRF) is a classical technique to improve search engine retrieval effectiv...
Dense retrieval uses a contrastive learning framework to learn dense representations of queries and ...
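The standard instantiation of that contrastive framework is an InfoNCE loss with in-batch negatives, sketched below; the temperature value is a common default rather than something stated in this abstract.

```python
import numpy as np

def in_batch_nce_loss(q_embs, p_embs, temperature=0.05):
    """InfoNCE with in-batch negatives: each query's positive passage sits
    on the diagonal of the similarity matrix; all other passages in the
    batch act as negatives."""
    sims = (q_embs @ p_embs.T) / temperature           # (B, B)
    sims -= sims.max(axis=1, keepdims=True)            # numerical stability
    log_probs = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
q = rng.normal(size=(8, 64))   # query embeddings for one batch
p = rng.normal(size=(8, 64))   # aligned positive passage embeddings
print(in_batch_nce_loss(q, p))
```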
Large-scale retrieval aims to recall relevant documents from a huge collection given a query. It relie...
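Recalling from a huge collection usually relies on a (possibly approximate) nearest-neighbor index over precomputed document embeddings; a minimal sketch with FAISS, assuming the faiss package is available (IndexFlatIP is exact brute force, with approximate variants such as IVF substituted at larger scale):

```python
import numpy as np
import faiss

dim = 128
doc_embs = np.random.rand(100_000, dim).astype("float32")

index = faiss.IndexFlatIP(dim)   # exact inner-product index
index.add(doc_embs)              # index the whole collection once

query = np.random.rand(1, dim).astype("float32")
scores, ids = index.search(query, 10)  # ids of the top-10 documents
print(ids)
```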