Abstract We propose succinct data structures for text retrieval systems supporting document listing queries and ranking queries based on the tf*idf (term frequency times inverse document frequency) scores of documents. Traditional data structures for these problems support queries only for some predetermined keywords. Recently Muthukrishnan proposed a data structure for document listing queries for arbitrary patterns at the cost of data structure size. For computing the tf*idf scores there have been no efficient data structures for arbitrary patterns. Our new data structures support these queries using small space. The space is only 2/ϵ times the size of compressed documents plus 10n bits for a document collection of length n, for any 0<ϵ⩽1. T...
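To make the two query types concrete, the following is a minimal brute-force sketch of document listing and tf*idf scoring for an arbitrary pattern. It is not the succinct data structure described in the abstract: it scans every document and uses no index. The function names `term_frequency` and `tf_idf_scores` are hypothetical, and the weighting idf = log(D / df) is an assumed common variant, since the abstract does not state the exact formula.

```python
import math


def term_frequency(pattern: str, doc: str) -> int:
    """Count occurrences of pattern in doc, including overlapping ones."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    count = 0
    start = doc.find(pattern)
    while start != -1:
        count += 1
        # Advance by one character so overlapping matches are counted.
        start = doc.find(pattern, start + 1)
    return count


def tf_idf_scores(pattern: str, docs: list[str]) -> dict[int, float]:
    """Map each document index containing pattern to its tf*idf score.

    Assumption: idf = log(D / df), where D is the number of documents
    and df is the number of documents containing the pattern.
    """
    tfs = [term_frequency(pattern, d) for d in docs]
    df = sum(1 for t in tfs if t > 0)
    if df == 0:
        return {}
    idf = math.log(len(docs) / df)
    # The keys of the result are the answer to the document listing query.
    return {i: t * idf for i, t in enumerate(tfs) if t > 0}


docs = ["abracadabra", "cadabra", "banana"]
scores = tf_idf_scores("abra", docs)
listing = sorted(scores)  # indices of documents containing the pattern
```

The scan costs time proportional to the total collection length per query, which is the cost an index for arbitrary patterns is meant to avoid.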
Abstract This paper is about compressed full-text indexes. That is, our goal is to represent full-te...
Given a set D = {d1, d2, ..., dD} of D strings of total length n, our task is to report ...
Keywords: in information retrieval for decades. We propose a novel term weighting method based on wh...
Abstract We give new space/time tradeoffs for compressed indexes that answer document retrieval queri...
Succinct data structures are used today in many information retrieval applications, e.g., posting li...
Term frequency – Inverse Document Frequency (TFIDF) is a vital first step in text analytics for info...
Text search engines return a set of k documents ranked by similarity to a query. Typically, document...
We address the problem of indexing a collection D = {T1,T2,...,TD} of D string documents of total leng...
We present a framework to dynamize succinct data structures, to encourage their use over...
Given a collection of strings, document listing refers to the problem of finding all the strings (or...
The original publication is available at www.springerlink.com. We present a framework to dynamize succ...
In the document retrieval problem [9], we are given a collection of documents (strings) ...
Let D = {T1,T2,...,TD} be a collection of D string documents of n characters in total. The forbidden...
Let D={T1,T2,…,TD} be a collection of D documents having n characters in total. Given two patterns P...