We show how full-text search based on inverted indices can be accelerated, without losing results, by clustering the documents (SeCluD – Search with Clustered Documents). We develop a fast multilevel clustering algorithm that explicitly uses the query cost of conjunctive queries as its objective function. Depending on the input, the resulting search is up to four times faster than non-clustered search. The resulting clusters are also useful for data compression and for distributing the work over many machines.
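To make the idea of lossless cluster-accelerated conjunctive search concrete, here is a minimal sketch. It is not the authors' implementation. It assumes a two-level layout in which a cluster-level index maps each term to the clusters containing it, and separate posting lists are kept per cluster. The names `build_clustered_index`, `conjunctive_query`, and `cluster_of` are hypothetical, and the cluster assignment is taken as given rather than computed by the multilevel algorithm.

```python
from collections import defaultdict


def build_clustered_index(docs, cluster_of):
    """Build a two-level index (illustrative layout, an assumption).

    docs:       {doc_id: iterable of terms}
    cluster_of: {doc_id: cluster_id}, assumed to be precomputed
    """
    cluster_index = defaultdict(set)  # term -> clusters that contain the term
    postings = defaultdict(set)       # (cluster, term) -> doc ids in that cluster
    for doc_id, terms in docs.items():
        cluster = cluster_of[doc_id]
        for term in set(terms):
            cluster_index[term].add(cluster)
            postings[(cluster, term)].add(doc_id)
    return cluster_index, postings


def conjunctive_query(terms, cluster_index, postings):
    """Return all documents containing every query term (AND query)."""
    terms = list(terms)
    if not terms:
        return set()
    # Step 1: keep only clusters in which every query term occurs.
    candidates = set.intersection(*(cluster_index.get(t, set()) for t in terms))
    # Step 2: intersect posting lists only inside the surviving clusters.
    result = set()
    for cluster in candidates:
        result |= set.intersection(*(postings[(cluster, t)] for t in terms))
    return result
```

A cluster is skipped only when at least one query term is absent from it, so no matching document can be lost. Any saving comes from the posting lists that are never touched, which is why a clustering that groups documents with similar vocabulary would be expected to help.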
We develop a new algorithm for clustering search results. Unlike many other clustering sys...
Document clustering, which is also referred to as text clustering, is a technique of unsupervised doc...
Clustering is a central problem in unsupervised learning for discovering interesting patterns in...
Our research shows that for large databases, without considerable additional storage overhead, clust...
Approximate algorithms for clustering large-scale document collections are proposed and evaluated un...
Clustering is a powerful technique for large-scale topic discovery from text. It involves two phases...
An index or topic hierarchy of full-text documents can organize a domain and speed information retri...
The efficiency of various cluster-based retrieval (CBR) strategies is analyzed. The possibility of c...
This work addresses the problem of reducing the time between query submission and results output in ...
Conventional document retrieval systems (e.g., Alta Vista) return long lists of ranked documents in ...
The processing time and disk space requirements of an inverted index and top-down cluster search ar...
Development of cluster-based search systems has been hampered by prohibitive times involved in clu...
The process of clustering documents in a manner which produces accurate and compact clusters becomes...