The real-time search problem requires making ingested documents immediately searchable, which presents architectural challenges for systems built around inverted indexing. In this paper, we explore a radical proposition: What if we abandon document inversion and instead adopt an architecture based on brute force scans of document representations? In such a design, “indexing” simply involves appending the parsed representation of an ingested document to an existing buffer, which is simple and fast. Quite surprisingly, experiments with TREC Microblog test collections show that query evaluation with brute force scans is feasible and performance compares favorably to a traditional search architecture based on an inverted index, especiall...
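To make the scan-based design concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes whitespace tokenization, a simple summed term-frequency score, and integer document ids that grow with arrival time; the class name `ScanIndex` and its methods are hypothetical. It illustrates only the two properties described above: ingestion is a single append, and a document is searchable as soon as it is appended.

```python
import heapq
from collections import Counter


class ScanIndex:
    """Hypothetical append-only buffer of parsed documents, queried by linear scan."""

    def __init__(self):
        # Each entry is (doc_id, Counter mapping term -> frequency).
        self.docs = []

    def add(self, doc_id, text):
        # "Indexing" is parsing plus one append; no postings lists are built.
        # Assumption: lowercase whitespace tokenization stands in for real parsing.
        self.docs.append((doc_id, Counter(text.lower().split())))

    def search(self, query, k=10):
        terms = query.lower().split()
        scored = []
        # Brute force: visit every buffered document for every query.
        for doc_id, tf in self.docs:
            # Assumption: score is the summed frequency of the query terms.
            score = sum(tf[t] for t in terms)
            if score > 0:
                scored.append((score, doc_id))
        # Top-k by score; ties go to the larger (more recent) doc_id.
        return [(doc_id, score) for score, doc_id in heapq.nlargest(k, scored)]


idx = ScanIndex()
idx.add(1, "real time search on tweets")
idx.add(2, "inverted index compression")
idx.add(3, "brute force search search")
print(idx.search("search", k=2))
```

Because there is no inverted structure to update, the cost of ingestion does not depend on how many documents are already buffered; the trade-off is that each query touches the whole buffer.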
The technology underlying text search engines has advanced dramatically in the past decade. The deve...
Communicating in short messages, such as on microblogs, is becoming increasingly popular. Twit...
© 2018, Springer Science+Business Media, LLC, part of Springer Nature. The convolutional neural netw...
There is an increasing trend of social media usage in recent years and users desire a search system ...
MIREX (MapReduce Information Retrieval Experiments) is a software library initially developed by the...
The impressive rise of user-generated content on the web in the hands of sites like Twitter imposes ...
Abstract. In the real-time tweet search task operationalized in the TREC Microblog evaluations, a to...
Compression reduces both the size of indexes and the time needed to evaluate queries. In this paper,...
Abstract. In this paper, we outline our experiments carried out at the TREC Microblog Track 2011. Ou...
Text search engines return a set of k documents ranked by similarity to a query. Typically, document...
© 2017 ACM. In this article, we study the problem of efficient top-k disjunctive query processing in...
DOI: 10.1145/1989323.1989391. Proceedings of the ACM SIGMOD International Conference on Management of Data 6...
The most widely used similarity measure in the field of natural language processing may be cosine s...
© 2017 ACM. Many real applications in real-time news stream advertising call for efficient processin...