Existing solutions for answering SPARQL queries in a shared-nothing environment using MapReduce fail to fully exploit the substantial scalability and parallelism of the computing framework. In this paper, we propose a cost-model-based RDF join processing solution using MapReduce that minimizes query response time as much as possible. We transform a SPARQL query into a sequence of MapReduce jobs and propose a novel index structure, called the All Possible Join tree (APJ-tree), to reduce the search space for the optimal execution plan of a set of MapReduce jobs. To speed up join processing, we employ hybrid join and Bloom filters for performance optimization. Extensive experiments on real data sets demonstrated the effectiveness of our...
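The abstract does not say how the Bloom filter is applied, so the following is only a minimal single-process sketch of the general idea, assuming the common pattern in which a filter built over the join keys of the smaller input is used to discard non-matching tuples of the larger input before the join itself. The names `BloomFilter` and `bloom_filtered_join`, the filter size, and the hashing scheme are all illustrative assumptions and are not taken from the paper.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter over string keys (illustrative, not the paper's)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        # May return True for a key never added (false positive),
        # but never returns False for a key that was added.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def bloom_filtered_join(small, large, num_bits=1024, num_hashes=3):
    """Join two lists of (join_key, value) pairs.

    Returns (results, survivors), where survivors counts the tuples of
    `large` that passed the filter. In a MapReduce setting, only these
    tuples would need to be shuffled to the reducers (assumption).
    """
    bloom = BloomFilter(num_bits, num_hashes)
    index = {}
    for key, value in small:
        bloom.add(key)
        index.setdefault(key, []).append(value)

    # "Map side": drop tuples whose key cannot match the small input.
    survivors = [(key, value) for key, value in large if key in bloom]

    # "Reduce side": exact hash join, which also removes false positives.
    results = [
        (key, small_value, large_value)
        for key, large_value in survivors
        for small_value in index.get(key, [])
    ]
    return results, len(survivors)
```

For example, joining `[("alice", "Person")]` with `[("alice", "bob"), ("carol", "dave")]` yields the single result `("alice", "Person", "bob")`. The filter can let a non-matching tuple through, which is why the exact join step is still required.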
In recent times, it has been widely recognized that, due to their inherent scalability, fra...
A common way to achieve scalability for processing SPARQL queries over large R...
The join ordering problem is a fundamental challenge that has to be solved by any query optimizer. S...
The expansion of the services of the Semantic Web and the evolution of cloud computing technologies ...
Social network data analysis becomes increasingly important today. In order to improve the integrati...
With the proliferation of the RDF data format, engines for RDF query processing are faced with very ...
The use of RDF to expose semantic data on the Web has seen a dramatic increase over the last few yea...
The growth of real-time data generation and stored data leads us to be constan...
One of the major challenges in large-scale data processing with MapReduce is the smart c...
As massive linked open data is available in RDF, the scalable storage and efficient ret...
Join query is one of the most expressive and expensive data analytic tools in traditional database s...
We present D-SPARQ, a distributed RDF query engine that combines the MapReduce processing ...