With the widespread use of shared-nothing clusters of servers, there has been a proliferation of distributed object stores that offer high availability, reliability and enhanced performance for MapReduce-style workloads. However, relational workloads cannot always be evaluated efficiently using MapReduce without extensive data migrations, which cause network congestion and reduced query throughput. We study the problem of computing data placement strategies that minimize the data communication costs incurred by typical relational query workloads in a distributed setting. Our main contribution is a reduction of the data placement problem to the well-studied problem of GRAPH PARTITIONING, which is NP-Hard but for which efficient approximatio...
Data placement for optimal performance is an old problem. For example, the problem dealt with the pla...
Due to their better price/performance ratios and scalability, Shared-Nothing computer systems are be...
Web-scale RDF datasets are increasingly processed using distributed RDF data stores built on top of ...
Increasing need for large-scale data analytics in a number of application domains has led to a dram...
We present a data replication framework for distributed graph processing. First we partiti...
We introduce optimal algorithms for the problems of data placement (DP) and page placement (PP) in n...
In recent years, scalable RDF stores in the cloud have been developed, where graph data is distrib...
Distance join queries have recently been recognized as a particularly useful operation over graph da...
Load imbalance in an application can lead to degradation of performance and a significant drop in sy...