Distributed processing frameworks process data in parallel by dividing it into multiple partitions, each of which is processed in a separate task. The number of tasks is always determined by the total file size. However, this can launch more tasks than needed in the case of hybrid layouts, because such layouts allow less data to be read for certain operations (i.e., projection, selection). Over-provisioning tasks may increase the job execution time and waste significant computing resources, because each task introduces extra overhead (e.g., initialization, garbage collection). To allow a more efficient use of resources and reduce the job execution time, we propose a cost-based approach th...
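The over-provisioning described above can be illustrated with a small arithmetic sketch. It is not the cost-based approach of the paper, whose details are cut off here. It only contrasts a task count derived from the total file size with one derived from the bytes an operation is expected to read. The function names, the 128 MiB split size, and the 25% read fraction are assumptions chosen for illustration.

```python
import math


def tasks_by_file_size(total_bytes: int, split_bytes: int) -> int:
    """Task count when every split of the stored file gets its own task."""
    return max(1, math.ceil(total_bytes / split_bytes))


def tasks_by_bytes_read(total_bytes: int, split_bytes: int, read_fraction: float) -> int:
    """Task count sized to the bytes an operation is expected to read.

    read_fraction is a hypothetical estimate of the share of the file that a
    projection or selection touches on a hybrid layout.
    """
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    return max(1, math.ceil(total_bytes * read_fraction / split_bytes))


MIB = 1024 * 1024
total = 10 * 1024 * MIB  # a 10 GiB file (illustrative)
split = 128 * MIB        # assumed split size per task

baseline = tasks_by_file_size(total, split)          # sized by total file size
projected = tasks_by_bytes_read(total, split, 0.25)  # sized by bytes read
print(baseline, projected)
```

Under these assumptions, sizing by total file size launches four times as many tasks as the data read would justify, and each extra task pays its own initialization and garbage-collection overhead.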
MapReduce simplifies parallel programming, abstracting the programmer responsibilities as sy...
In managing multiprocessing of parallel distributed systems the central issue is the scheduling of j...
The most common large-volume data processing programs perform counting, sorting, merging, etc. Such programs re...
The rapid increase in the data volumes encountered in many application domains has led to widespread...
Modern big data frameworks (such as Hadoop and Spark) allow multiple users to do large-scale analysi...
The goal of languages like Fortran D or High Performance Fortran (HPF) is to provide a simple yet ef...
Across the landscape of computing, parallelism within applications is increasingly important in orde...
The task-based approach is a parallelization paradigm in which an algorithm is...
This work was also published as a Rice University thesis/dissertation: http://hdl.handle.net/1911/1...
Ad-hoc analysis implies processing data in near real-time. Thus, raw data (i.e., neither normalized ...
Parallel processing is capable of executing a large number of tasks on a multiprocessor at the same ...
In order to have an optimal execution time of a program running on a multiprocessor system, the pro...
Accelerator-enhanced computing platforms have drawn a lot of attention due to ...
Load imbalance in parallel systems can be generated by factors external to the currently running app...