Much compiler-oriented work on mapping parallel programs to parallel architectures has ignored the issue of external workload. Given that most platforms are not dedicated to a single task at a time, the impact of other jobs needs to be addressed. Because mapping depends heavily on the underlying machine, a technique that ports easily across platforms is also desirable. In this paper we develop an approach for predicting the optimal number of threads for a given data-parallel application in the presence of external workload. We achieve 93.7% of the maximum available speedup, giving an average speedup of 1.66 on 4 cores, a factor of 1.24 better than the OpenMP compiler's default policy. We also develop...
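The abstract above does not spell out its prediction model, but the core idea of choosing a thread count that accounts for external workload can be illustrated with a minimal sketch. Everything here is a hypothetical illustration, not the paper's method: the heuristic simply discounts cores that the 1-minute load average suggests are already busy, and the function name `suggest_thread_count` is an assumption.

```python
# Hypothetical sketch (not the paper's predictor): pick a thread count for a
# data-parallel region by discounting cores occupied by external workload.
import os


def suggest_thread_count(total_cores: int, load_avg: float) -> int:
    """Return a thread count that leaves room for other jobs.

    `load_avg` is the 1-minute system load average, e.g. os.getloadavg()[0]
    on POSIX systems. A load of 2.0 on a 4-core machine roughly means two
    cores are already busy, so we suggest using the remaining two.
    """
    free_cores = total_cores - int(round(load_avg))
    # Never suggest fewer than 1 thread or more than the machine has.
    return max(1, min(total_cores, free_cores))


if __name__ == "__main__" and hasattr(os, "getloadavg"):
    cores = os.cpu_count() or 1
    load = os.getloadavg()[0]
    print(f"{cores} cores, load {load:.2f} -> {suggest_thread_count(cores, load)} threads")
```

A real system in this vein would replace the crude load-average heuristic with a learned, per-program performance model, but the interface (observe external load, emit a thread count) is the same.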
It is possible to reduce the computation time of data parallel programs by dividing the computation ...
Increased programmability for concurrent applications in distributed systems requires automatic supp...
Current and future architectures rely on thread-level parallelism to sustain p...
Given the wide scale adoption of multi-cores in main stream computing, parallel programs rarely exec...
For a wide variety of applications, both task and data parallelism must be exploited to achieve the ...
The parallelism in shared-memory systems has increased significantly with the ...
To be completed later. Keywords: Parallel environment, Distributed-memory machines, Load-balancing, Mapping...
The efficient mapping of program parallelism to multi-core processors is highly dependent on the und...
The emergence of multicore and manycore processors is set to change the parallel computing world. Ap...
The need for high-performance computing together with the increasing trend from single processor to ...
Future integrated systems will contain billions of transistors, composing tens to hundreds of IP cor...
A fundamental issue affecting the performance of a parallel application running on message-passing p...
Modern day hardware platforms are parallel and diverse, ranging from mobiles to data centers. Mains...