Autotuning is an established technique for optimizing the performance of parallel applications. However, programmers must prepare applications for autotuning, which is tedious and error-prone coding work. We demonstrate how applications become ready for autotuning with few or no modifications by extending Threading Building Blocks (TBB), a library for parallel programming, with autotuning. The extended TBB library optimizes all application-independent tuning parameters fully automatically. We compare manual effort, autotuning overhead, and performance gains on 17 examples. While some examples benefit only slightly, others speed up by 28% over standard TBB.
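To make "application-independent tuning parameter" concrete, the following is a minimal sketch of the general idea, not the extended library's implementation. It assumes the parameter is a chunk size (comparable to a loop grain size) and that the tuner uses plain exhaustive search; the abstract does not state which parameters or search strategy the extended TBB uses. The names `pick_best`, `chunked_sum`, and `time_chunked_sum` are hypothetical, and the chunked loop is a sequential stand-in for a parallel loop so that the sketch needs only the C++ standard library.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <limits>
#include <numeric>
#include <vector>

// Hypothetical helper: evaluate every candidate value of one tuning parameter
// and return the candidate with the lowest measured cost (exhaustive search).
// On ties, the earliest candidate wins. `candidates` must not be empty.
std::size_t pick_best(const std::vector<std::size_t>& candidates,
                      const std::function<double(std::size_t)>& measure) {
    assert(!candidates.empty());
    std::size_t best = candidates.front();
    double best_cost = std::numeric_limits<double>::infinity();
    for (std::size_t candidate : candidates) {
        const double cost = measure(candidate);
        if (cost < best_cost) {
            best_cost = cost;
            best = candidate;
        }
    }
    return best;
}

// Sequential stand-in for a parallel loop: sums `data` in chunks of `grain`
// elements. `grain` plays the role of the tuning parameter and must be > 0.
long long chunked_sum(const std::vector<int>& data, std::size_t grain) {
    assert(grain > 0);
    long long total = 0;
    for (std::size_t begin = 0; begin < data.size(); begin += grain) {
        const std::size_t end = std::min(begin + grain, data.size());
        total += std::accumulate(
            data.begin() + static_cast<std::ptrdiff_t>(begin),
            data.begin() + static_cast<std::ptrdiff_t>(end), 0LL);
    }
    return total;
}

// Cost function for the tuner: wall-clock seconds of one chunked_sum run.
double time_chunked_sum(const std::vector<int>& data, std::size_t grain) {
    const auto start = std::chrono::steady_clock::now();
    volatile long long sink = chunked_sum(data, grain);  // keep the call alive
    (void)sink;
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

A caller could pass a lambda that wraps `time_chunked_sum` for a fixed data set as the `measure` argument. Because the tuner sees only a parameter value and a cost, the application code does not change when the search strategy does, which is the property that lets a library hide tuning from the programmer.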