Parallel applications are now used daily in high-performance computing, scientific computing, and everyday tasks, owing to the pervasiveness of multi-core architectures. However, several implementation challenges have so far hindered the combination of parallel applications with automatic precision tuning. First, tuning a parallel application makes it harder to detect the regions of code that the optimization must affect. Moreover, shared variables and accumulators pose additional challenges. In this work we address these challenges by introducing OpenMP parallel programming support to the TAFFO precision tuning framework. With our approach we achieve speedups up to 750% with respect t...
Many classes of applications, both in the embedded and high performance domains, can trade off the a...
OpenMP is a popular application programming interface (API) used to write shared-memory parallel pro...
The tuning of parallel programs on large distributed-memory machines today is usually a costly, and ...
In recent years parallel computing has become ubiquitous. Led by the spread of commodity multicore ...
Parallelisation is becoming more and more important as the single core performance increase is stagn...
We present TAFFO, a framework that automatically performs precision tuning to exploit the performanc...
This paper describes a new parallel program tuning framework, with a new approach for tuning. The ap...
While parallel computing offers an attractive perspective for the future, developing efficient paral...
We present a dynamic method for tuning algorithmic parameters of parallel scientific program...
We have developed an environment, based upon robust, existing, open source software, for tuning appl...
The demand for large compute capabilities in scientific computing led to wide use and acceptance of ...
Multi-core architectures have become more popular due to better performance, reduced heat dissipatio...
Shared memory parallel programming, for instance by inserting OpenMP pragmas into program code, migh...
Modern high performance computing architectures are based on multi-core and multi-threaded computing...