This thesis concerns techniques for efficient runtime optimisation of regular parallel programs that are built from separate software components. High-quality, high-performance parallel software is frequently built from separately written reusable software components such as functions from a library of parallel routines. Apart from the strong case from the software engineering point of view for constructing software in such a way, there is often also a large performance benefit in hand-optimising individual, frequently used routines. Hitherto, a problem with such libraries of separate software components has been that there is a performance penalty, both because of invocation and indirection overheads, and because opportunities for cross-...
Abstract. Eliminating partially dead code has proved to be a powerful technique for the runtime optimi...
While parallel programming is needed to solve large-scale scientific applications, it is more diffic...
The number of transistors as well as the frequency of processors have followed Moore's law for the p...
Abstract. This paper describes a combination of methods which make interprocedural data placement op...
Abstract. This paper shows how data placement optimisation techniques which are normally only found...
We are developing a lazy, self-optimising parallel library of vector-matrix routines. The aim is ...
Available from British Library Document Supply Centre- DSC:DXN063301 / BLDSC - British Library Docum...
Distributing the workload of computationally intensive software components across a set of homogeneo...
The area of parallel and distributed computing has grown very fast in the past few decades with the ...
Abstract. We are developing a lazy, self-optimising parallel library of vector-matrix routines. The ...
While modern parallel computing systems offer high performance, utilizing these powerful computing r...
Prediction of the performance of parallel applications is a concept useful in several domains of sof...
In the past few years, code optimization has become a major field of research. Many efforts have bee...
Efficient performance tuning of parallel programs is often hard. Optimization is often done when the...
While parallel computing offers an attractive perspective for the future, developing efficient paral...