Modern processors and computer systems are designed to be efficient and achieve high performance with applications that have regular memory access patterns. For example, dense linear algebra routines can be implemented to achieve near-peak performance. While such routines have traditionally formed the core of many scientific and engineering applications, commercial workloads such as database and web servers or decision support systems (data warehouses and data mining) are among the fastest-growing market segments on high-performance computing platforms. Many of these commercial applications are characterised by more complex codes and irregular memory access patterns, which often reduce the performance they achieve. Due to thei...
Matrix multiplication is at the core of high-performance numerical computation. Software methods of ...
Special issue on Clusters and Computational Grids for Scientific Computing (CCGSC'02). In this paper w...
The internal representation of numerical data, their speed of manipulation to generate the desired r...
12 pages. The community of program optimisation and analysis, code performance evaluation, parallelisa...
In this paper we explore the characteristics of numerically intensive programs and explore their e...
This paper examines how to write code to gain high performance on modern computers as well as the im...
In this paper we present a tool for the dynamic forecast of performance of linear algebra routine a...
We have developed a hierarchical performance bounding methodology that attempts to explain the perf...
Performance comparisons are ubiquitous in computer science. The proceedings of most conferences are ...
Over the past few years, interest in and application of machine learning algorithms have risen expon...
Performance analysis tools are essential to the maintenance of efficient parallel execution of scien...
The performance of a parallel matrix-matrix-multiplication routine with the same functionality as DG...
This thesis describes novel techniques and test implementations for optimizing numerically intensive...