Performance tuning, as carried out by compiler designers and application programmers to close the gap between achievable peak performance and delivered performance, becomes increasingly important and challenging as microprocessor speeds and system sizes increase. However, although performance tuning of scientific codes usually deals with relatively small program regions, it is not generally known how to establish a reasonable performance objective or how to achieve it efficiently. We suggest a goal-directed approach and develop such an approach for each of three major system performance components: central processing unit (CPU) computation, memory accessing, and communication. For the CPU, we suggest using a machine-appl...
We have developed a hierarchical performance bounding methodology that attempts to explain the perfo...
Measurements of actual supercomputer cache performance have not previously been undertaken. PFC-Sim i...
Programmers often rely on performance analysis tools to provide feedback about the execution of thei...
ABSTRACT Goal-Directed Performance Tuning for Scientific Applications by Tien-Pao Shih Chair: Edward...
While parallel computing offers an attractive perspective for the future, developing efficient paral...
We have developed a performance bounding methodology that explains the performance of loop-dominated...
Tuning the performance of applications requires understanding the interactions between code and targ...
We present a cache performance modeling methodology that facilitates the tuning of uniprocessor cach...
The recent transformation from an environment where gains in computational performance came from inc...
Tuning the performance of applications requires understanding the interactions between code and targ...
Application performance on modern microprocessors depends heavily on performance-related characteris...
Obtaining high performance without machine-specific tuning is an important goal of scientific applic...
An effective methodology of performance evaluation and improvement enables application developers to...
The multicore era has initiated a move to ubiquitous parallelization of software. In the process, co...
Modern supercomputers deliver large computational power, but it is difficult for an application to e...