Abstract: In many scientific applications, significant time is spent tuning codes for a particular high-performance architecture. Tuning approaches range from the relatively nonintrusive (e.g., using compiler options) to extensive code modifications that attempt to exploit specific architecture features. Intrusive techniques often result in code changes that are not easily reversible, which can negatively impact readability, maintainability, and performance on different architectures. We introduce an extensible annotation-based empirical tuning system called Orio, which is aimed at improving both performance and productivity by enabling software developers to insert annotations in the form of structured comments into their source cod...
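The general idea of annotation-based empirical tuning can be illustrated with a minimal sketch. The annotation syntax (`/*@ tune ... @*/`), the helper names (`parse_annotation`, `axpy`, `tune`), and the exhaustive search below are all assumptions invented for illustration; they are not Orio's actual grammar, API, or search strategy. The sketch reads candidate parameter values from a structured comment, times one code variant per value, and reports the fastest.

```python
import re
import time

# Hypothetical annotation syntax, invented for this sketch (NOT Orio's grammar).
SOURCE = """
/*@ tune unroll in [1, 2, 4, 8] @*/
for (i = 0; i < n; i++) y[i] += a * x[i];
"""

ANNOTATION = re.compile(r"/\*@\s*tune\s+(\w+)\s+in\s+\[([^\]]*)\]\s*@\*/")


def parse_annotation(source):
    """Return (parameter name, candidate values) from the first annotation."""
    match = ANNOTATION.search(source)
    if match is None:
        raise ValueError("no tuning annotation found")
    name = match.group(1)
    values = [int(v) for v in match.group(2).split(",")]
    return name, values


def axpy(a, x, y, unroll):
    """Stand-in for a generated variant: y += a * x with a blocked (unrolled) loop."""
    n = len(x)
    main = n - n % unroll
    for i in range(0, main, unroll):
        for j in range(unroll):
            y[i + j] += a * x[i + j]
    for i in range(main, n):  # remainder iterations
        y[i] += a * x[i]
    return y


def tune(source, n=10_000, repeats=3):
    """Time every candidate value and return (name, best value, all timings)."""
    name, candidates = parse_annotation(source)
    timings = {}
    for value in candidates:
        best = float("inf")
        for _ in range(repeats):
            x, y = [1.0] * n, [2.0] * n
            start = time.perf_counter()
            axpy(0.5, x, y, value)
            best = min(best, time.perf_counter() - start)
        timings[value] = best
    return name, min(timings, key=timings.get), timings
```

Calling `tune(SOURCE)` returns the parameter name, the fastest candidate on the current machine, and the per-candidate timings. Because the annotation lives in a comment, the original loop stays readable and compiles unchanged when no tuning tool is run, which is the reversibility property the abstract emphasizes.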
Performance is the critical feature in the design and productivity of software systems. A key to imp...
Efficient large-scale scientific computing requires efficient code, yet optimizing code to render it...
Achieving peak performance from the computational kernels that dominate application performance oft...
We have developed an environment, based upon robust, existing, open source software, for tuning appl...
Large scale applications developers have many tools at their disposal to optimize and verify their s...
Abstract—Autotuning systems intelligently navigate a search space of possible implementations of a c...
We present some preliminary results of selective profiling in our efforts towards automatic performa...
The contemporary parallel I/O software stack is complex due to a large number of configurations for ...
This paper presents an automated performance tuning solution, which partitions a program into a numb...
The excessive complexity of both machine architectures and applications has made it difficult for c...
Compile-time optimizations generally improve program performance. Nevertheless, degradations caused ...
Over the last several decades we have witnessed tremendous change in the landscape of computer archi...
As the complexity of machines and architectures has increased, performance tuning has become more ch...
In today’s multicore era, parallelization of serial code is essential in order to exploit the archit...