In response to the productivity challenge of the U.S. DARPA HPCS initiative, we have developed a methodology that provides an extremely simple and pain-free interface through which scientists can collect rich performance data from selected parts of an execution, digest the data at a very high level, and plan for improvements. This process can be easily repeated, each time refining the selection of parts of the application and revising the granularity of data collected, until complete insight is gained about bottlenecks. A distinct feature of our approach is that the framework is independent of the features being examined. Recognizing that the features to be examined change with systems/applications and also with depth at which an aspect...
Design and implementation of applications comprise an anticipation of what increased perfo...
Many existing applications suffer from inherent scalability limitations that will prevent them from ...
Programmers often rely on performance analysis tools to provide feedback about the execution of thei...
Modern supercomputers deliver large computational power, but it is difficult for an application to e...
Over the past 10 years we have seen the transition from single core computer to multicore computing,...
Application performance tuning is a complex process that requires assembling various types of inform...
While parallel computing offers an attractive perspective for the future, developing efficient paral...
Modern parallel systems and applications are constantly increasing in scale and complexity, and cons...
One key to improving high performance computing (HPC) productivity is to find better ways to measure...
We present some preliminary results of selective profiling in our efforts towards automatic performa...
Parallel and distributed programming constitutes a highly promising approach to improving the perfor...
A large and important class of national challenge applications are irregular, with complex, data dep...
The 2014 TOP500 supercomputer list includes over 40 deployed petascale systems, and the high perform...
As access to supercomputing resources is becoming more and more commonplace, performance analysis to...
As machines get larger and scientific applications advance, it is more and more imperative to fully ...