TOPAS is a tool that automatically and transparently monitors the usage and performance of every parallel job executed on a CRAY T3E. We have modified the UNICOS/mk compiler wrapper scripts to automatically link the TOPAS measurement module into every user application whenever it is recompiled. No modification is necessary in the user’s program or build procedures. At run-time, two PEs of the parallel application are picked to perform the measurement for the parallel job as a whole. The measurement consists of special code executed immediately before and after the program runs, so there is no measurement overhead during the execution of the application itself. The TOPAS module is very simple (about 250 lines of code). It is bas...
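The measurement scheme described in the abstract above (a small set of PEs measuring only at program start and end) can be illustrated with a minimal sketch. This is not the TOPAS module, which is linked into applications on the T3E. The function name `run_with_boundary_measurement`, its parameters, and the choice of ranks 0 and 1 as the measuring PEs are all hypothetical and serve only to show why the application body itself incurs no measurement overhead.

```python
import time
from typing import Callable, Dict, Optional, Sequence


def run_with_boundary_measurement(
    rank: int,
    program: Callable[[], None],
    measuring_ranks: Sequence[int] = (0, 1),  # assumption: which PEs measure
    clock: Callable[[], float] = time.perf_counter,
) -> Optional[Dict[str, float]]:
    """Run `program`, taking measurements only before and after it.

    Only PEs listed in `measuring_ranks` read the clock; all other PEs
    run the program without any added work.
    """
    is_measuring = rank in measuring_ranks
    start = clock() if is_measuring else None  # "before" measurement
    program()                                  # application runs unmodified
    if not is_measuring:
        return None
    end = clock()                              # "after" measurement
    return {"rank": rank, "elapsed_seconds": end - start}
```

In this sketch a PE outside `measuring_ranks` returns `None`, and a measuring PE returns one record for the whole run, so nothing is sampled while `program` executes.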
This paper introduces an infrastructure for efficiently collecting performance profiles from paralle...
For industrial systems performance, it is desired to keep the IT infrastructure competitive through ...
Effectively and efficiently implementing parallel programs for computer systems with a large number ...
One of the reasons why parallel programming is considered to be a difficult task is that users frequ...
This paper presents an automatic counter instrumentation and profiling module added to the MPI librar...
The IPS-2 parallel program measurement tools provide performance data from application programs, th...
The performance of a Cray system is highly dependent on the tuning techniques used by individuals on...
Writing efficient parallel programs for a massively parallel system like the Cray T3E is still a dif...
A new approach to monitoring the runtime behaviour of parallel programs will be presented. Our appro...
We present accurate, low-level measurements of process preemption, interrupt handling and memory sys...
The development of efficient applications in parallel computing is due to the complex parallel hardw...
This paper presents the design, implementation, and application of TALP, a lightweight, portable, ex...
A Performance Analysis Tools (PAT) report implementing hpm and perftrace software, installed under ...
Performance observability is the ability to accurately capture, analyze, and present (collectively o...
We carry out a performance study using the Cray T3D parallel supercomputer to illustrate some import...