This paper investigates the memory performance of the shared-memory systems Cray X-MP and Cray Y-MP. Single- and multiple-CPU performance is considered, with special emphasis on performance differences that result from differences in the interconnection network between memory and CPUs in the two machines. Measurement results for kernels and application programs are presented to demonstrate the impact of these architectural changes. The effect of memory contention is studied for codes with even-stride memory access that use only part of the available memory banks.
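To illustrate why even strides use only part of the available banks, here is a minimal sketch. It assumes a simple word-interleaved layout in which word address `a` maps to bank `a mod B`. The function names and the bank count of 64 are hypothetical choices for illustration, not figures from the paper or from either machine. Under that assumption, a constant-stride access stream touches `B / gcd(stride, B)` distinct banks.

```python
from math import gcd


def banks_touched(stride: int, num_banks: int) -> int:
    """Count distinct banks hit by a constant-stride access stream.

    Assumes word-interleaved memory: address a lives in bank a % num_banks.
    """
    return num_banks // gcd(stride, num_banks)


def bank_sequence(stride: int, num_banks: int, count: int) -> list[int]:
    """Return the bank index of each of the first `count` accesses."""
    return [(i * stride) % num_banks for i in range(count)]


if __name__ == "__main__":
    NUM_BANKS = 64  # illustrative value only, not a machine specification
    for stride in (1, 2, 3, 8, 32):
        print(stride, banks_touched(stride, NUM_BANKS))
```

With a power-of-two bank count, every odd stride reaches all banks, while an even stride reaches at most half of them, so successive accesses return to the same banks sooner and are more likely to find them still busy.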
The objective of this work is to compare the performance of three common environments for supporting...
We present a new scheme for evaluating the performance of multithreaded computers and demonstrate it...
We describe our experiences in repeated cycles of performance optimization, benchmarking, and perfor...
Abstract—The Cray X1 supercomputer is a distributed shared memory vector multiprocessor, scalable to...
Abstract. Memory subsystems of contemporary processor architectures are typically equipped with a mu...
Oak Ridge National Laboratory recently installed a 32 processor Cray X1. In this paper, we describe ...
The historical trend of increasing single CPU performance has given way to roadmap of increasing co...
In this paper we investigate some of the important factors which affect the message-passing performa...
Across a broad range of applications, multicore technology is the most important factor that drives...
All methods of multi-processing need some form of processor to processor communication. In shared me...
ABSTRACT: The historical trend of increasing single CPU performance has given way to roadmap of incr...
Modern supercomputers like CRAY X-MP and CRAY Y-MP achieve their high computing speed by using both ...
Shared-memory multiprocessors built from commodity microprocessors are being increasingly used to pr...
ABSTRACT: In this paper, we describe how to write efficient, parallel codes for the Cray XMT™ syste...