The growing gap between sustained and peak performance for scientific applications has become a well-known problem in high performance computing. The recent development of parallel vector systems offers the potential to bridge this gap for a significant number of computational science codes and deliver a substantial increase in computing capabilities. This paper examines the intranode performance of the NEC SX6 vector processor and the cache-based IBM Power3/4 superscalar architectures across a number of key scientific computing areas. First, we present the performance of a microbenchmark suite that examines a full spectrum of low-level machine characteristics. Next, we study the behavior of the NAS Parallel Benchmarks using some simple opt...
In this paper, we use execution-driven simulation to study and compare vector processing performance...
Scientific programs are typically characterized as floating-point intensive loop-dominated tasks wit...
Vector machines are well known for their high-peak performance, but the delivered performance varies...
The growing gap between sustained and peak performance for scientific applications is a well-known ...
The growing gap between sustained and peak performance for scientific applications has become a well...
The growing gap between sustained and peak performance for scientific applications is a well-known p...
The last decade has witnessed a rapid proliferation of superscalar cache-based microprocessors to bu...
The last decade has witnessed a rapid proliferation of superscalar cache-based microprocessors to b...
This paper presents an experimental study on cache memory designs for vector computers. We use an ex...
The last decade has witnessed a rapid proliferation of superscalar cache-based microproces...
The last decade has witnessed a rapid proliferation of superscalar cache-based microprocess...
The last decade has witnessed a rapid proliferation of superscalar cache-based microprocessors to bui...