Vector machines are well known for their high peak performance, but the delivered performance varies greatly across workloads and depends strongly on compiler optimizations. Recently it has been claimed that several horizontal superscalar architectures, e.g., VLIW and polycyclic architectures, provide more balanced performance across a wider range of scientific workloads than vector machines do. The purpose of this research is to study the performance of register-register vector processors, such as Cray supercomputers, as a function of their architectural features, scheduling schemes, compiler optimization capabilities, and program parameters. The results of this study also provide a base for comparing vector machines with horizon...
The purpose of this paper is to show that multi-threading techniques can be applied to a vector proc...
The second generation of the Digital Equipment Corp. (DEC) DECchip Alpha AXP microprocessor is refer...
The purpose of this paper is to show that multi-threading techniques can be applied to a vector proc...
The last decade has witnessed a rapid proliferation of superscalar cache-based microprocessors to bu...
The last decade has witnessed a rapid proliferation of superscalar cache-based microprocessors to bu...
During the last decade the scientific computing community has optimized many applications for execu...
The last decade has witnessed a rapid proliferation of superscalar cache-based microprocessors to b...
The growing gap between sustained and peak performance for full-scale complex scientific applicatio...
The growing gap between sustained and peak performance for full-scale complex scientific application...
Scientific programs are typically characterized as floating-point intensive loop-dominated tasks wit...
In this paper we present the results of a detailed simulation study of the execution of vector progr...
Parallel-vector supercomputers have been the workhorses of high performance computing. As expectatio...
The growing gap between sustained and peak performance for scientific applications has become a well...
In this paper we present the results of a detailed simulation study of the execution of vector progr...
The basic architectures of vector and parallel computers and their properties are presented followed...