In this paper we present the results of a detailed simulation study of the execution of vector programs on a single processor of a Convex C3480 machine, using a subset of the Perfect Club benchmarks. We evaluate several cost/performance tradeoffs made by the machine's designers in order to assess which features of the architecture most severely limit the attainable performance. We present the detailed usage of the vector functional units and a study of the kinds of resource conflicts that stall the machine. The results show that the resources of the vector architecture are not used efficiently, mainly because of the single-bus memory architecture. Other severe limitations of the machine turn out to be the lack of chaining...
Scientific programs are typically characterized as floating-point intensive loop-dominated tasks wit...
Power consumption has become one of the dominant issues in processor design, especially imp...
The growing gap between sustained and peak performance for scientific applications has become a well...
In this paper we study the instruction level characteristics of the Perfect Club programs when compi...
The purpose of this paper is to show that multi-threading techniques can be applied to a vector proc...
This paper presents an experimental study on cache memory designs for vector computers. We use an ex...
Vector machines are well known for their high-peak performance, but the delivered performance varies...
Vector architectures have long been the architecture of choice for numerical high performance comput...
This paper presents a study of the impact of reducing the vector register size in a decoupled vector...
The purpose of this paper is to show that using decoupling techniques in a vector processor, the per...
Register renaming and out-of-order instruction issue are now commonly used in superscalar processors...
The basic architectures of vector and parallel computers and their properties are presented followed...