Poor memory bandwidth, caused by conflicts in the memory modules or in the interconnection network, degrades computer performance. Address transformation schemes, such as interleaving, skewing and linear transformations, have been proposed to achieve conflict-free access for streams with constant stride, but they achieve it only for some strides. In this paper, we summarize a mechanism that requests the elements out of order, which makes conflict-free access possible for a larger number of strides. We study the cases of a single vector processor and of a vector multiprocessor system. For the latter case, we propose a synchronous mode of accessing memory that can be applied in SIMD machines or in MIMD syste...
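To illustrate why in-order access is conflict-free only for some strides, the following minimal sketch assumes plain low-order interleaving (module index = address mod number of modules). It is not the out-of-order mechanism the abstract summarizes; the function names and the choice of 8 modules are hypothetical and used only for illustration.

```python
from math import gcd


def modules_visited(stride, num_modules, length, base=0):
    """Module index hit by each element of a constant-stride stream,
    assuming low-order interleaving (address mod num_modules)."""
    return [(base + i * stride) % num_modules for i in range(length)]


def distinct_modules(stride, num_modules):
    """Number of distinct modules a long constant-stride stream touches.

    The addresses i * stride (mod num_modules) cycle through
    num_modules / gcd(stride, num_modules) different values.
    """
    return num_modules // gcd(stride, num_modules)


if __name__ == "__main__":
    M = 8  # assumed number of memory modules
    for s in (1, 2, 3, 4, 8):
        print(f"stride {s}: {distinct_modules(s, M)} modules, "
              f"sequence {modules_visited(s, M, 8)}")
```

Under this assumption, odd strides spread the stream over all 8 modules, while a stride of 4 uses only 2 of them and a stride of 8 keeps hitting a single module, so the stream is limited to a fraction of the available bandwidth.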
Most existing analytical models for memory interference assume random bank selection for e...
Memory interleaving is a cost-efficient approach to increase bandwidth. Improving data access locali...
The purpose of this paper is to show that multi-threading techniques can be applied to a vector proc...
The poor bandwidth obtained from memory when conflicts arise in the modules or in the interconnectio...
The synchronized and simultaneous access to several vectors that form a single stream occurs in SIMD...
Address transformation schemes, such as skewing and linear transformations, have been proposed to ac...
When accessing streams in vector multiprocessor machines, degradation in the interconnection network...
Address transformation schemes, such as skewing and linear transformations, have been proposed to ac...
Address transformation schemes, such as skewing and linear transformations, have been proposed to ac...
The high latency of memory accesses is one of the factors that contributes most to reducing the perform...
The synchronized and simultaneous access to several vectors that form a single stream occurs in SIMD...
On many commercial supercomputers, several vector register processors share a global highly interlea...
Vector supercomputers, which can process large amounts of vector data efficiently, are among the fas...
The performance of a vector processor accessing vectors is strongly dependent on the conflicts produ...
Proceedings of the 1993 IEEE Region 10 Conference on Computer, Communication, Control and Power Engi...