Vectorizing code for short vector architectures, as employed by today’s multimedia extensions, comes with a number of issues. Responsibility for handling these issues is moved to the compiler in order to keep the hardware simple. One of them is memory alignment, which requires the compiler to guarantee that vectors are loaded and stored at aligned addresses. Previous work on this issue proposed a mechanism to reorder vectors at runtime to ensure proper alignment, while other work has focussed on finding a minimal number of reorderings. We combined these subjects into an in-depth study and implemented the optimization for the retargetable CoSy(R) compiler framework. Instead of solely focussing on the minimal number of reorderings, ...
Modern computers will increasingly rely on parallelism to achieve high computation rates. Techniques...
This paper presents a study of the impact of reducing the vector register size in a decoupled vector...
Register renaming and out-of-order instruction issue are now commonly used in superscalar processors...
In order to provide the best performance for memory accesses in the multimedia extensions...
An emerging trend in processor design is the incorporation of short vector instructions into the ISA...
With advances in VLSI technology, it is now possible to implement vector processors on a single chip...
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer...
When generating codes for today’s multimedia extensions, one of the major challenges is to deal with...
In this paper, we propose a compilation scheme to analyze and exploit the implicit reuse...
An emerging trend in processor design is the addition of short vector instructions to general-purpos...
Compiler-based static vectorization is used widely to extract data-level parallelism from computatio...
Memory disambiguation mechanisms, coupled with load/store queues in out-of-ord...
Data and computation alignment is an important part of compiling sequential programs to architecture...
Data-level parallelism is frequently ignored or underutilized. Achieved through vector/SIMD capabili...
The system efficiency and throughput of most architectures are critically dependent on the ability o...