Abstract. A swap instruction, which exchanges a value in memory with a value in a register, is available on many architectures. The primary application of a swap instruction has been process synchronization. In this paper we show that a swap instruction can often be used to coalesce loads and stores in a variety of applications. We describe the analysis necessary to detect opportunities to exploit a swap and the transformation required to coalesce a load and a store into a swap instruction. The results show that both the number of accesses to the memory system (data cache) and the number of executed instructions are reduced. In addition, the transformation reduces the register pressure by one register at the point the swap instruction i...
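The coalescing idea in this abstract can be illustrated with a minimal C sketch. It is not the paper's implementation: `swap_mem`, `shift_in_load_store`, and `shift_in_swap` are hypothetical names introduced here, and `swap_mem` only models the semantics of a hardware swap in plain C (it is not atomic). The first loop loads from and then stores to the same address using a temporary; the second expresses the same work as one swap per iteration, which is why one fewer live value is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a hardware swap instruction: writes `reg`
 * to *addr and returns the value that was in memory before the write.
 * This plain-C model is NOT atomic; it only shows the data movement. */
int swap_mem(int *addr, int reg) {
    assert(addr != NULL);
    int old = *addr;
    *addr = reg;
    return old;
}

/* Before coalescing: each iteration issues a load and a store to the
 * same address and needs a temporary (`tmp`) in addition to `carry`. */
int shift_in_load_store(int *a, size_t n, int carry) {
    for (size_t i = 0; i < n; i++) {
        int tmp = a[i];   /* load  */
        a[i] = carry;     /* store to the same address */
        carry = tmp;
    }
    return carry;         /* element shifted out of the end */
}

/* After coalescing: the load/store pair becomes a single swap, so the
 * temporary disappears and only `carry` stays live across the access. */
int shift_in_swap(int *a, size_t n, int carry) {
    for (size_t i = 0; i < n; i++) {
        carry = swap_mem(&a[i], carry);
    }
    return carry;
}
```

Both functions shift a new value into the front of the array and return the element pushed out of the end, so they can be compared directly on the same input.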
In this paper, we present a novel mechanism that implements register renaming, dynamic speculation a...
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Com...
Two of the most important phases of code generation for instruction level parallel processors are re...
The swap mechanism allows operating systems to manage more memory than the available RAM space, by t...
High clock frequencies combined with deep pipelining employed by many of the state-of-the-art proces...
Abstract. The swap mechanism allows an operating system to work with more memory than available RAM ...
We propose a novel cache set index scheme called SWAP (swap-based cache set index). SWAP introduces ...
This work aims to reduce the power consumed in the instruction memory of instruction set processors ...
Optimistic coalescing has proven to be an elegant and effective technique that provides better cha...
The detection of opportunities for value reuse optimizations in memory operations requires both the a...
Recent developments in register allocation, mostly linked to static single assi...
Current microprocessors incorporate techniques to exploit instruction-level parallelism...
We prove theorems that show that if we can reorder a program's memory reference stream such th...
Modern superscalar processors support a large number of in-flight instructions, which requires sizea...
Register allocation is generally considered a practically solved problem. For ...