The processor-memory gap is widening every year with no prospect of reprieve. More and more latency is being added to program runtimes as memory cannot satisfy the demands of CPUs quickly enough. In the past, this has been alleviated through caches of increasing complexity or techniques like prefetching, to give the illusion of faster memory. However, these techniques have drawbacks because they are reactive or rely on incomplete information. In general, this leads to large amounts of latency in programs due to processor stalls. It is our contention that through tracing a program's data accesses and feeding this information back to the cache, overall program runtime can be reduced. This is achieved through a new piece of hardware called a T...
To maximize the performance of a wide-issue superscalar processor, the fetch mechanism must be capab...
In this paper we address the important problem of instruction fetch for future wide issue superscal...
The trace cache is a recently proposed solution to achieving high instruction fetch bandwidth by buf...
Trace cache, an instruction fetch technique that reduces taken branch penalties by storing and fetch...
We explore the use of compiler optimizations that optimize the layout of instructions in memory. T...
This thesis evaluates an innovative cache design called the prime-mapped cache. The performance analysi...
Trace caches are used to help dynamic branch prediction make multiple predictions in a cycle by embe...
As the instruction issue width of superscalar processors increases, instruction fetch bandwidth req...
Techniques such as out-of-order issue and speculative execution aggressively exploit instruction lev...
Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer...
Value specialization is a technique which can improve a program’s performance when its code frequent...
As the issue width of superscalar processors is increased, instruction fetch bandwidth requirements ...
In order to meet the demands of wider issue processors, fetch mechanisms will need to fetch multiple...
Benefits of advances in processor technology have long been held hostage to the widening processor-m...