This work examines dynamic cluster assignment for a clustered trace cache processor (CTCP). Previously proposed cluster assignment techniques run into unique problems as issue width and cluster count increase. Realistic design conditions, such as variable data forwarding latencies between clusters and a heavily partitioned instruction window, increase the degree of difficulty for effective cluster assignment. In this work, the trace cache and fill unit are used to perform dynamic cluster assignment. The retire-time fill unit analysis is aided by a dynamic profiling mechanism embedded within the trace cache. This mechanism provides information about inter-trace data dependencies, an element absent in previous retire-time CTCP cluster a...
Recent works [14] show that delays introduced in the issue and bypass logic will become critical for...
The objective of this paper is to improve the use of the hardware resources of the trace cache mecha...
The Software Trace Cache is a compiler transformation, or a postcompilation binary optimization, tha...
This paper proposes DCC (Dynamic Cache Clustering), a novel distributed cache management scheme for ...
In a multicore system, effective management of shared last level cache (LLC), such as hardware/softw...
Clustered microarchitectures are an effective approach to reducing the penalties caused by wire dela...
Value specialization is a technique which can improve a program’s performance when its code frequent...
In order to meet the demands of wider issue processors, fetch mechanisms will need to fetch multiple...
We introduce a method for improving the cache performance of irregular computations in which data ar...
This thesis proposes a software-oriented distributed shared cache management approach for chip multi...
For the past decade, microprocessors have been improving in overall performance at a rate of ap...
The most widely used programming models expect hardware to guarantee coherent ...
The trace cache is a recently proposed solution to achieving high instruction fetch bandwidth by buf...