Abstract—To deal with the “memory wall” problem, microprocessors include large secondary on-chip caches. But as these caches grow, they create a new latency gap between themselves and the fast L1 caches (the inter-cache latency gap). Recently, Non-Uniform Cache Architectures (NUCAs) have been proposed to sustain the size growth trend of secondary caches, which is threatened by wire-delay problems. NUCAs are size-oriented, and they were not conceived to close the inter-cache latency gap. To tackle this problem, we propose Light NUCAs (L-NUCAs), which leverage on-chip wire density to interconnect small tiles through specialized networks that convey packets with distributed and dynamic routing. Our design reduces the tile delay (cache access plus one-hop...
Future embedded applications will require high performance processors integrating fast and low-power...
One of the most important issues in designing a large last-level cache in a CMP system is the increasing...
A significant part of future microprocessor real estate will be dedicated to L2 or L3...
Abstract—High-end embedded processors demand complex on-chip cache hierarchies satisfying several co...
Non-uniform cache architectures (NUCAs) are a novel design paradigm for large last-level on-chip cac...
Wire delays continue to grow as the dominant component of latency for large caches. A recent work pr...
The ever-increasing sizes of on-chip caches and the growing domination of wire delay...
Global interconnect becomes the delay bottleneck in microprocessor designs, and latency for large on...
Abstract: Non-uniform cache architecture (NUCA) aims to limit the wire-delay problem typical of lar...
Abstract—Wire delays and leakage energy consumption are both growing problems in designing large on...
Increasing on-chip wire delay and growing off-chip miss latency present two key challenges in desig...
Growing wire delay and clock rates limit the amount of cache accessible within a single cycle. Non-u...