Level-one data cache (L1 DC) accesses affect energy usage because they occur frequently and use significantly more energy than register file accesses. A memory access instruction consists of an address generation operation, which calculates the location where the data item resides in memory, and a data access operation, which loads or stores a value from or to that location. We propose decoupling these two operations into separate machine instructions to reduce energy usage. By associating the data translation lookaside buffer (DTLB) access and the L1 DC tag check with an address generation instruction, only a single data array in a set-associative L1 DC needs to be accessed during a load instruction when the result of the tag check is kn...
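The effect of moving the tag check to address generation can be illustrated with a toy counting model. The sketch below is not the paper's implementation: the class ToyL1DC, the function compare, the cache geometry (4 sets, 4 ways, 16-byte lines), and the choice to count array reads instead of energy are all assumptions made for illustration, and the DTLB and miss handling are left out.

```python
class ToyL1DC:
    """Hypothetical set-associative L1 DC that only counts array reads."""

    def __init__(self, num_sets=4, num_ways=4, line_bytes=16):
        self.num_sets = num_sets
        self.num_ways = num_ways
        self.line_bytes = line_bytes
        # tags[set][way] holds the tag stored in that way, or None if empty
        self.tags = [[None] * num_ways for _ in range(num_sets)]
        self.tag_reads = 0
        self.data_reads = 0

    def _split(self, addr):
        line = addr // self.line_bytes
        return line % self.num_sets, line // self.num_sets  # (set index, tag)

    def fill(self, addr, way):
        index, tag = self._split(addr)
        self.tags[index][way] = tag

    def tag_check(self, addr):
        """Compare every tag in the set; return the hitting way or None."""
        index, tag = self._split(addr)
        self.tag_reads += self.num_ways
        for way, stored in enumerate(self.tags[index]):
            if stored == tag:
                return way
        return None

    def conventional_load(self, addr):
        """Tag check and all data arrays are read in parallel."""
        way = self.tag_check(addr)
        self.data_reads += self.num_ways
        return way is not None

    def decoupled_load(self, way):
        """Load that reuses the way found earlier by address generation."""
        if way is None:
            return False  # miss handling is outside this sketch
        self.data_reads += 1  # only the known way's data array is read
        return True


def compare(num_loads=100):
    """Return (conventional, decoupled) data array read counts."""
    conventional, decoupled = ToyL1DC(), ToyL1DC()
    addr = 0x40
    for cache in (conventional, decoupled):
        cache.fill(addr, way=2)
    for _ in range(num_loads):
        conventional.conventional_load(addr)
        way = decoupled.tag_check(addr)  # done with address generation
        decoupled.decoupled_load(way)
    return conventional.data_reads, decoupled.data_reads
```

In this model, both schemes perform the same number of tag reads; the difference is that the decoupled load touches one data array per hit, while the conventional load touches one per way.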
As CPU data requests to the level-one (L1) data cache (DC) can represent as much as 25% of an embedd...
Energy efficiency is one of the key metrics in the design of a wide range of processor types. For exa...
Energy efficiency is a first-order design goal for nearly all classes of processors, but it is parti...
Level-one data cache (L1 DC) and data translation lookaside buffer (DTLB) accesses impact energy usa...
Memory operations have a significant impact on both performance and energy usage even when an access...
The need for energy efficiency continues to grow for many classes of processors, including those for...
The number of battery-powered devices is growing significantly, and these devices require energy-effi...
Fast set-associative level-one data caches (L1 DCs) access all ways in parallel during load operatio...
For performance reasons, all ways in set-associative level-one (L1) data caches are accessed in p...
Address translation using the Translation Lookaside Buffer (TLB) consumes as much as 16 % of the chi...
As CPU data requests to the level-one (L1) data cache (DC) can represent as much as 25 % of an embed...