Loop fusion combines corresponding iterations of different loops. As shown in previous work, it can often decrease program run time by reducing the overhead of loop control and effective address calculations, and in important cases by dramatically increasing cache or register reuse. In this paper we consider corresponding changes in program energy. By merging program phases, fusion tends to increase the uniformity, or balance, of demand for system resources. On a conventional superscalar processor, increased balance tends to increase IPC, and thus dynamic power, so that fusion-induced improvements in program energy are slightly smaller than improvements in program run time. If IPC is held constant, however, by reducing frequency ...
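As a minimal illustration of what "combining corresponding iterations" means, the sketch below shows two loops over the same index range before and after fusion. The function names `unfused` and `fused` and the arrays `a`, `b`, `c`, `d` are hypothetical and do not come from the paper. Python is used only for readability; the run-time and energy effects described above apply to compiled loops on real hardware and are not measured here.

```python
def unfused(a, b):
    """Two separate passes over the same index range (hypothetical example)."""
    n = len(a)
    c = [0.0] * n
    d = [0.0] * n
    for i in range(n):        # loop 1: reads a[i] and b[i]
        c[i] = a[i] + b[i]
    for i in range(n):        # loop 2: reads a[i] again, plus c[i] from loop 1
        d[i] = a[i] * c[i]
    return c, d


def fused(a, b):
    """One pass: iteration i of both loops is executed together."""
    n = len(a)
    c = [0.0] * n
    d = [0.0] * n
    for i in range(n):
        ci = a[i] + b[i]      # body of loop 1
        c[i] = ci
        d[i] = a[i] * ci      # body of loop 2, reusing a[i] and ci directly
    return c, d
```

Fusion is legal in this sketch because iteration i of the second loop depends only on the value produced by iteration i of the first. The fused version has one loop header instead of two and can reuse `a[i]` and `c[i]` while they are still close at hand, which is the kind of loop-control and reuse saving the abstract refers to.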
Modern processors use a memory hierarchy of several levels. Achieving high performance mandates the ef...
Abstract. Loop fusion is a program transformation that merges multiple loops into one. It is effectiv...
As energy consumption plays a more and more critical role in high-performance computing installation...
Loop fusion improves data locality and reduces synchronization in data-parallel applications. Howeve...
The rapidly increasing number of architectural changes in embedded processors puts compiler technolo...
For multimedia applications, loop buffering is an efficient mechanism to reduce the power in the ins...
The memory bandwidth largely determines the performance of embedded systems. However, very often com...
A computer consists of multiple components such as functional units, cache and main memory. At each ...
Abstract: Loop fusion is recognized as an effective transformation for improving memory hierarchy pe...
Embedded systems require maximum performance from a processor within significant constraints in powe...
Loops are the main time-consuming part of programs based on floating point computations. The perform...
Energy efficiency in modern microprocessor design is a first-order concern. Every facet of the micr...
Superscalar processors contain large, complex structures to hold data and instructions as they wait ...
Abstract: Energy efficiency is becoming increasingly important for computation, especially in the ...
Embedded processors have limited on-chip memory. Fusing loops that use the same data can reduce the ...