Abstract—Memory profiling is the process of collecting memory address traces during the execution of a program and then analyzing and characterizing the memory behavior of the program offline. As more and more cores are integrated on a processor chip, the “Memory Wall” problem will become more serious in chip multiprocessor (CMP) systems. Accurate and effective memory profiling is therefore becoming one of the keys to identifying the source of memory system bottlenecks. A large body of work has been devoted to memory profiling; however, most of it relies on instrumentation or simulators, which suffer heavy overhead, or on hardware performance counters, which lack detailed trace information. Furthermore, correlating th...
We introduce a novel memory architecture that can count the occurrences of patterns on a system’s bu...
The performance and energy efficiency of modern architectures depend on memory locality, which can b...
Profile-based optimizations can be used for instruction scheduling, loop scheduling, data preloading...
Modern memory systems play a critical role in the performance of applications, but a detailed unders...
Abstract—Memory trace analysis is an important technology for architecture research, system software...
Operating systems have historically had to manage only a single type of memory device. The imminent ...
Abstract—Multi-core prototyping presents a good opportunity for establishing low overhead and detai...
Abstract—Optimizing memory access is critical for performance and power efficiency. CPU manufacture...
To reduce latency and increase bandwidth to memory, modern microprocessors are often designed with d...
Application performance on modern microprocessors depends heavily on performance related characteris...
In this paper, we present the current state of our work on profiling the memor...
Abstract—The availability of commercial hardware transactional memory (TM) systems has not yet been ...
We introduce object ownership profiling, a technique for finding and fixing memory leaks in object-o...
There is an ever widening performance gap between processors and main memory, a gap bridged by small...