We introduce a novel memory architecture that can count the occurrences of patterns on a system’s bus, a task known as profiling. Such profiling can serve a variety of purposes, like detecting a microprocessor’s software hot spots or frequently used data values, which can be used to optimize various aspects of the system. The memory, which we call ProMem, is based on a pipelined binary search tree structure, yielding several beneficial features, including non-intrusiveness, accurate counts, excellent size and power efficiency, very fast access times, and the use of standard memories with only simple additional logic. The main limitation is that the set of potential patterns must be preloaded into the memory. We describe the ProMem architect...
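The counting behaviour described above can be sketched as a small software model. This is an illustrative assumption, not the ProMem hardware design: the class name `PatternCounter` and its methods are hypothetical. The candidate patterns are preloaded and sorted, and each observed bus value is located by binary search, so one loop iteration stands in for one tree level. The pipelining of those levels is not modelled.

```python
class PatternCounter:
    """Hypothetical software model: counts occurrences of a preloaded pattern set."""

    def __init__(self, patterns):
        # The pattern set must be known up front (the limitation noted in the abstract).
        self.keys = sorted(set(patterns))
        self.counts = [0] * len(self.keys)

    def observe(self, value):
        """Binary-search for value; each iteration corresponds to one tree level."""
        lo, hi = 0, len(self.keys) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            if self.keys[mid] == value:
                self.counts[mid] += 1
                return True
            if self.keys[mid] < value:
                lo = mid + 1
            else:
                hi = mid - 1
        return False  # value was not preloaded, so it is not counted

    def profile(self):
        """Return a mapping from each preloaded pattern to its exact count."""
        return dict(zip(self.keys, self.counts))


counter = PatternCounter([0x1000, 0x2000, 0x3000])
for addr in [0x1000, 0x3000, 0x1000, 0x4000]:
    counter.observe(addr)
print(counter.profile())
```

In this model an unknown value such as `0x4000` is simply ignored, which mirrors the stated requirement that potential patterns be loaded in advance.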
Profile-based optimizations can be used for instruction scheduling, loop scheduling, data p...
Presented at HiPEAC Conference 2020, Bologna (Italy). Time series analysis is an important research to...
A major challenge to the creation of chip multiprocessors is designing the on-chip memory ...
For aggressive path-based program optimizations to be profitable in cost-sensitive environments, acc...
The authors describe a VLSI processor for pattern recognition based on content addressable memory (C...
Many modern workloads, such as neural networks, databases, and graph processing, are fundamentally m...
Profile-based optimizations can be used for instruction scheduling, loop scheduling, data preloading...
Memory profiling is the process of collecting memory address traces during the execution of...
For aggressive path-based optimizations to be profitable in cost-sensitive environments, accurate pat...
Operating systems have historically had to manage only a single type of memory device. The imminent ...
Modern memory systems play a critical role in the performance of applications, but a detailed unders...
To reduce latency and increase bandwidth to memory, modern microprocessors are often designed with d...
The growing demand for processing power is being satisfied mainly by an increase in the number of hom...