Recent research advocates memory streaming techniques to alleviate the performance bottleneck caused by the high latencies of off-chip memory accesses. Temporal memory streaming replays previously observed miss sequences to eliminate long chains of dependent misses. Spatial memory streaming predicts repetitive data layout patterns within fixed-size memory regions. Because each technique targets a different subset of misses, their effectiveness varies across workloads and each leaves a significant fraction of misses unpredicted. In this paper, we propose Spatio-Temporal Memory Streaming (STeMS) to exploit the synergy between spatial and temporal streaming. We observe that the order of spatial accesses repeats both within and across regions. ...
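As a rough illustration of the two baseline ideas the abstract contrasts, the following minimal sketch models spatial streaming as a learned set of block offsets within a fixed-size region and temporal streaming as replay of a recorded miss sequence. It is not the STeMS design. The region size, the function names, and the `depth` parameter are assumptions made purely for illustration.

```python
from collections import defaultdict

REGION_BLOCKS = 8  # hypothetical region size, in cache blocks


def learn_spatial_pattern(block_addrs):
    """Record, per region, which block offsets inside that region were touched."""
    patterns = defaultdict(set)
    for addr in block_addrs:
        patterns[addr // REGION_BLOCKS].add(addr % REGION_BLOCKS)
    return dict(patterns)


def predict_spatial(pattern, trigger_addr):
    """Apply a learned offset pattern to the region that contains trigger_addr."""
    base = (trigger_addr // REGION_BLOCKS) * REGION_BLOCKS
    return sorted(base + off for off in pattern)


def predict_temporal(miss_history, miss_addr, depth=3):
    """Replay up to `depth` misses that followed the most recent
    earlier occurrence of miss_addr in the recorded history."""
    for i in range(len(miss_history) - 1, -1, -1):
        if miss_history[i] == miss_addr:
            return miss_history[i + 1 : i + 1 + depth]
    return []  # address never missed before: nothing to replay


# Two regions (0 and 2) show the same layout: offsets 0, 2 and 5.
patterns = learn_spatial_pattern([0, 2, 5, 16, 18, 21])
spatial_guess = predict_spatial(patterns[0], 41)

# A repeated miss to address 7 replays the addresses that followed it last time.
temporal_guess = predict_temporal([100, 7, 63, 12, 900], 7, depth=2)
```

In this toy model the spatial predictor can cover a region it has never seen, but only in address order within the region, while the temporal predictor preserves miss order but only for addresses that have missed before. That complementarity is the gap the abstract says each technique leaves open.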
Autonomous streaming anomaly detection can have a significant impact in any domain where continuous,...
A spatio-temporal data stream is a sequence of time-stamped geo-referenced data elements which arriv...
Hardware prefetchers are commonly used to hide and tolerate off-chip memory latency. Prefetching te...
Coherent read misses in shared-memory multiprocessors account for a substantial fraction of executio...
Coherent read misses in shared-memory multiprocessors account for a substantial fraction of executio...
Coherence misses in shared-memory multiprocessors account for a substantial fraction of execution ti...
Modern prefetchers can generally be divided into two categories, spatial and temporal, based on the ...
The memory system remains a bottleneck in modern computer systems. Traditionally, designers have use...
The speed gap between processors and memory system is becoming the performance bottle...
The growing processor/memory performance gap causes the performance of many codes to be limited by m...
Prior research demonstrates that temporal memory streaming and related address-correlating prefetche...
The visual analysis of large multidimensional spatiotemporal datasets poses c...
In this paper, we define the problem of spatial mapping. We present reasons why performing spatial m...
Coherence misses in shared-memory multiprocessors account for a substantial fraction of execution ti...
Recent research suggests that there are large variations in a cache's spatial usage, both within and...