Functional programming languages contain a number of runtime and language features, such as garbage collection, indirect memory accesses, linked data structures and immutability, that interact with a processor’s memory system. Together these features cause a variety of unintuitive memory performance effects. For example, traversing linked lists or arrays of data that have been sorted is slower than traversing the same data in the order it was allocated. We seek to understand these issues and mitigate them in a manner consistent with functional languages, taking advantage of the features themselves where possible. For example, immutability and garbage collection force linked lists to be allocated roughly sequentially in memo...
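The sorted-versus-allocation-order effect described in this abstract can be illustrated with a small model. The sketch below is not the authors' experiment: it assumes a hypothetical bump allocator that places the i-th allocated cell at address i, and it assumes that sorting changes only the order in which cells are visited, not where they live. The function names `mean_stride` and `simulate` are made up for this illustration. It compares the average distance between consecutively visited addresses, a rough proxy for spatial locality.

```python
import random


def mean_stride(addresses):
    """Average absolute distance between consecutively visited addresses."""
    gaps = [abs(b - a) for a, b in zip(addresses, addresses[1:])]
    return sum(gaps) / len(gaps)


def simulate(n=10_000, seed=0):
    rng = random.Random(seed)
    # Assumption: a bump allocator puts the i-th allocated cell at address i.
    cells = [(addr, rng.random()) for addr in range(n)]  # (address, key)

    # Visiting cells in the order they were allocated walks memory sequentially.
    allocation_order = [addr for addr, _ in cells]

    # Assumption: sorting by key changes the visiting order but leaves each
    # cell at its original address, so the traversal jumps around memory.
    sorted_order = [addr for addr, _ in sorted(cells, key=lambda c: c[1])]

    return mean_stride(allocation_order), mean_stride(sorted_order)


alloc_stride, sorted_stride = simulate()
print(f"allocation order: {alloc_stride:.1f} cells per step")
print(f"sorted order:     {sorted_stride:.1f} cells per step")
```

In this model the allocation-order walk always moves to the adjacent cell, while the sorted walk jumps by a large fraction of the heap on average. Real caches, hardware prefetchers and copying collectors complicate the picture, so the numbers indicate only the direction of the effect.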
Due to garbage collection and language features that preclude stack-based allocation, functional pro...
Ever increasing memory latencies and deeper pipelines push memory farther from the processor. Prefet...
Software prefetching and locality optimizations are two techniques for overcoming the speed gap bet...
Many modern data processing and HPC workloads are heavily memory-latency bound. A tempting propositi...
Software prefetching and locality optimizations are two techniques for overcoming the speed gap betw...
Software prefetching and locality optimizations are techniques for overcoming the speed gap between ...
Software prefetching and locality optimizations are techniques for overcoming the gap between proces...
This work was also published as a Rice University thesis/dissertation: http://hdl.handle.net/1911/19...
Modern computer systems spend a substantial fraction of their running time waiting for data from...
Memory latency is becoming an increasingly important performance bottleneck as the gap between processor ...
A major performance limiter in modern processors is the long latencies caused by data cache misses. ...
Software prefetching and locality optimizations are techniques for overcoming the gap between proc...
Modern processors and compilers hide long memory latencies through non-blocking loads or explicit so...
Current microprocessors aggressively exploit instruction-level parallelism (ILP) through techniques ...