As per-core CPU performance plateaus and data-bound applications like graph analytics and key-value stores become more prevalent, understanding memory performance is more important than ever. Many existing techniques to predict and measure cache performance on a given workload involve either static analysis or tracing, but programs like key-value stores can easily have billions of memory accesses in a trace and have access patterns driven by non-statically observable phenomena such as user behavior. Past analytical solutions focus on modeling cache hits, but the rise of non-volatile memory (NVM) like Intel\u27s Optane with asymmetric read/write latencies, bandwidths, and power consumption means that writes and writebacks are now critical pe...