The consistent growth of DRAM bandwidth and capacity has enabled the computation of increasingly large workloads in high-performance computing. However, memory latency has improved only marginally over time, which severely bottlenecks the performance of modern systems. Modern computers exploit data locality through large cache hierarchies to keep throughput high. Additionally, the breakdown of Dennard scaling is becoming more pronounced in processor technologies, prompting architects to employ multicore and multithreaded architectures that boost throughput by extracting parallelism. However, irregular applications do not enjoy the same performance gains from these techniques. The poor data locality in these applicat...
Irregular workloads are programs organized around pointer-based data structures such as graphs. The...
Accelerator-based systems are rapidly becoming platforms of choice for both high e...
Hardware accelerators are known to be performance and power efficient. This article focuses on accel...
Specialized accelerators are increasingly attractive solutions to continue expected generational per...
Parallel computing hardware is ubiquitous, ranging from cell phones with multiple cores to super-com...
Irregular applications have frequent data-dependent memory accesses and control flow. They arise in ...
This paper reviews the massively micro-parallel compute system POETS (Partially Ordered Event Trigge...
Graph analytics are an emerging class of irregular applications. Operating on very large datasets, t...
Modern parallel programming models perform their best under the particular patterns they are tuned t...
In this article, we present experiences implementing a general Parallel Discrete Event Simulation (P...
This work explores the acceleration of graph processing on a heterogeneous platform that tightly int...
The last two decades have witnessed two opposing hardware trends where the DRAM capacity and the acces...
As we witness the breakdown of Dennard scaling, we can no longer get faster computers by shrinking t...
In modern data centers, massive concurrent graph processing jobs are being processed on large graphs...