© 2015 IEEE. Graph processing is an increasingly important application domain and is typically communication-bound. In this work, we analyze the performance characteristics of three high-performance graph algorithm codebases using hardware performance counters on a conventional dual-socket server. Unlike many other communication-bound workloads, graph algorithms struggle to fully utilize the platform's memory bandwidth, so increasing memory bandwidth utilization could be just as effective as decreasing communication. Based on our observations of simultaneously low compute and bandwidth utilization, we find there is substantial room for a different processor architecture to improve performance without requiring a new memory system.
In modern data centers, massive concurrent graph processing jobs are being processed on large graphs...
Algorithms operating on a graph setting are known to be highly irregular and unstructured. This le...
Processors have evolved to the now de-facto standard multicore architecture. The continuous advances...
Graph processing is experiencing a surge of renewed interest as applications in social networks and ...
Both static and streaming graph processing are central in data analytics scenarios such as recommend...
Large-scale graph problems are becoming increasingly important in science and engineering. The irreg...
The size of graphs has dramatically increased. Graph engines for a single machine have emerged ...
Mechanisms for improving the execution efficiency of graph algorithms on Data-Parallel Architectures...
Intel Xeon Phi many-integrated-core (MIC) architectures usher in a new era of terascale integration....
Abstract. Locality behavior study is crucial for achieving good performance for irregular problems. ...
Graph processing is increasingly bottlenecked by main memory accesses. On-chip caches are of little ...
Graph processing workloads are being widely used in many domains such as computational biology, soci...
Graph processing is one of the most important and ubiquitous classes of analytical workloads. To pro...