In the last decade, data analytics has rapidly progressed from traditional disk-based processing to modern in-memory processing. However, little effort has been devoted to enhancing performance at the micro-architecture level. This paper characterizes the performance of in-memory data analytics using the Apache Spark framework. We use a single-node NUMA machine and identify the bottlenecks hampering the scalability of workloads. We also quantify the inefficiencies at the micro-architecture level for various data analysis workloads. Through empirical evaluation, we show that Spark workloads do not scale linearly beyond twelve threads, due to work time inflation and thread-level load imbalance. Further, at the micro-architecture level, we observe memory bou...
As dataset sizes increase, data analysis tasks in high performance computing (HPC) are increasingly ...
As the adoption of Big Data technologies becomes the norm in an increasing number of scenarios, ther...
There has been much research devoted to improving the performance of data analytics frameworks, but ...
The sheer increase in the volume of data over the last decade has triggered research in cluster comp...
While cluster computing frameworks are continuously evolving to provide real-time data analysis cap...
Sheer increase in volume of data over the last decade has triggered research in cluster computing fr...
While cluster computing frameworks are continuously evolving to provide real-time data analysis capa...
© 2014 IEEE. The increasing demands of big data applications have led researchers and practitione...