Big data frameworks, such as Spark and Giraph, suffer from high memory pressure because they allocate massive volumes of long-lived objects on the managed heap. To relieve this pressure, frameworks temporarily move long-lived objects outside the managed heap (off-heap) to a fast storage device. Unfortunately, this practice incurs (1) high serialization/deserialization (S/D) cost and (2) high garbage collection (GC) cost when many off-heap objects are moved back to the managed heap for processing. In this paper, we propose HugeHeap, which extends the managed runtime (JVM) with a second, high-capacity heap that resides on a fast storage device and coexists with the regular heap. HugeHeap provides direct access to objects on the second heap (no S/D). It also reduc...
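The S/D round trip described in the abstract can be illustrated with a minimal sketch. This is not HugeHeap or any framework's actual code path: it is written in Python rather than on the JVM, `pickle` stands in for the framework's serializer, a `bytes` buffer stands in for off-heap storage, and the names `move_off_heap`, `move_back_on_heap`, and `partition` are hypothetical. It only shows that moving objects back requires rebuilding every object as a fresh allocation on the managed heap.

```python
import pickle
import time


def move_off_heap(objects):
    """Serialize objects into a flat byte buffer (stand-in for off-heap storage)."""
    return pickle.dumps(objects, protocol=pickle.HIGHEST_PROTOCOL)


def move_back_on_heap(buffer):
    """Deserialize the buffer, allocating fresh objects on the managed heap."""
    return pickle.loads(buffer)


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Hypothetical cached partition: many small, long-lived records.
partition = [{"id": i, "score": i * 0.5} for i in range(100_000)]

buffer, serialize_s = timed(move_off_heap, partition)
restored, deserialize_s = timed(move_back_on_heap, buffer)

print(f"serialize:   {serialize_s:.4f} s")
print(f"deserialize: {deserialize_s:.4f} s")
print(f"buffer size: {len(buffer)} bytes")
```

The restored records are equal to the originals but are distinct objects, so each round trip pays for both the byte-level conversion and a new wave of allocations that the garbage collector must later trace. That is the cost a directly accessible second heap is meant to avoid.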
Planning optimized memory management is critical for Big Data analysis tools to perform faster runti...
Many popular systems for processing “big data” are implemented in high-level programming languages...
Big data analytics frameworks, such as Spark and Giraph, need to process and cache massive amounts o...
Many Big Data analytics and IoT scenarios rely on fast and non-relational storage (NoSQL) to help pr...
The past decade has witnessed the increasing demands on data-driven business intelligence that led t...
GCspy is an architectural framework for the collection, transmission, storage and replay of memory m...
This is a post-peer-review, pre-copyedit version of an article published in Journal of Parallel and ...
The memory system has been evolving at a fast pace recently, driven by the emergence of large-scale ...
While a conventional program uses exactly as much memory as it needs, the memory use of a garbage-co...
Existing virtual memory systems usually work well with applications written in C and C++, but they d...
Big Data systems have been used for multiple years to solve problems that require scale. A framework...
On contemporary cache-coherent Non-Uniform Memory Access (ccNUMA) architectures, applications with a...
Large-scale data analytical applications such as social network analysis and web analysis have revol...