The novel ScaleMP vSMP architecture employs commodity x86-based servers with an InfiniBand network to assemble a large shared memory system at an attractive price point. We examine this combined hardware and software approach to a DSM system using both system-level kernel benchmarks and real-world application codes. We compare this architecture with traditional shared memory machines and elaborate on strategies to tune application codes parallelized with OpenMP on multiple levels. Finally, we summarize the conditions that a scalable application must fulfill in order to profit from the full potential of the ScaleMP approach.
OpenMP implementations must exploit current and upcoming hardware for performance. Overhead must be ...
This work was also published as a Rice University thesis/dissertation: http://hdl.handle.net/1911/18...
The concept of a shared address space simplifies the parallelization of programs by using shared dat...
The most widely used node type in high-performance computing nowadays is a 2-socket server node. The...
Nowadays clusters are one of the most used platforms in High Performance Computing and most programm...
In this paper, we present the first system that implements OpenMP on a network of shared-memory mult...
This paper describes an OpenMP ready distributed shared memory system called FDSM. FDSM analyzes the...
OpenMP has established itself as the de facto standard for parallel programming on shared-memory pla...
Exascale systems will exhibit much higher degrees of parallelism both in terms of the number of node...
Abstract. This paper presents a source-to-source translation strategy from OpenMP to Global Arrays i...
The fast emergence of OpenMP as the preferable parallel programming paradigm for small-to-medium sca...
Summary form only given. Traditional software distributed shared memory (SDSM) systems modify the se...
The OpenMP Application Programming Interface (API) is an emerging standard for parallel programming ...
OpenMP has emerged as the de facto standard for writing parallel programs on shared address space pl...
In this work we report on our experiences running OpenMP (message passing) programs on a commodity c...