OpenMP provides a portable programming interface for shared memory parallel computers (SMPs). Although this interface has proven successful for small SMPs, it requires greater flexibility in light of the steadily growing size of individual SMPs and the recent advent of multithreaded chips. In this paper, we describe two application development experiences that exposed these expressivity problems in the current OpenMP specification. We then propose mechanisms to overcome these limitations, including thread subteams and thread topologies. Thus, we identify language features that improve OpenMP application performance on emerging and large-scale platforms while preserving ease of programming.
Summary form only given. Traditional software distributed shared memory (SDSM) systems modify the se...
With the increasing prevalence of multicore processors, shared-memory programming models are essenti...
This paper advances the state-of-the-art in programming models for exploiting task-level parallelism...
The most widely used node type in high-performance computing nowadays is a 2-socket server node. The...
In this paper, we present the first system that implements OpenMP on a network of shared-memory mult...
OpenMP has established itself as the de facto standard for parallel programming on shared-memory pla...
This work was also published as a Rice University thesis/dissertation: http://hdl.handle.net/1911/18...
The OpenMP Application Programming Interface (API) is an emerging standard for parallel programming ...
This paper presents a set of proposals for the OpenMP shared-memory programming model oriented tow...
OpenMP is attracting wide-spread interest because of its easy-to-use parallel programming model for ...
This paper presents a new parallel programming environment called ParADE to enable easy, portable, a...
Traditional software distributed shared memory (SDSM) systems modify the semantics of a real hardwar...
Transactional Memory (TM) is a key future technology for emerging many-cores. On the other hand, Ope...
F. Wolf, B. Mohr, and D. an Mey (Eds.), pp. 53-64. Thread affinity has...