drodenas, xavim, eduard, jesus
In this paper, we present two approaches to improve the execution of OpenMP applications on the IBM Cyclops multithreaded architecture. The two solutions are independent, and both aim at better performance through better management of cache locality. The first is a software modification to the OpenMP runtime library that balances stack accesses across all data caches. The second is a small hardware modification that changes the data cache mapping behavior, with the same goal. Both solutions help parallel applications scale better and perform better on this kind of architecture, and they could also be applied to future multi-core processors. We have ...
Cyclops is a new architecture for high performance parallel computers being developed at the IBM T. ...
In this paper, we present the first system that implements OpenMP on a network of shared-memory mult...
Nowadays clusters are one of the most used platforms in High Performance Computing and most programm...
Multithreaded architectures have the potential of tolerating large memory and functional unit latenc...
This paper is motivated by the desire to provide an efficient and scalable software cache...
The most widely used node type in high-performance computing nowadays is a 2-socket server node. The...
In [8], we demonstrated that contrary to sequential applications, parallel Ope...
OpenMP is a de facto standard interface of the shared address space parallel programmi...
The fast emergence of OpenMP as the preferable parallel programming paradigm for small-to-medium sca...
In this work, we present an OpenMP implementation suitable for multiprogrammed environments on Intel...
This paper presents COBRA (Continuous Binary Re-Adaptation), a runtime binary optimization framework...
This archive contains the benchmarks used in the conference paper "Multipurpose Cacheing to accelera...
In this paper, we present an alternative implementation of the NANOS OpenMP runtime library (NthLib)...
The emergence of System-on-Chip (SOC) design shows the growing popularity of the integration of mult...