The design and evaluation of high-performance computers has concentrated on increasing computational speed for applications. This performance is often measured on a well-configured, dedicated system to show the best case. In real environments, resources are not always dedicated to a single task, and systems run tasks that may influence one another, so run times vary, sometimes to an unreasonably large extent. This paper systematically explores the amount of variation seen across four large distributed-memory systems. It then analyzes the causes of the observed variation and discusses what can be done to decrease it without impacting performance.
Variation in performance and power across manufactured parts and their operating conditions is an ac...
This paper presents a framework for characterizing the distribution of fine-grained parallelism, dat...
Power consumption and process variability are two important, interconnected, challenges of future ge...
Increasingly complex consumer electronics applications call for embedded proce...
In [8], we demonstrated that contrary to sequential applications, parallel Ope...
Systems for high performance computing are getting increasingly complex. On the one hand, the number...
Highly variable parallel application execution time is a persistent issue in cluster computing envir...
Today’s challenging problems in science and industry are solved by complex data-driven al...
The recent growth in the number of processing units in today's multicore processor architectures ena...
The multicore era has initiated a move to ubiquitous parallelization of software. In the process, co...
Performance and scalability of high performance scientific applications on large scale parallel mach...
Supercomputers are used to solve some of the world’s most computationally demanding problems. Exasc...
Simulation remains an important component in the design of multicore processor architectures, just a...
The CPUs, memory, interconnection network, operating system, runtime system, I/O subsystem, and appl...
In this paper, we introduce an analytical technique based on queueing networks and Petri nets for mak...