The size and complexity of high-performance computing applications present a serious challenge to manual reasoning about program behavior. The vastness and diversity of their code bases often break the automatic analysis tools that could otherwise be used. As a consequence, developers resort to mini-apps, i.e., trimmed-down proxies of the original programs that retain key performance characteristics. Unfortunately, constructing mini-apps is difficult and time-consuming, which prevents their mass production. In this paper, we propose a systematic and tool-supported approach to extracting mini-apps from large-scale applications that reduces the manual effort needed to create them. Our approach covers the stages of kernel identification, data capture, code extract...
Achieving high-performance of large scientific codes is a difficult task. This has led to the develo...
Parallelization is a technique that boosts the performance of a program beyond optimizations of the ...
The parallelization of real-world compute intensive Fortran application codes is generally not a tri...
In high-performance computing, performance analysis, tuning, and exploration are relevant throughout...
Computational science and engineering application programs are typically large, complex, and dynamic...
Nowadays, a challenge faced by many developers is the profiling of parallel applications so...
Application performance is determined by a combination of many choices: hardware platform, runtime e...
As access to supercomputing resources is becoming more and more commonplace, performance analysis to...
The design of high-performance computing architectures requires performance analysis of large-scale ...
When computer architects re-invented parallelism through multi-core processors, applicatio...
In this work, several mini-apps have been created to enhance a real-world application performance, n...
While the chip multiprocessor (CMP) has quickly become the predominant processor architecture, its c...
With the age of Exascale computing causing a diversification away from traditional CPU-based homogen...