Thesis: To understand and optimize the performance of complex applications and to achieve sustained application performance across different architectures, we need performance models and tools that can quantify theoretical performance and the gap between theoretical and observed performance. This thesis proposes a benchmark-driven Roofline Model Toolkit that provides theoretical and achievable performance, and the gap between them, for multicore, manycore, and accelerated architectures. Roofline microbenchmarks are specialized to quantify the behavior of different architectural features. Compared to previous work on performance characterization, these microbenchmarks focus on capturing the performance of ...
Modern supercomputers have complex features: many hardware threads, deep memory hierarchies, and man...
This article consists of a collection of slides from the authors' conference presentation. The Roofl...
The software performance optimization process is one of the most challenging aspects of developing ...
We present preliminary results of the Roofline Toolkit for multicore, manycore, and accelerated archite...
Understanding the performance of applications on modern multi- and manycore platforms is a difficult...
Manufacturers will likely offer multiple products with differing numbers of cores to cover multiple ...
With energy-efficient architectures, including accelerators and many-core processors, gaining tracti...
The Roofline model offers insight into how to improve the performance of software and hardware.
This dissertation maps various kernels and applications to a spectrum of programming models and arch...
During the past decades, High-Performance Computing (HPC) has been widely used in various industries...
Recent trends in computing architecture development have focused on exploiting task- and data-level ...
The objective of the proposed research is to develop an analytical model that predicts performance a...
Dissertation: With the explosion of chip transistor counts, the semiconductor industry has struggled w...
Model-based performance prediction is a well-known concept to ensure the quality of software. Current...
Accelerators, such as GPUs and Intel Xeon Phis, have become the workhorses of high-performance compu...