Multi-core and many-core architectures have been major trends for the past six years and are expected to remain so for the coming decades. As parallel computing spreads, it becomes increasingly difficult to decide which architecture a given application should run on. In this work, we use an algorithm classification to predict performance prior to algorithm implementation. For this purpose, we modify the roofline model to include class information. In this way, we enable architectural choice through performance prediction before any architecture-specific code is developed. The new model, the boat hull model, is demonstrated using a GPU as a target architecture. We show for six example algorithms that performance is predicted accurately without re...
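For orientation, the classic roofline model that the abstract builds on bounds attainable performance by the smaller of peak compute throughput and memory bandwidth times operational intensity. The sketch below shows only that standard bound, not the boat hull model itself, whose class-specific modifications are not described in this excerpt. The function names and the hardware figures (1000 GFLOP/s peak, 100 GB/s bandwidth) are hypothetical values chosen for illustration.

```python
def roofline_gflops(peak_gflops: float, bandwidth_gbs: float,
                    intensity_flops_per_byte: float) -> float:
    """Classic roofline bound on attainable performance in GFLOP/s.

    Attainable = min(peak compute, memory bandwidth * operational intensity).
    """
    if min(peak_gflops, bandwidth_gbs, intensity_flops_per_byte) < 0:
        raise ValueError("all inputs must be non-negative")
    return min(peak_gflops, bandwidth_gbs * intensity_flops_per_byte)


def ridge_point(peak_gflops: float, bandwidth_gbs: float) -> float:
    """Operational intensity (FLOP/byte) at which a kernel stops being
    memory-bound and becomes compute-bound."""
    return peak_gflops / bandwidth_gbs


# Hypothetical device: 1000 GFLOP/s peak, 100 GB/s memory bandwidth.
PEAK, BW = 1000.0, 100.0

memory_bound = roofline_gflops(PEAK, BW, 2.0)    # low intensity: limited by bandwidth
compute_bound = roofline_gflops(PEAK, BW, 20.0)  # high intensity: limited by peak compute
ridge = ridge_point(PEAK, BW)
```

A kernel whose operational intensity lies left of the ridge point is limited by memory bandwidth; one to the right is limited by peak compute. Any class-aware extension would presumably adjust these ceilings per algorithm class, but that is an assumption rather than something stated in the excerpt.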
Scientific applications often require massive amounts of compute time and power. With the constantly...
This paper introduces a predictive modeling framework for GPU performance. The key innovation underl...
We propose an easy-to-understand, visual performance model that offers insights to programmers and a...
Manufacturers will likely offer multiple products with differing numbers of cores to cover multiple ...
We present preliminary results of the Roofline Toolkit for multicore, manycore, and accelerated archite...
The significant growth in computational power of modern Graphics Processing Units (GPUs) coupled wit...
Using Graphics Processing Units (GPUs) to solve general-purpose problems has received sign...
Branch prediction is a common function in today's microprocessors. Branch predictor is d...
Understanding the performance of applications on modern multi- and manycore platforms is a difficult...
CPUs and dedicated accelerators (namely GPUs and FPGAs) continue to grow increasingly large and comp...