This technical memo presents a case study of performance prediction for the Hybrid Technology Multi-Threaded (HTMT) architecture. We use a dense matrix multiply to introduce an analytical methodology for predicting the performance of the percolation model on the HTMT architecture. HTMT introduces a percolation program and execution model that (1) is explicitly multi-threaded; (2) incorporates a global memory address space; and (3) explicitly exposes the HTMT memory hierarchy to the programmer. The percolation model extends dynamic prefetching to allow the management of contexts that include data, program instructions, and control states. An analytical study of our algorithm and the percolation process is used to determine the number of operations that a...
High performance computing (HPC) demands huge memory bandwidth and computing resources to achieve ma...
A method is presented for modeling application performance on parallel computers in terms of the per...
In modern clustering environments where the memory hierarchy has many layers (distributed memory, sh...
In this report we summarize findings from a study of the predicted performance of a suite of applica...
The Hybrid Technology Multi-Threaded (HTMT) Architecture has been proposed to meet the challenges of...
Percolation has recently been proposed as a key component of an advanced program execution model for...
Percolation has recently been proposed as a key component of an advanced program execution model fo...
The Hybrid Technology Multi-Threaded (HTMT) Architecture has been proposed to meet the challenges of...
Efficient data supply to the processor is one of the keys to achieving high performance. However, ...
Accurately modeling and predicting performance for large-scale applications becomes increasingly dif...
The increasing computation capability of servers comes with a dramatic increas...
Application performance often depends on achieved memory bandwidth. Achieved memory bandwidth varies...
Presented at HiPEAC Conference 2020, Bologna (Italy). Time series analysis is an important research to...
Performance modeling, the science of understanding and predicting application performance, is import...
Moore’s Law suggests that the number of processing cores on a single chip increases expone...