This work presents an end-to-end methodology for quantifying the performance and power benefits of simultaneous multithreading (SMT) for HPC centers and applies this methodology to a production system and workload. Ultimately, SMT’s system-wide value depends on whether users employ SMT effectively at the application level. However, predicting SMT’s benefit for HPC applications is challenging, because doubling the number of threads may change an application’s characteristics. This work proposes statistical modeling techniques to predict the speedup SMT confers on HPC applications. This approach, accurate to within 8%, uses only lightweight, transparent performance monitors collected during a single run of the application.
Many studies have shown that load imbalance causes significant performance degradation in High Per...
High-performance computing systems have become increasingly dynamic, complex, and unpredictable. To ...
Simultaneous Multithreading (SMT) has been proposed for improving processor throughput by overlappin...
In this whitepaper we describe the effort we have made to measure performance of applications and sy...
Power-aware computing is gaining increasing attention in both academic and industrial settings. T...
The trend toward higher energy demand in Information and Communication Technologies makes power consumption an impo...
This paper proposes a cycle accounting architecture for Simultaneous Multithreading (SMT) processors...
Simultaneous Multithreading, often abbreviated SMT, is a technique for improving the overall efficie...
Current operating systems (OS) perceive the different contexts of simultaneous multithreaded (SMT) p...
Simultaneous multithreading (SMT) allows multiple hardware threads to execute concurrently on a proc...
State-of-the-art high-performance processors like the IBM POWER5 and Intel i7 show a trend in indust...
Simultaneous multithreading (SMT) seeks to improve the computation throughput of a processor core by...
One of the key challenges for improving efficiency in warehouse scale computers (WSCs) is t...
Granularity control is an effective means for trading power consumption with performance on dense sh...
In this dissertation we present a methodology for predicting the best priority pair for a given co-s...