High-Performance Computing (HPC) systems need to be constantly monitored to ensure their stability. The monitoring systems collect a tremendous amount of data about different parameters or Key Performance Indicators (KPIs), such as resource usage, IO waiting time, etc. A proper analysis of this data, usually stored as time series, can provide insight into choosing the right management strategies and support the early detection of issues. In this paper, we introduce a methodology to cluster HPC jobs according to their KPIs. Our approach reduces the inherent high dimensionality of the collected data by applying two techniques to the time series: literature-based and variance-based feature extraction. We also define a procedure to visua...
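The pipeline outlined in the abstract above (summarise each KPI time series, reduce dimensionality by variance, cluster the jobs) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the summary statistics or the clustering algorithm, so the per-series mean and variance, the plain k-means loop, the function names, and the toy data are all assumptions made for illustration. The literature-based extraction step is not shown.

```python
import random
from statistics import mean, pvariance


def extract_features(job):
    """Summarise each KPI time series of one job by its mean and variance (assumed statistics)."""
    feats = []
    for series in job:  # one list of samples per KPI
        feats.extend([mean(series), pvariance(series)])
    return feats


def select_by_variance(rows, keep):
    """Keep the `keep` feature columns whose variance across jobs is largest."""
    n_cols = len(rows[0])
    col_var = [pvariance([r[c] for r in rows]) for c in range(n_cols)]
    best = sorted(range(n_cols), key=lambda c: col_var[c], reverse=True)[:keep]
    best.sort()  # preserve the original column order
    return [[r[c] for c in best] for r in rows], best


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means with squared Euclidean distance (assumed clustering method)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [
            min(range(k), key=lambda j: sum((p - q) ** 2 for p, q in zip(pt, centers[j])))
            for pt in points
        ]
        # Move each center to the mean of its members.
        for j in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [mean(col) for col in zip(*members)]
    return labels


# Hypothetical toy data: 4 jobs, each with two KPI series (CPU usage %, IO wait %).
jobs = [
    [[90, 92, 91], [1, 1, 2]],     # CPU-bound
    [[88, 90, 89], [2, 1, 1]],     # CPU-bound
    [[10, 12, 11], [40, 45, 42]],  # IO-bound
    [[12, 14, 13], [38, 44, 41]],  # IO-bound
]

features = [extract_features(job) for job in jobs]
reduced, kept = select_by_variance(features, keep=2)
labels = kmeans(reduced, k=2)
```

With this toy input the two retained columns are the per-job means of CPU usage and IO wait, and the CPU-bound and IO-bound jobs end up in separate clusters. Real monitoring data would normally need feature scaling before clustering, which the sketch omits.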
HPC-ODA is a collection of datasets acquired on production HPC systems, which are representative of ...
Given the complexity of modern HPC systems, achieving theoretical peak performance depends on a myri...
HPC systems and parallel applications are growing in complexity. Therefore the possibility of ...
Performance analysis is an essential task in high-performance computing (HPC) systems, and it is app...
Large high-performance computers (HPC) are expensive tools responsible for supporting thousands of s...
The monitoring and system analysis of high performance computing (HPC) clusters is of increasing imp...
Every day, supercomputers execute thousands of jobs with different characteristics. Data centers monitor...
As multicore architectures become mainstream, an in-depth understanding of how applications behave o...
Performance analysis tools allow application developers to identify and characterize the inefficienc...
Doctor of Philosophy. Department of Computer Science. Daniel A. Andresen. Overestimation of High Performan...
Abstract: High-performance computing (HPC) benchmarks are widely used to evaluate and rank system perf...
Hundreds of petabytes of experimental data in high energy and nuclear physics (HENP) have been colle...
Hundreds of petabytes of experimental data in high energy and nuclear physics (HENP) have already be...
Hundreds of petabytes of experimental data in high energy and nuclear physics (HENP) have already be...
Contemporary microprocessors provide a rich set of integrated performance counters that allow applic...