Performance analysis is an essential task in high-performance computing (HPC) systems, and it is applied for different purposes, such as anomaly detection, optimal resource allocation, and budget planning. HPC monitoring tasks generate a huge number of key performance indicators (KPIs) to supervise the status of the jobs running in these systems. KPIs provide data about CPU usage, memory usage, network (interface) traffic, and other sensors that monitor the hardware. By analyzing this data, it is possible to obtain insightful information about running jobs, such as their characteristics, performance, and failures. The main contribution of this paper is to identify which metrics (KPIs) are the most appropriate to identify/classify different ty...
When applying non-supervised clustering, the concepts discovered by the cluste...
The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-96983-1_10 Des...
In recent years big data has emerged as a universal term and its management has become a crucial res...
High-Performance Computing (HPC) systems need to be constantly monitored to ensure their stability. ...
Large high-performance computers (HPC) are expensive tools responsible for supporting thousands of s...
Every day, supercomputers execute thousands of jobs with different characteristics. Data centers monitor...
HPC systems and parallel applications are increasing in complexity. Therefore, the possibility of ...
In the HPC community, the System Utilization metric makes it possible to determine if the res...
Clusters have become the main platform as a parallel and distributed computing structure for high performance co...
The monitoring and system analysis of high performance computing (HPC) clusters is of increasing imp...
The amount of information that must be processed daily by computer systems has reached huge quantiti...
This doctoral thesis describes a novel way to select the best computer node out of a pool of availab...
Performance analysis tools allow application developers to identify and characterize the inefficienc...
Large-scale computer clusters have in recent years become dominant for performing computations in ...
High-performance computing (HPC) benchmarks are widely used to evaluate and rank system perf...