Every day, supercomputers execute thousands of jobs with different characteristics. Data centers monitor job behavior to support users and improve the infrastructure, for instance by optimizing jobs or by deriving guidelines for the next procurement. Classifying jobs into groups that exhibit similar run-time behavior aids this analysis, as it reduces the number of representative jobs that must be examined. This work uses machine learning techniques to cluster and classify parallel jobs based on the similarity of their temporal I/O behavior. Our contribution is the qualitative and quantitative evaluation of different I/O characterizations and similarity measurements and the development of a suitable clustering algorithm. In...
With the onset of ICT and big data capabilities, the physical asset and data computation is integrat...
Clustering is an attempt to form groups of similar objects, and it is a powerful tool for discoverin...
Improving reliability and performance is of utmost importance for any system. This thesis prese...
Large high-performance computers (HPC) are expensive tools responsible for supporting thousands of s...
I/O is one of the main performance bottlenecks for many data-intensive scientific applications. Accu...
The paper is devoted to machine learning methods and algorithms for the supercomputer jobs executio...
High-performance computing (HPC) systems consist of thousands of compute nodes, storage systems and ...
High-Performance Computing (HPC) systems need to be constantly monitored to ensure their stability. ...
Temporal Data Mining is a rapidly evolving and new area of research that is at the intersection of s...
HPC applications with suboptimal I/O behavior interfere with well-behaving applications and lead to...
Temporal Data Mining is a rapidly evolving area of research that is at the intersection of several d...
The complexity of resource usage and power consumption on cloud-based applications makes the underst...
Performance analysis is an essential task in high-performance computing (HPC) systems, and it is app...
Grantor: University of Toronto. Understanding the characteristics of parallel workloads aids...
This paper presents a comprehensive statistical analysis of a variety of workloads collected on prod...