This work examines the challenges and opportunities of Machine Learning (ML) for Monitoring and Operational Data Analytics (MODA) in the context of Quantitative Codesign of Supercomputers (QCS). MODA is employed to gain insights into the behavior of current High Performance Computing (HPC) systems to improve system efficiency, performance, and reliability (e.g. through optimizing cooling infrastructure, job scheduling, and application parameter tuning). In this work, we take the position that QCS in general, and MODA in particular, require close exchange with the ML community to realize the full potential of data-driven analysis for the benefit of existing and future HPC systems. This exchange will facilitate identifying the appropriate ML ...
The experiments running at the Large Hadron Collider (LHC) at CERN work in challenging conditions an...
Industry 4.0 is coming into the industry by storm. The leveraging of the data that is already presen...
In the production, the efficient employment of machines is realized as a source of industry competit...
In the 21st century, we no longer try to turn lead into gold, but data into money. Machine Learning ...
We observe a continuously increased use of Deep Learning (DL) as a specific type of Machine Learning...
High-performance Computing (HPC) systems play pivotal roles in societal and scientific advancements,...
In the near future, large scientific collaborations will face unprecedented computing challenges. Pr...
Machine learning has recently emerged as a powerful technique to increase operational efficiency or ...
Modern computer systems expose diverse configurable parameters whose complicated interactions have s...
In this paper, we explore how some of the principal ideas in machine learning can be applied to the ...
Improving the reliability and performance are of utmost importance for any system. This thesis prese...
abstract: The dawn of Internet of Things (IoT) has opened the opportunity for mainstream adoption of...
Machine learning has recently emerged as a powerful technique to increase operational efficiency or ...
In recent years, we have seen increased interest in applying machine learning to system problems. Fo...
Machine learning teaches computers to think in a similar way to how humans do. An ML models work by ...
The experiments running at the Large Hadron Collider (LHC) at CERN work in challenging conditions an...
Industry 4.0 is coming into the industry by storm. The leveraging of the data that is already presen...
In the production, the efficient employment of machines is realized as a source of industry competit...
In the 21st century, we no longer try to turn lead into gold, but data into money. Machine Learning ...
We observe a continuously increased use of Deep Learning (DL) as a specific type of Machine Learning...
High-performance Computing (HPC) systems play pivotal roles in societal and scientific advancements,...
In the near future, large scientific collaborations will face unprecedented computing challenges. Pr...
Machine learning has recently emerged as a powerful technique to increase operational efficiency or ...
Modern computer systems expose diverse configurable parameters whose complicated interactions have s...
In this paper, we explore how some of the principal ideas in machine learning can be applied to the ...
Improving the reliability and performance are of utmost importance for any system. This thesis prese...
abstract: The dawn of Internet of Things (IoT) has opened the opportunity for mainstream adoption of...
Machine learning has recently emerged as a powerful technique to increase operational efficiency or ...
In recent years, we have seen increased interest in applying machine learning to system problems. Fo...
Machine learning teaches computers to think in a similar way to how humans do. An ML models work by ...
The experiments running at the Large Hadron Collider (LHC) at CERN work in challenging conditions an...
Industry 4.0 is coming into the industry by storm. The leveraging of the data that is already presen...
In the production, the efficient employment of machines is realized as a source of industry competit...