The IHEP local cluster is a medium-sized HEP data center consisting of 20,000 CPU slots, hundreds of data servers, 20 PB of disk storage and 10 PB of tape storage. Once the JUNO and LHAASO experiments start taking data, the volume processed at this center will approach 10 PB per year. At this scale, anomaly detection is a non-trivial part of daily maintenance. Traditional methods, such as static thresholding of performance metrics and keyword searches in system logs, require expertise in specific software systems and are hard to port to other systems. Moreover, they cannot easily adapt to changes in workloads and hardware configurations. Anomalies are data points which are either different from the majority o...
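To illustrate why a static threshold adapts poorly to workload changes, the following minimal sketch contrasts a fixed limit with a simple rolling z-score baseline. The function names, the sample `load` series, the window size and the limits are all hypothetical choices made for illustration; the rolling z-score is a stand-in for an adaptive method and is not claimed to be the approach used at IHEP.

```python
from statistics import mean, stdev


def static_threshold(values, limit):
    """Return indices whose value exceeds a fixed limit."""
    return [i for i, v in enumerate(values) if v > limit]


def rolling_zscore(values, window=5, z_limit=3.0):
    """Return indices that deviate strongly from the preceding window.

    Each point is compared with the mean and sample standard deviation
    of the `window` points before it, so the baseline follows the data.
    """
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: any change at all counts as a deviation.
            if values[i] != mu:
                flagged.append(i)
            continue
        if abs(values[i] - mu) / sigma > z_limit:
            flagged.append(i)
    return flagged


# Hypothetical load metric: the workload shifts from about 10 to about 50
# at index 7, and a single spike to 95 occurs at index 14.
load = [10, 11, 9, 10, 12, 10, 11, 50, 51, 49, 50, 52, 50, 51, 95, 50, 49]

static_hits = static_threshold(load, limit=30)
rolling_hits = rolling_zscore(load, window=5, z_limit=3.0)

print("static threshold:", static_hits)
print("rolling z-score: ", rolling_hits)
```

With this sample series, the fixed limit flags every point after the workload shift, while the rolling baseline flags only the shift itself and the later spike. The rolling approach still needs a window and a limit to be tuned, so it reduces the dependence on hand-set thresholds without removing it.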
Abstract: High-performance computing clusters have become critical computing resources in many sens...
Monitoring the health of large data centers is a major concern with the ever-increasing demand of gr...
Anomaly detection methods are devoted to target detection schemes in which no a priori information ab...
This paper introduces a generic and scalable anomaly detection framework. Anomaly detection can impr...
A Grid computing site consists of various services including Grid middlewares, such as Computing Ele...
This thesis investigates the possibility of using anomaly detection on performance data of virtual s...
As the volume of data recorded from systems increases, there is a need to effectively analyse this d...
Anomaly detection, also called outlier detection, on the multivariate time-series data is applicable...
With the explosion of the number of distributed applications, a new dynamic server environment emerg...
Anomalies in data can be of great importance as they often indicate faulty behaviour. Locating these...
Modern scientific discoveries are driven by an unsatisfiable demand for computational resources. To ...
The occurrence of anomalies and unexpected, process-related faults is a major problem for manufactur...
Early detection of anomalies in data centers is important to reduce downtimes ...
The impact of an anomaly is domain-dependent. In a dataset of network activities, an anomaly can imp...
Annually, the Large Hadron Collider (LHC) demands a huge amount of computing resources to deal with ...