With the introduction of federated data access to the workflows of WLCG, it is becoming increasingly important for data centers to understand specific data flows regarding storage element accesses, firewall configurations, and the scheduling of batch jobs themselves. As existing batch system monitoring and related system monitoring tools do not support measurements at batch job level, a new tool has been developed and put into operation at the GridKa Tier 1 center for monitoring continuous data streams and characteristics of WLCG jobs and pilots. Long-term measurements and data collection are in progress. These measurements have already proven useful for analyzing misbehaviors and various issues. Therefore we aim for an automated,...
In the HPC community, the System Utilization metric makes it possible to determine if the res...
Processing of large data sets with high throughput is one of the major focuses of Grid computing toda...
ATLAS Distributed Computing (ADC) uses the pilot model to submit jobs to Grid computing resources. T...
Recent developments in high energy physics (HEP) including multi-core jobs and multi-core pilots req...
For data centres it is increasingly important to monitor the network usage, and learn from network u...
Monitoring of the large-scale data processing of the ATLAS experiment includes monitoring of product...
The Job Execution Monitor (JEM) is a job-centric grid job monitoring software developed at the Unive...
The rising number of executed programs (jobs) enabled by the growing amount of available resources fr...
The IHEP local cluster is a medium-sized HEP data center which consists of 20'000 CPU slots, hundred...
Monitoring the WLCG infrastructure requires the gathering and analysis of a high volume of heterogen...
This paper introduces a generic and scalable anomaly detection framework. Anomaly detection can impr...
The rising number of executed programs (jobs) enabled by the growing amount of available resources f...
ATLAS is the largest experiment at the LHC. It generates vast volumes of scientific data accompanied...
The ever increasing scale and complexity of large computational systems ask fo...
The 300,000 CPU-core HTCondor Batch farm at CERN provides the computing power for the initial proces...