The distributed monitoring infrastructure of the Compact Muon Solenoid (CMS) experiment at the European Organization for Nuclear Research (CERN) records a broad variety of computing and storage logs on a Hadoop infrastructure. These logs are a valuable source of information for system tuning and capacity planning. In this paper we analyze machine learning (ML) techniques applied to a large volume of traces to discover patterns and correlations useful for classifying the popularity of experiment-related datasets. We implement a scalable pipeline of Spark components that collect the dataset access logs from heterogeneous monitoring sources and group them into weekly snapshots organized by CMS site. Predictive models are trained on these snapshots and fo...
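The grouping step described in the abstract (access logs collected into weekly snapshots per site) can be illustrated with a minimal sketch. This is a plain-Python stand-in for a distributed group-by, not the authors' Spark pipeline; the record fields `timestamp`, `site`, and `dataset`, and the function name `weekly_snapshots`, are assumptions made for illustration only.

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone


def weekly_snapshots(records):
    """Group access records by (ISO year, ISO week, site).

    Each record is a dict with hypothetical keys:
      "timestamp": Unix time of the access (seconds, UTC)
      "site":      name of the site serving the access
      "dataset":   name of the dataset that was read
    Returns {(iso_year, iso_week, site): Counter({dataset: access_count})}.
    """
    snapshots = defaultdict(Counter)
    for rec in records:
        ts = datetime.fromtimestamp(rec["timestamp"], tz=timezone.utc)
        iso_year, iso_week, _weekday = ts.isocalendar()
        # One counter per weekly snapshot and site; count accesses per dataset.
        snapshots[(iso_year, iso_week, rec["site"])][rec["dataset"]] += 1
    return dict(snapshots)


def unix(year, month, day):
    """Helper: midnight UTC of the given date as a Unix timestamp."""
    return datetime(year, month, day, tzinfo=timezone.utc).timestamp()
```

Each resulting counter gives per-dataset access counts for one site and week, which is the kind of table a popularity classifier could be trained on. In a Spark setting the same idea would typically be expressed as a group-by over week and site columns, but the exact schema used in the paper is not given here.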
This report presents a continued study in which ML algorithms were used to predict database popularity...
During the first two years of data taking, the CMS experiment has collected over 20 PetaBytes of dat...
In Content Delivery Networks (CDNs), knowing the popularity of video content h...
The Compact Muon Solenoid (CMS) experiment at the European Organization for Nuclear Research (CERN...
The Compact Muon Solenoid (CMS) experiment at the European Organization for Nuclear Research (CERN) ...
The CMS experiment at the LHC accelerator at CERN relies on its computing infrastructure to stay at ...
This thesis presents a study of the Grid data access patterns in distributed analysis in the CMS exp...
During the LHC Run-1 data taking, all experiments collected large data volumes from proton-proton an...
Recent rapid growth of data traffic in mobile networks has stretched the capability of current netwo...
Efficient distribution of physics data over ATLAS grid sites is one of the most important tasks for ...
Edge caching is an effective solution to reduce delivery latency and network congestion...
The ATLAS Experiment at the LHC generates petabytes of data that is distributed among 160 computing ...
ATLAS (A Toroidal LHC Apparatus) is one of several experiments at the Large Hadron Collider (LHC)...