Frequent multiple-disk failures have become commonplace in big data centers, where thousands of drives operate simultaneously. Disks are typically protected by replication or erasure coding to guarantee a predetermined level of reliability. To optimize data protection, however, real-life disk failure trends need to be modeled appropriately. The classical approach is to estimate the probability density function of failures using non-parametric techniques such as Kernel Density Estimation (KDE). These techniques are suboptimal when the true underlying density function is unknown, and insufficient data may lead to overfitting. In this study, we propose to use a set of transformations to th...
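As a rough illustration of the classical approach mentioned in the abstract, the following sketch estimates a failure-time density with a Gaussian kernel. It is not the method proposed in the study: the function names, the normal-reference bandwidth rule, and the lifetime values are all assumptions chosen for illustration.

```python
import math
import statistics


def rule_of_thumb_bandwidth(samples):
    """Normal-reference rule of thumb: h = 1.06 * sigma * n^(-1/5).

    Assumes the data are roughly Gaussian; needs at least two samples.
    """
    n = len(samples)
    sigma = statistics.stdev(samples)
    return 1.06 * sigma * n ** (-1 / 5)


def gaussian_kde(samples, bandwidth=None):
    """Return a function that evaluates the kernel density estimate at x."""
    h = bandwidth if bandwidth is not None else rule_of_thumb_bandwidth(samples)
    n = len(samples)
    # Normalizing constant so the estimate integrates to one.
    norm = 1.0 / (n * h * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)

    return density


# Hypothetical disk lifetimes in hours (made-up values, not measured data).
lifetimes = [8200.0, 9100.0, 12500.0, 13000.0, 15800.0, 21000.0, 26500.0]
pdf = gaussian_kde(lifetimes)
estimate_at_12000 = pdf(12000.0)  # estimated density at 12,000 hours
```

With only a handful of samples, the estimate depends heavily on the bandwidth. A bandwidth that is too small produces one bump per observation, which is the overfitting risk the abstract points to.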
In the present study, we investigate kernel density estimation (KDE) and its application to the Gumb...
Modern storage systems orchestrate a group of disks to achieve their performance and reliability goa...
I describe in this report an experimental system for using classification and regression trees to ge...
With the prosperity of Big Data, the performance and robustness of storage systems have become ever ...
Data centers use large numbers of hard drives as data storage devices and it is an increasing challe...
It is estimated that over 90% of all new information produced in the world is being stored on magne...
Designing highly dependable systems requires a good understanding of failure characteristics. Unfort...
Today's most reliable data storage systems are made of redundant arrays of inexpensive disks (RAID)....
Component failure in large-scale IT installations such as cluster supercomputers or internet service...
Motivated by three failure data sets (lifetime of patients, failure time of hard drives and failure ...
Archiving and systematic backup of large digital data generates a quick demand for multi-petabyte sc...
Predicting the impending failure of hard disk drives (HDDs) is crucial for preventing essen...
Today, cloud systems provide many key services to development and production environments; reliable ...
Various studies have attempted to predict individual disk failures based on the values of the SMART ...