Modern applications, such as smart cities, home automation, and eHealth, demand new approaches to improving the dependability and availability of cloud applications. Because of the enormous scale and diversity of the cloud environment, most cloud services, at both the hardware and software levels, encounter failures. In this study, we first analyze and characterize the behaviour of failed and completed jobs using publicly accessible traces. We then design and develop a failure prediction model that identifies jobs likely to fail before the failure occurs. The proposed model aims to improve resource utilization and cloud application efficiency. We evaluate the proposed model on three publicly available traces: the Google cluster, Mustang, and Trinity. In addition, ...
Cloud computing is increasingly attracting huge attention both in academic research and industry ini...
The growing complexity and size of High Performance Computing systems (HPCs) lead to frequen...
High performance computing systems can have high failure rates as they feature a large number of ser...
Cloud Services are the on-demand availability of resources like storage, data, and compute power. No...
Cloud failure is one of the critical issues since it can cost millions of dollars to cloud service p...
Most cloud computing clusters are built from unreliable, commercial off-the-shelf components compar...
Cloud computing is a widely adopted platform for executing tasks of different application types that...
Failure in a cloud system is defined as an event that occurs when the delivered service deviates f...
Motivated by frequent failures in cloud computing systems, we analyze failure frequency and failure ...
Cloud computing provides various types of computing utilities where clients pay for services dependi...
Large high-performance computing systems are built with an increasing number of components with more CP...
This work presents models characterizing failures observed during the execution of large sc...
Cloud computing is a novel technology in the field of distributed computing. Usage of Cloud computin...
With the revolution of the internet, new applications have emerged in our daily life. People are depe...
Node downtime and failed jobs in a computing cluster translate into wasted resources and user dissat...