One of the important design criteria for distributed systems and their applications is reliability and robustness to hardware and software failures. The increasing complexity, interconnectedness, and interdependence of their components, which include hardware resources (computers, servers, network devices) and software (application services, middleware, web services, etc.), together with the asynchronous interactions between those components, make fault detection and tolerance a challenging research problem. In this dissertation, we present a self-healing methodology based on the principles of autonomic computing and on statistical and data mining techniques to detect faults (hardware or software) and identify their source. In our approach, we monitor and analyz...
The growing complexity of distributed systems demands new ways of control. This work addresses s...
The rapid advancement of networking, computing, sensing, and control systems has introduced a wide r...
Failures in computing systems are unavoidable. Therefore, it is important to detect and diagnose fai...
The complexity of systems is considered an obstacle to the progress of the IT industry. Autonomic co...
Networked computer systems continue to grow in scale and in the complexity of their components and i...
Funding for this research was provided by the Scottish Informatics and Computer Science Alliance. Aut...
Large-scale decentralized systems of autonomous agents interacting via asynchronous communication of...
The initiatives Organic Computing and Autonomic Computing introduced challenging visions ...
The increasing complexity of distributed enterprise systems has made the task of managing t...
A novel approach to application fault recovery based on autonomic computing works by accurately moni...
A main requirement for the Future Internet is to enable self-management behaviors facilitating the n...
Complex distributed Internet services form the basis not only of e-commerce but increasingly of miss...
Distributed systems have become pervasive in current society. From laptops and mobile phones, to ser...
This research was partially supported by the Scottish Informatics and Computer Science Alliance (SIC...
With the increasing complexity of data center networks, the operations, management and diagnosis of ...