OSIRIS is a middleware for the composition and orchestration of distributed web services. It follows a decentralized P2P approach to process execution, which already provides some degree of resilience to faults and high performance in large-scale computational clusters. In this paper, we present ongoing work aimed at improving OSIRIS' fault tolerance capabilities. We introduce new architectural elements into OSIRIS for maintaining a virtual stable storage and monitoring the activities of service instances, together with algorithms that allow execution to survive failures that the system currently cannot cope with.
We present Bungie, an approach based on application-level protocols that precisely capture the causal...
The capability of dynamically adapting to distinct run-time conditions is an important is...
Traditionally, fault-tolerant systems assume that failures are independent, often expressed as a thr...
Workflows provide an easy to use programming model for the construction of complex services that are...
The functionality of applications is increasingly being made available by services. General concepts...
As human dependence on computing technology increases, so does the need for computer system dependab...
Clusters of message-passing computing nodes provide high-performance platforms for distributed appli...
We present a new software architecture in which all concepts necessary to achieve fault tolerance ca...
Today’s software engineering and application development trend is to take advantage of reusable soft...
Embedded high performance computing is being called upon to provide critical computing ...
Future information spaces such as digital libraries require new infrastructures that allow to use an...
The proliferation of software and hardware sensors which continuously create large amounts of data h...
Fault tolerance is the ability of a system to continue its functionality despite the presence of fau...
The FARGOS/VISTA™ suite of technologies implements an infrastructure for the development, deploymen...
Crash and omission failures are common in service providers: a disk can break down or a link can fai...