We describe the architecture, operational practices, and failure characteristics of three very large-scale Internet services. Our research on architecture and operational practices took the form of interviews with architects and operations staff at those (and several other) services. Our research on component and service failure took the form of examining the operations problem tracking databases from two of the services and a log of service failure post-mortem reports from the third. Architecturally, we find convergence on a common structure: division of nodes into service front-ends and back-ends, multiple levels of redundancy and load-balancing, and use of custom-written software for both production services and administrative tools. Op...
The past 20 years have seen the Internet evolve from a network connecting academics, to a critical p...
Problems with digital services still occur at times, even for the most reliable services. Considerin...
We present a client-based characterization of end-to-end Internet faults. Unlike prior st...
Internet services are the newest and arguably the most commercially important class of systems requiring 24x7...
Service composition provides a flexible way to quickly enable new application functionalities in nex...
The increasing complexity of computer networks and our increasing dependence on them means enforcing...
We present a study of end-to-end web access failures in the Internet. Part of our characterization o...
Internet technology advances have benefited society and increased our productivity, but have also mad...
Despite the ‘dangers’ posed by e-service failures, there has not been a study to date that explores ...
Distributed systems have changed the face of the world. When your web browser connects to a web serv...
This report investigates the causes and prevalence of failure in Web applications. Data was collecte...
An oft-repeated adage among telecommunication providers goes, “There are five things that matter: reli...
© 27th European Conference on Information Systems - Information Systems for a Sharing Society, ECIS ...