This thesis discusses techniques for improving the fault tolerance of multithreaded applications. We consider how sharing an address space and resources affects fault tolerance methods. We develop techniques in two broad categories: conservative multithreaded fault-tolerance (C-MTFT), which recovers an entire application on the failure of a single thread, and optimistic multithreaded fault-tolerance (OMTFT), which recovers threads independently as necessary. In the latter category, we provide a novel approach that recovers hung threads and improves recovery time by managing access to shared resources, so that hung threads can be restarted while other threads continue execution.
Computer Science
Technology scaling has led to growing concerns about reliability in micro-processors. Currently, fau...
Distributed Shared Memory (DSM) systems are becoming increasingly significant as a result of be...
The widespread popularity of languages allowing explicitly parallel, multi-threaded programming, e.g...
For the last 40 years, the systems community has invested a lot of effort in designing technique...
The advent of multicore architecture has increased the demand for multithreaded programs. It is noto...
Chip multiprocessors (CMPs) commonly share a large portion of memory system resources among dif...
As machine sizes have increased and application runtimes have lengthened, research into fault tolera...
Fault tolerance in distributed shared memory through replication has yet to be explored. This resear...
Recent increases in hard fault rates in modern chip multi-processors have led to a variety of approa...
Thesis (Ph.D.) - Indiana University, Computer Sciences, 2010. Scientists use advanced computing techni...
Graduation date: 1995. There appears to be a broad agreement that high-performance computers of the fu...
Large machines with tens or even hundreds of thousands of processors are currently in use. As the nu...
Our accelerating computational demand and the rise of multicore hardware have made parallel programs...
The objective of this work is to investigate the algorithm design and the programming model of mult...