Hardware techniques to improve the robustness of a computing system can be very expensive and difficult to implement and validate. Moreover, they require long evaluation processes that could lead to the redesign of the hardware itself when reliability requirements are not satisfied. This chapter covers software techniques that improve the tolerance of the system to hardware faults by acting at the software level only. We cover recently proposed approaches to detect and correct transient and permanent faults.
As chip technology keeps on shrinking towards higher densities and lower operating voltages, memo...
With continued CMOS scaling, future shipped hardware will be increasingly vulnerable to in-the-field...
134 p. Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 2009. In summary, this dissertation...
Preprint (before reviews) version of the book chapter "Design techniques to improve the resilience o...
Reliability has always been a major concern in designing computing systems. However, the increasing ...
To meet an insatiable consumer demand for greater performance at less power, silicon technology has ...
Nanoscale technology nodes bring reliability concerns back to the center stage of digital system des...
A reliable computer system needs to provide its normal level of service in the presence of hardware ...
Transient faults are emerging as a critical reliability concern for modern microprocessors. Recentl...
According to Moore’s law, technology scaling is continuously providing smaller and faster devices. T...
The theory of fault-tolerant computer design has developed rapidly. Several techniques using hardwar...
With continued CMOS scaling, future shipped hardware will be increasingly vulnerable to in-the-field...
Fault-tolerant computing began between 1965 and 1970, probably with the highly reliable ...
Resilient computing is defined as the ability of a system to stay dependable w...
This report provides an introduction to resilience methods. The emphasis is on checkpointing, the de...