Building reliable computer systems from unreliable hardware and buggy software is a great challenge. On one hand, software bugs account for as much as 40% of system failures and cost the US economy an estimated $59.5B a year. On the other hand, under current technology-scaling trends, transient faults (also known as soft errors) in the underlying hardware are predicted to grow at least in proportion to the number of devices being integrated, which further exacerbates the problem of system reliability. We propose several methods to improve system reliability, both by detecting and correcting soft errors and by facilitating software debugging. In our first approach, we detect instruction-level anomalies...
There are many ways to find bugs in programs. For example, observed input and output values can be c...
With continued CMOS scaling, future shipped hardware will be increasingly vulnerable to in-the-field...
As hardware performance and dependability have dramatically improved in the past few decades, the so...
Debugging software is challenging because of the increasing complexity of software and hardware, and...
Intermittent hardware faults are hard to diagnose as they occur non-deterministically. Hardware-only...
Soft errors (or transient faults) are temporary faults that arise in a circuit due to a variety of i...
As silicon technology continues to scale down and validation expenses continue to increase,...
The ever-increasing parallelism in computer systems has made software more prone to concurrency fail...
In this paper we propose a unified architectural support that can be used flexibly for either soft-e...
According to Moore’s law, technology scaling is continuously providing smaller and faster devices. T...
Intermittent hardware faults are hard to diagnose as they occur non-deterministically at th...
Recent impressive performance improvements in computer architecture have not led to significant gain...