Abstract: Soft errors are emerging with the ongoing reduction of structure sizes in current and future hardware designs. This problem is generally tackled by employing fault detection or tolerance measures from an application's point of view. At the same time, research has begun to harden the operating system, often considered the remaining single point of failure. Certainly, these measures can effectively treat the symptoms of hardware faults. However, we argue that the operating system design per se can offer an intrinsic resilience against errors. Dynamic operating system designs, often offering Unix-like interfaces, are obliged to cope with pointers and list-based data structures to provide the demanded flexibility. In contrast, e...
Despite many decades of research, the management of errors in a live operating system remains a chal...
Nowadays, reliability has become one of the main issues for safety-critical embedded systems, li...
Embedded systems are increasingly deployed in harsh environments that their components were not nece...
Abstract—Because of shrinking structure sizes and operating voltages, computing hardware exhibits an...
Abstract—Memory errors are a major source of reliability problems in current computers. Undetected e...
Reliability is of great concern to the scalability of extreme-scale systems. Of particular concern a...
The computing continuum is currently facing a growth in terms of devices ...
Transient hardware faults have become one of the major concerns affecting the reliability of modern ...
Failing hardware is a fact and trends in microprocessor design indicate that the fraction of hardwar...
Thesis (Ph.D.), University of Rochester, Dept. of Electrical and Computer Engineering, 2012. In moder...
With continued CMOS scaling, future shipped hardware will be increasingly vulnerable to in-the-field...
Technology scaling of integrated circuits is making transistors increasingly sensitive to process va...
As machines increase in scale, it is predicted that failure rates of supercomputers will correspondi...
Abstract: For embedded systems, the use of software-based error detection and correction approaches ...
Unpredictable hardware faults and software bugs lead to application crashes, incorrect computations,...