Design and Evaluation of a Master/Checker Method for an Embedded Processor

Ebrahimi, Mojtaba | 2010

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 40951 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Miremadi, Ghasem
  7. Abstract: The ever-increasing use of embedded systems has motivated designers to pay special attention to the design requirements of such systems. Among embedded applications, safety-critical systems have high reliability requirements, because failures in such systems may endanger human life or have catastrophic consequences. Embedded processors, as the computation cores of embedded systems, are crucial from a reliability point of view, because a failure in the processor most probably leads to a system failure. One effective way to protect embedded processors against environmental faults is to use system-level fault-tolerant techniques such as Master/Checker (M/C) or Triple Modular Redundancy (TMR). However, the inability of these techniques to recover from errors makes them inefficient in safety-critical systems that work in harsh environments for long periods of time. In this thesis, two error recovery techniques are proposed, namely ScTMR and ScMC. ScMC is a rollback recovery technique for M/C-based systems. In this technique, the scan chains in a processor are reused to store the correct states of the processor in a checkpointing memory. Once an error is detected by the M/C, the latest checkpoint is copied to both the master and checker units using the scan chains. ScTMR is a roll-forward error recovery technique for TMR-based systems. In this technique, upon detecting an error, the correct processor state is copied from one of the error-free modules to the erroneous module. Both the ScMC and ScTMR techniques are also capable of detecting permanent faults. When a permanent fault is detected, the ScMC technique stops the operation, while the ScTMR technique degrades the system to the master/checker configuration by disregarding the faulty module. The main attribute of both proposed techniques is that the error recovery time is negligible compared to similar techniques.
     The proposed techniques are implemented on a SPARC-architecture-based processor. To evaluate the reliability of the ScMC and ScTMR techniques, we conducted simulation-based fault injection experiments. The experimental results reveal that the proposed techniques impose very low area and performance overheads on the conventional M/C and TMR techniques.

  8. Keywords: Fault Injection; Embedded Processor; Master-Checker Method; TMR Method; Scan Chain
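
The two recovery ideas described in the abstract can be illustrated with a small software model. The sketch below is not the thesis implementation: the processor state is reduced to a single 16-bit integer, the scan-chain transfer is modelled as a plain copy, and permanent-fault detection is left out. The names `step`, `run_scmc`, and `sctmr_recover`, as well as the state-update rule, are hypothetical and chosen only for illustration.

```python
from collections import Counter


def step(state):
    """Toy deterministic 'processor' step on a 16-bit state (hypothetical rule)."""
    return (state * 5 + 3) & 0xFFFF


def run_scmc(n_steps, checkpoint_interval, faults):
    """Master/checker pair with rollback recovery (transient faults only).

    `faults` maps a step index to a bit mask that is XORed once into the
    master's state right after that step executes.
    """
    master = checker = checkpoint = 0
    checkpoint_step = 0
    pending = dict(faults)  # each transient fault fires only once
    rollbacks = 0
    i = 0
    while i < n_steps:
        master, checker = step(master), step(checker)
        if i in pending:
            master ^= pending.pop(i)  # inject the transient fault
        i += 1
        if master != checker:
            # Comparator mismatch: restore both units from the last checkpoint
            # (in the thesis this copy goes through the scan chains).
            master = checker = checkpoint
            i = checkpoint_step
            rollbacks += 1
        elif i % checkpoint_interval == 0:
            # Both units agree, so the state is saved as the new checkpoint.
            checkpoint, checkpoint_step = master, i
    return master, rollbacks


def sctmr_recover(states):
    """Roll-forward recovery for three redundant module states.

    Returns (repaired_states, faulty_index); faulty_index is None when all
    three modules agree.
    """
    value, votes = Counter(states).most_common(1)[0]
    if votes == 3:
        return list(states), None
    if votes < 2:
        raise RuntimeError("no majority among the three modules")
    faulty = next(k for k, s in enumerate(states) if s != value)
    # Copy the state of an error-free module into the erroneous one.
    return [value] * 3, faulty
```

In this model, `run_scmc(20, 4, {9: 0x10})` ends in the same state as a fault-free run and reports one rollback, while `sctmr_recover([7, 7, 9])` repairs the third module without re-executing any steps. That is the practical difference between rollback and roll-forward recovery.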
