
Improving Fault Tolerance in Safety-Critical Distributed Automotive Communication Networks

Sedaghat, Yasser | 2011

  1. Type of Document: Ph.D. Dissertation
  2. Language: Farsi
  3. Document No: 42063 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Miremadi, Ghassem
  7. Abstract:
  8. Distributed embedded systems are now widely used in safety-critical automotive applications such as Steer-By-Wire and Brake-By-Wire. Motivated by the present and future needs of the Iranian automotive industry, this Ph.D. thesis improves fault tolerance in safety-critical distributed automotive communication networks. The thesis consists of three research layers. In the first layer, automotive communication protocols and their architectures were studied comprehensively, and the FlexRay communication protocol was selected for use in safety-critical distributed embedded systems. In the second layer, the fault tolerance of the FlexRay protocol was evaluated to identify its vulnerabilities and weak points. For this evaluation, a FlexRay bus network composed of four nodes was modeled. Using simulation-based fault injection, a total of 135,600 transient single-bit flip faults and a total of 112,600 transient double-bit flip faults were injected into all 408 registers of the FlexRay communication controller in one node. The results show that about 6.9% of the injected single-bit flip faults and about 5.7% of the injected double-bit flip faults led to at least one error in the network. The thesis discusses why faults in some registers were frequently activated while faults in other registers were overwritten. In addition, the error types were identified and the occurrence rate of each error type was calculated. In the third layer, the fault tolerance of the FlexRay communication controller was improved by identifying and resolving single points of failure in the controller. A behavioral monitoring technique was also proposed to distinguish follow-up errors, which originate from a faulty sending node, from original errors, which originate from a fault in the communication controller.
A comparison between the fault tolerance improvement and the hardware overhead is also provided. It shows that protecting all vulnerable registers in the FlexRay controller incurs a hardware overhead of only about 11%. Finally, a central bus guardian technique based on the FlexRay specifications was implemented, and its fault tolerance was evaluated by injecting 47,250 transient single-bit flip faults. The results show that about 22% of the injected faults resulted in at least one error in the network. Two fundamental design flaws in the FlexRay central bus guardian, from the fault tolerance point of view, were also identified and resolved.

  9. Keywords:
  10. Fault Tolerance ; Distributed Embedded System ; Safety-Critical Application ; FlexRay Protocol
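
The abstract describes injecting transient single-bit and double-bit flip faults into controller registers. As an illustration of that general idea only, the following minimal Python sketch flips one or two bits in a register value. It is not the thesis's simulation setup: the 16-bit register width, the register names, the helper functions `flip_bits` and `inject_transient_fault`, and the choice to place both bits of a double-bit fault in the same register are all assumptions made for this example.

```python
import random

# Assumed register width; the abstract does not state register sizes.
REGISTER_WIDTH = 16


def flip_bits(value, positions, width=REGISTER_WIDTH):
    """Return `value` with each bit position in `positions` inverted."""
    for pos in positions:
        if not 0 <= pos < width:
            raise ValueError(f"bit position {pos} outside register width {width}")
        value ^= 1 << pos
    # Keep the result within the register width.
    return value & ((1 << width) - 1)


def inject_transient_fault(registers, n_bits, rng):
    """Flip `n_bits` distinct bits in one randomly chosen register, in place.

    `registers` is a hypothetical dict mapping register names to integer
    values. Returns the chosen register name and the flipped bit positions.
    """
    name = rng.choice(sorted(registers))
    positions = rng.sample(range(REGISTER_WIDTH), n_bits)
    registers[name] = flip_bits(registers[name], positions)
    return name, positions


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so a run can be repeated
    regs = {"reg_a": 0x0000, "reg_b": 0x00FF}  # hypothetical registers
    print(inject_transient_fault(regs, n_bits=1, rng=rng))  # single-bit flip
    print(inject_transient_fault(regs, n_bits=2, rng=rng))  # double-bit flip
```

In a simulation-based campaign, a step like this would typically be followed by running the network model and comparing its behavior against a fault-free run to decide whether the fault led to an error.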
