Performance Analysis of Checkpointing and Dual Modular Redundancy for Fault Tolerance of Real-Time Control System

Performance Analysis of Dual Modular Redundancy and Checkpointing Techniques for Overcoming Faults in Real-Time Control Systems

  • 유상문 (School of Electronics and Information Engineering, Kunsan National University)
  • Published : 2008.04.01

Abstract

This paper analyzes the performance of real-time control systems that employ dual modular redundancy (DMR) to detect transient errors and a checkpointing technique to tolerate them. Transient errors are caused by transient faults and are the most significant type of error in reliable computer systems. Transient faults are assumed to occur according to a Poisson process and to be detected by the dual modular redundant structure. In addition, an equidistant checkpointing strategy is considered. The probability of successful task completion is derived for a real-time control system in which periodic checkpointing operations are performed during the execution of a real-time control task. Numerical examples show how the checkpointing scheme influences the probability of task completion. In addition, the analytical result is compared with simulation results.
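The kind of analysis the abstract describes can be illustrated with a small sketch. It is not the paper's derivation; it is a simplified model built on assumptions that are not in the abstract: the task needs `T` time units of fault-free execution and has deadline `D`; `n` equidistant checkpoints each cost `c`, which includes the DMR state comparison; a rollback costs `r`; `lam` is the combined Poisson rate at which a transient fault hits either module; every fault is caught at the next comparison; and faults during rollback are ignored. Under these assumptions the number of re-executed segments follows a negative binomial distribution, which gives a closed-form completion probability that can be checked against a Monte Carlo run. All function and parameter names are hypothetical.

```python
import math
import random


def completion_probability(T, D, n, lam, c, r):
    """Analytic P(task finishes by deadline D) under the simplified model."""
    seg = T / n + c                  # one segment plus its checkpoint/compare
    p = math.exp(-lam * seg)         # probability an attempt sees no fault
    slack = D - n * seg              # time available for re-executions
    if slack < 0:
        return 0.0                   # even a fault-free run misses the deadline
    k_max = int(slack // (seg + r))  # most re-executions the deadline allows
    # Failed attempts before n successful segments: negative binomial.
    return sum(math.comb(n + k - 1, k) * p**n * (1 - p)**k
               for k in range(k_max + 1))


def simulate(T, D, n, lam, c, r, runs=20000, seed=0):
    """Monte Carlo estimate of the same probability (requires lam > 0)."""
    rng = random.Random(seed)
    seg = T / n + c
    ok = 0
    for _ in range(runs):
        t, done = 0.0, 0
        while done < n and t <= D:
            # Memoryless faults: draw a fresh time-to-fault for each attempt.
            if rng.expovariate(lam) > seg:
                t += seg             # segment committed at the checkpoint
                done += 1
            else:
                t += seg + r         # mismatch found at compare, roll back
        ok += (done == n and t <= D)
    return ok / runs
```

Sweeping `n` for fixed `T`, `D`, `lam`, `c`, and `r` shows the trade-off the abstract alludes to: too few checkpoints make each re-execution expensive, while too many spend the slack on checkpoint overhead.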
