Techniques for System Dependability Evaluation

  • Jogesh K. Muppala
  • Ricardo M. Fricks
  • Kishor S. Trivedi
Part of the International Series in Operations Research & Management Science book series (ISOR, volume 24)

Abstract

A major application area for the probabilistic and numerical techniques explored in the earlier chapters is characterizing the behavior of complex computer and communication systems. System performance has received considerable attention in the past, but system dependability is increasingly gaining importance, driven in no small measure by the proliferation of computer and computer-based communication systems. This chapter is thus a step toward summarizing the techniques, tools, and recent developments in the field of system dependability evaluation.

Keywords

State Space Model, Fault Tree, Computational Probability, Continuous Time Markov Chain, Reward Rate
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer Science+Business Media New York 2000

Authors and Affiliations

  • Jogesh K. Muppala (1)
  • Ricardo M. Fricks (2)
  • Kishor S. Trivedi (3)

  1. Dept. of Computer Science, The Hong Kong University of Science and Technology, Hong Kong
  2. SIMEPAR, The Meteorological System of Paraná, Paraná State Power Company, Curitiba, Brazil
  3. Dept. of Electrical and Computer Engineering, Duke University, Durham, USA
