EE 7765 Distributed Computing System Reliability

General: Units 3 hrs.

Catalog Description

Prereq: EE 3720 and EE 3140, or equivalent.

Topics:

    Reliability, availability, and maintainability: concepts, definitions, and analysis; MIL and Bellcore standards; reliability prediction for electronic equipment; parts count technique; reliability modeling; reliability block diagrams (RBD) and fault trees; redundancy techniques: static/dynamic, hot/cold, self-purging, and sift-out modular redundancy.

    Multiprocessor and multicomputer system reliability; path, cutset, and spanning tree enumeration; terminal, multiterminal, and constrained reliability measures; multimode and dependent failure analysis; case study: VAX cluster system.

    File distribution and reliability, graceful degradation, performability analysis.

    Reliability polynomial, reliability bounds, combining the bounds, complexity analysis.

    Introduction to reliable network synthesis.

    Introduction to software reliability techniques.

Reference Books

  1. S. Rai and D.P. Agrawal, Distributed Computing Network Reliability, Computer Society Press, Washington DC, 1990.

  2. S. Rai and D.P. Agrawal, Advances in Distributed System Reliability, Computer Society Press, Washington DC, 1990.

  3. C.J. Colbourn, The Combinatorics of Network Reliability, Oxford University Press, 1987.

  4. R.A. Sahner, K.S. Trivedi, and A. Puliafito, Performance and Reliability Analysis of Computer Systems, Kluwer Academic Publishers, 1996.

  5. Recent journal articles.