How Autonomic Fault-Management Can Address Current Challenges in Fault-Management Faced in IT and Telecommunication Networks

  • Ranganai Chaparadza
  • Nikolay Tcholtchev
  • Vassilios Kaldanis
Conference paper
Part of the Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering book series (LNICST, volume 63)


In this paper we discuss the perspectives that the research community should take into account when evolving Fault-Management towards Autonomic Fault-Management. The well-known and established FCAPS Management Framework, which covers Fault-management, Configuration-management, Accounting-management, Performance-management and Security-management, assumes that human technicians are involved in the management of systems and networks, as is the practice today. Due to the growing complexity of networks, services and the management of both, it is now widely believed within academia and industry that the concept of Self-Managing Networks will address some of the current challenges in the management of networks and services. Emerging Self-Management technologies promise to reduce OPEX for the network operator. A lot of work remains before we can see advanced, production-level self-manageability of systems and networks that goes beyond what has been achieved through the scripting-based automation techniques successfully applied to management and network operation processes. The concept of autonomicity (realized through control-loop structures, feedback mechanisms and processes, and the information/knowledge flow used to drive the control-loops) becomes an enabler for such advanced self-manageability of networks and services. A control-loop can be introduced to bind the processes involved in each of the FCAPS areas. The “autonomic manager components” that drive the control-loops, each specific to a different FCAPS area, should interwork with each other in order to close the gaps characterized by dependencies among the FCAPS functional areas as these areas go autonomic and realize self-management.
The dependencies among FCAPS functional areas need to be studied so that the functions/operations and processes that belong to the different areas can be well interconnected to achieve global system goals, such as integrity, resilience and a high degree of guaranteed system and service availability.
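As a rough illustration of the idea of one control-loop per FCAPS area with interworking autonomic managers, the following minimal Python sketch may help. It is not taken from the paper or from the GANA reference model: the class `AutonomicManager`, its `policy` table, the `peers` list, and the symptom and action names are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AutonomicManager:
    """One control loop for a single FCAPS area (hypothetical sketch)."""
    area: str
    # Assumed policy table: observed symptom -> corrective action name.
    policy: Dict[str, str]
    # Managers of other FCAPS areas that depend on this one.
    peers: List["AutonomicManager"] = field(default_factory=list)
    log: List[str] = field(default_factory=list)

    def run_once(self, symptoms: List[str]) -> List[str]:
        """One loop iteration: analyse symptoms, decide actions, notify peers."""
        actions = []
        for symptom in symptoms:
            action = self.policy.get(symptom)
            if action is None:
                # No known remedy: hand the symptom over to the human operator.
                action = f"escalate:{symptom}"
            actions.append(action)
            self.log.append(f"{self.area}: {symptom} -> {action}")
            # Interworking step: tell dependent managers what was decided.
            for peer in self.peers:
                peer.on_peer_event(self.area, symptom, action)
        return actions

    def on_peer_event(self, source: str, symptom: str, action: str) -> None:
        """Record a decision taken by the manager of another FCAPS area."""
        self.log.append(
            f"{self.area}: notified by {source} of {symptom} ({action})"
        )


# Usage: the fault manager reacts to symptoms and informs the
# configuration manager, which depends on its decisions.
fault = AutonomicManager("fault", {"link_down": "reroute_traffic"})
config = AutonomicManager("configuration", {})
fault.peers.append(config)
actions = fault.run_once(["link_down", "unknown_alarm"])
```

In this sketch the dependency between functional areas is reduced to a simple notification; a real design would also need to coordinate conflicting actions across the loops.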


  • Autonomic Fault-Management
  • GANA architectural Reference Model for Autonomic Networking and Self-Management
  • Resilience
  • Self-Healing/Self-Repair
  • dependencies among FCAPS functional areas
  • Interactions between the Operator and the Autonomic Network





Copyright information

© ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering 2011

Authors and Affiliations

  • Ranganai Chaparadza (1)
  • Nikolay Tcholtchev (1)
  • Vassilios Kaldanis (2)
  1. Fraunhofer FOKUS Institute for Open Communication Systems, Berlin, Germany
  2. VELTI S.A. - Mobile Marketing & Advertising, Athens, Greece
