Localizing Failures with Metadata

Abstract

This paper constructs a language for representing data in the task of remotely localizing failures and errors in a distributed information and computing system. The main idea is to map sensor data onto a directed graph induced by the influence that some components of the system exert on others. The main results are conditions under which implicit failures and errors can be localized unambiguously from information supplied by sensors that have detected anomalies in certain processes.
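
The idea of mapping sensor data onto an influence graph can be illustrated with a minimal sketch. It is not the authors' construction: it assumes a single failed component, assumes that a failure propagates to every component reachable along directed edges, and assumes that each sensor reliably flags an anomaly at the component it watches. The component names, the `INFLUENCE` graph, the `SENSORS` set, and both helper functions are hypothetical. Under these assumptions, localization is unambiguous exactly when one component's set of affected sensors matches the observed anomalies.

```python
from collections import deque

# Hypothetical influence graph: an edge a -> b means "a influences b".
INFLUENCE = {
    "db": ["api"],
    "api": ["web"],
    "cache": ["web"],
    "web": [],
}

# Hypothetical sensor placement: components whose anomalies can be observed.
SENSORS = {"db", "api", "web"}


def reachable(graph, start):
    """Return all nodes reachable from start, including start itself (BFS)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def candidate_sources(graph, sensors, anomalous):
    """List components whose failure would trigger exactly the anomalous sensors.

    Assumption (not from the paper): one failure, full downstream propagation,
    and error-free sensors.
    """
    nodes = set(graph) | {n for targets in graph.values() for n in targets}
    observed = set(anomalous)
    return [
        node
        for node in sorted(nodes)
        if reachable(graph, node) & set(sensors) == observed
    ]


# One candidate: the failure is localized unambiguously.
unambiguous = candidate_sources(INFLUENCE, SENSORS, {"api", "web"})

# Two candidates: "cache" has no sensor, so it cannot be told apart from "web".
ambiguous = candidate_sources(INFLUENCE, SENSORS, {"web"})
```

In this toy configuration, anomalies at `api` and `web` point to `api` alone, whereas an anomaly at `web` only leaves both `cache` and `web` as candidates. Placing a sensor on `cache` would give every component a distinct set of affected sensors and remove the ambiguity.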

Author information

Corresponding author

Correspondence to A. A. Grusho.

Ethics declarations

CONFLICT OF INTEREST

The authors declare that they have no conflicts of interest.

FUNDING

This work was partially supported by the Russian Foundation for Basic Research, projects no. 18-07-00274, 18-29-03102.

About this article

Cite this article

Grusho, N.A., Grusho, A.A. & Timonina, E.E. Localizing Failures with Metadata. Aut. Control Comp. Sci. 54, 988–992 (2020). https://doi.org/10.3103/S0146411620080143

Keywords:

  • localization of failures and errors
  • metadata
  • remote system administration