Detection of causally anomalous time-series

Abstract

Many complex and important real-life applications, such as surveillance, monitoring and fraud detection, need to identify entire time-series within a given collection as anomalous. In this paper, we formulate this inter-time-series anomaly detection problem and propose a solution to it. The problem differs from the usual intra-time-series anomaly detection, which identifies an anomalous “region” within a single given time-series. We formulate the notion of causally anomalous multivariate time-series and propose algorithms to identify them in a given database, using well-established notions of both linear and nonlinear Granger causality. The idea is to take causal relations that are expected to hold between the univariate time-series corresponding to individual attributes (known either from domain knowledge or from being frequently observed in the data) and to flag as anomalous those time-series in which this expected causality is violated. We use the proposed algorithms to detect causally anomalous time-series in several public datasets from different domains, such as economics, engineering, and medicine. Our experiments show that the causally anomalous time-series are not detected by strong baseline algorithms, indicating that this is a new notion of anomaly that complements the more standard formulations of what makes a time-series anomalous. We then present a detailed real-life case study from a large stock exchange, where these techniques were used to identify agents with suspicious order behavior. We also point out limitations of the proposed notion of causally anomalous time-series.
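The core idea of the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, which is not given on this page, and it covers only the linear case. It assumes each database entry is a pair of univariate series `(x, y)` for which "x Granger-causes y" is the expected relation, and it tests that relation with the standard F statistic comparing an autoregression of `y` on its own lags against one that also includes lags of `x`. The function names, the single lag, and the threshold of 4.0 (roughly the 5% critical value of an F distribution with 1 numerator degree of freedom and many denominator degrees of freedom) are all illustrative assumptions.

```python
import random


def _solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def _rss(rows, y):
    """Residual sum of squares of an ordinary least squares fit of y on rows."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    beta = _solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, y))


def granger_f(x, y, lag=1):
    """F statistic for the hypothesis 'x Granger-causes y' using `lag` lags."""
    n = len(y)
    target = y[lag:]
    # Restricted model: y_t explained by a constant and its own past values.
    restricted = [[1.0] + [y[t - k] for k in range(1, lag + 1)]
                  for t in range(lag, n)]
    # Full model: additionally uses past values of x.
    full = [r + [x[t - k] for k in range(1, lag + 1)]
            for r, t in zip(restricted, range(lag, n))]
    rss_r, rss_f = _rss(restricted, target), _rss(full, target)
    dof = len(target) - (2 * lag + 1)
    return ((rss_r - rss_f) / lag) / (rss_f / dof)


def causally_anomalous(database, threshold=4.0):
    """Return keys whose expected x -> y causality is NOT detected (hypothetical rule)."""
    return [key for key, (x, y) in database.items()
            if granger_f(x, y) < threshold]


# Synthetic demonstration data (not from the paper).
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(300)]
follows = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, 300)]
ignores = [rng.gauss(0, 1) for _ in range(300)]
database = {"follows_x": (x, follows), "ignores_x": (x, ignores)}
flagged = causally_anomalous(database)
print(flagged)
```

Here `follows_x` is built so that `y` depends on the previous value of `x`, while `ignores_x` is independent noise, so the second entry is the one that would typically be flagged. A fuller treatment would replace the fixed threshold with a p-value, choose the lag order from the data, and add a nonlinear causality test.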

Notes

  1. https://www.kaggle.com/wkirgsn/electric-motor-temperature.

  2. https://databank.worldbank.org/source/world-development-indicators/.

  3. These data are proprietary and we are unable to share them.

Author information

Corresponding author

Correspondence to Manoj Apte.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Apte, M., Vaishampayan, S. & Palshikar, G.K. Detection of causally anomalous time-series. Int J Data Sci Anal (2021). https://doi.org/10.1007/s41060-021-00248-2

Keywords

  • Anomaly detection
  • Time-series
  • Granger causality
  • Stock market frauds
  • Stock market order book surveillance