Abstract
Many complex and important real-life applications, such as surveillance, monitoring and fraud detection, need to identify entire time-series in a given collection as anomalous. In this paper, we formulate this inter-time-series anomaly detection problem and propose a solution for it. The problem differs from the usual intra-time-series anomaly detection, which identifies an anomalous “region” within a single given time-series. We formulate the notion of causally anomalous multivariate time-series, and propose algorithms to identify them in a given database, using well-established notions of both linear and nonlinear Granger causality. The idea is to use causal relations, either known from domain knowledge or frequently observed in the data, that hold between the univariate time-series corresponding to individual attributes, and to identify as anomalous those time-series in which this expected causality is violated. We use the proposed algorithms to detect causally anomalous time-series in several public datasets from different domains, such as economics, engineering, and medicine. Our experiments show that the causally anomalous time-series are not detected by strong baseline algorithms, indicating that this is a new notion of anomaly that complements the more standard formulations of what makes a time-series anomalous. We then present a detailed real-life case study in a large stock exchange, where these techniques were used to identify agents with suspicious order behavior. We also point out limitations of the proposed notion of causally anomalous time-series.
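To make the core idea concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes a database of bivariate time-series in which attribute x is expected to Granger-cause attribute y, computes the standard linear Granger F statistic for each series (restricted versus unrestricted autoregression), and flags the series where the statistic falls below a cutoff. The synthetic data, the names `granger_f` and `make_series`, the lag of 1 and the cutoff `F_CUTOFF` are all illustrative assumptions; the nonlinear Granger test mentioned above is not covered.

```python
import random

def solve(A, b):
    """Solve a small linear system A x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def rss(rows, targets):
    """Residual sum of squares of an ordinary least-squares fit (normal equations)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))

def granger_f(x, y, lag=1):
    """F statistic for the hypothesis 'x Granger-causes y' using `lag` past values."""
    times = range(lag, len(y))
    targets = [y[t] for t in times]
    # Restricted model: y_t explained by its own past only.
    restricted = [[1.0] + [y[t - k] for k in range(1, lag + 1)] for t in times]
    # Unrestricted model: additionally uses the past of x.
    full = [r + [x[t - k] for k in range(1, lag + 1)]
            for r, t in zip(restricted, times)]
    rss_r, rss_u = rss(restricted, targets), rss(full, targets)
    dof = len(targets) - (2 * lag + 1)
    return ((rss_r - rss_u) / lag) / (rss_u / dof)

def make_series(n, causal, rng):
    """Synthetic bivariate series; if `causal`, y depends on the previous x."""
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [rng.gauss(0, 0.5)]
    for t in range(1, n):
        drive = 0.8 * x[t - 1] if causal else 0.0
        y.append(drive + rng.gauss(0, 0.5))
    return x, y

rng = random.Random(0)
# Hypothetical database: s0..s3 obey the expected relation x -> y, s4 does not.
database = {f"s{i}": make_series(200, causal=(i < 4), rng=rng) for i in range(5)}

F_CUTOFF = 10.0  # illustrative cutoff, not a value taken from the paper
scores = {name: granger_f(x, y) for name, (x, y) in database.items()}
flagged = [name for name, f in scores.items() if f < F_CUTOFF]
print(scores)
print("causally anomalous candidates:", flagged)
```

In practice one would likely convert the F statistic to a p-value and choose the lag by a model-selection criterion; the fixed cutoff here only keeps the sketch dependency-free.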
Notes
This is proprietary data, and we are unable to share it.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Apte, M., Vaishampayan, S. & Palshikar, G.K. Detection of causally anomalous time-series. Int J Data Sci Anal 11, 141–153 (2021). https://doi.org/10.1007/s41060-021-00248-2
DOI: https://doi.org/10.1007/s41060-021-00248-2