Abstract
Existing methods for finding correlations between bursty time series are limited to collections consisting of a small number of time series. In this paper, we present a novel approach for mining correlations in collections consisting of a large number of time series. In our approach, we use bursts co-occurring in different streams as the measure of their relatedness. By exploiting the pruning properties of our measure, we develop new indexing structures and algorithms that allow for efficient mining of related pairs from millions of streams. An experimental study performed on a large time series collection demonstrates the efficiency and scalability of the proposed approach.
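To make the idea of burst co-occurrence as a relatedness measure concrete, the following is a minimal sketch and not the measure defined in the paper, which the abstract does not specify. It assumes each stream has already been reduced to a sorted list of non-overlapping burst intervals, and it scores a pair of streams with a Jaccard-style ratio of shared burst time to total burst time. The names `Interval`, `overlap_length`, `burst_relatedness`, and `relatedness_upper_bound` are hypothetical and introduced only for illustration.

```python
from typing import List, Tuple

# Assumed representation: a burst is a half-open interval [start, end)
# on a discrete time axis; each stream is a sorted list of disjoint bursts.
Interval = Tuple[int, int]


def total_length(bursts: List[Interval]) -> int:
    """Total time a stream spends in a bursty state."""
    return sum(end - start for start, end in bursts)


def overlap_length(a: List[Interval], b: List[Interval]) -> int:
    """Total time during which both streams burst (two-pointer sweep)."""
    i = j = 0
    overlap = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            overlap += end - start
        # Advance the interval that finishes first.
        if a[i][1] <= b[j][1]:
            i += 1
        else:
            j += 1
    return overlap


def burst_relatedness(a: List[Interval], b: List[Interval]) -> float:
    """Jaccard-style score: shared burst time over combined burst time."""
    inter = overlap_length(a, b)
    union = total_length(a) + total_length(b) - inter
    return inter / union if union else 0.0


def relatedness_upper_bound(a: List[Interval], b: List[Interval]) -> float:
    """Cheap bound using only total burst lengths (no interval comparison).

    The overlap cannot exceed the shorter total and the union cannot be
    smaller than the longer total, so the score is at most min / max.
    """
    la, lb = total_length(a), total_length(b)
    return min(la, lb) / max(la, lb) if max(la, lb) else 0.0


stream_a = [(0, 4), (10, 12)]
stream_b = [(2, 6), (11, 15)]
score = burst_relatedness(stream_a, stream_b)
bound = relatedness_upper_bound(stream_a, stream_b)
```

The bound illustrates, for this assumed Jaccard-style score only, how a measure can support pruning: a pair whose bound already falls below a chosen threshold can be discarded without comparing intervals. Whether the paper's indexing structures rely on a bound of this form is not stated in the abstract.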
© 2015 Springer International Publishing Switzerland
Cite this paper
Kusmierczyk, T., Nørvåg, K. (2015). Mining Correlations on Massive Bursty Time Series Collections. In: Renz, M., Shahabi, C., Zhou, X., Cheema, M. (eds) Database Systems for Advanced Applications. DASFAA 2015. Lecture Notes in Computer Science, vol 9049. Springer, Cham. https://doi.org/10.1007/978-3-319-18120-2_4
DOI: https://doi.org/10.1007/978-3-319-18120-2_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18119-6
Online ISBN: 978-3-319-18120-2
eBook Packages: Computer Science, Computer Science (R0)