Abstract
Uncertainty is inherent in data streams and presents new challenges for data stream mining. Because data streams arrive continuously and in large volumes, representations of uncertain time series data streams require significantly more space. It is therefore important to construct a compressed representation for storing uncertain time series data. A granular sketch is designed to provide hash-compressed storage for granules. Because the granular sketch may become saturated as the data stream grows, this paper presents an optimization strategy that deletes the absolutely sparse patterns. Based on the granular sketch, a sequential pattern mining algorithm is proposed for mining uncertain data streams. The experimental results illustrate the effectiveness of the pattern mining algorithm.
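The abstract does not specify how the granular sketch is laid out, so the following is only a minimal illustration of the general idea of hash-compressed storage with pruning, not the authors' method. It assumes a Count-Min-style table in which each arriving item carries an existence probability and each cell accumulates expected support (the sum of those probabilities). The class name `ExpectedSupportSketch`, the parameters `width`, `depth`, and `min_support`, and the pruning rule are all hypothetical choices made for this example.

```python
import hashlib


class ExpectedSupportSketch:
    """Count-Min-style table that accumulates expected support.

    Illustrative assumption only: the paper's granular sketch may differ.
    """

    def __init__(self, width=64, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0.0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One deterministic hash per row, derived by salting with the row number.
        digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def update(self, key, probability):
        # Add the item's existence probability to one cell in every row.
        for row in range(self.depth):
            self.table[row][self._index(row, key)] += probability

    def estimate(self, key):
        # Collisions only add mass, so the row minimum is the tightest estimate.
        return min(
            self.table[row][self._index(row, key)] for row in range(self.depth)
        )

    def saturation(self):
        # Fraction of cells that currently hold non-zero expected support.
        used = sum(1 for row in self.table for cell in row if cell > 0.0)
        return used / (self.width * self.depth)

    def prune(self, min_support):
        # Hypothetical pruning rule: reset cells whose support is below the threshold.
        for row in self.table:
            for i, cell in enumerate(row):
                if cell < min_support:
                    row[i] = 0.0


sketch = ExpectedSupportSketch()
sketch.update("a", 0.9)
sketch.update("a", 0.8)
sketch.update("b", 0.1)
sketch.prune(min_support=0.5)
```

Before pruning, an estimate in this illustration can overstate but never understate the true expected support, because probabilities are non-negative. After pruning, sparse items may be reported as having zero support, which is the trade-off made to free space in a saturated table.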
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chen, J., Chen, P., Sheng, X. (2013). Granular Sketch Based Uncertain Data Streams Pattern Mining. In: Yang, Y., Ma, M., Liu, B. (eds) Information Computing and Applications. ICICA 2013. Communications in Computer and Information Science, vol 391. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-53932-9_48
DOI: https://doi.org/10.1007/978-3-642-53932-9_48
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-53931-2
Online ISBN: 978-3-642-53932-9
eBook Packages: Computer Science (R0)