Abstract
In recent years, large amounts of uncertain data have emerged with the widespread adoption of new technologies such as wireless sensor networks, RFID and privacy protection. Uncertain data streams are typically incomplete, noisy, non-uniform and mutable. To address these features, this paper presents a probability frequent pattern tree, called PFP-tree, and a method, called PFP-growth, for mining probability frequent patterns based on probability damped windows. The main characteristics of the proposed method are: (1) adopting a time-based probability damped window model to enhance the accuracy of the mined frequent patterns; (2) maintaining an item index table and a transaction index table to speed up retrieval on the PFP-tree; and (3) pruning the tree to remove items that cannot become frequent patterns. The experimental results demonstrate that the PFP-growth method outperforms the main existing schemes in terms of accuracy, processing time and storage space.
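As a rough illustration of the damped-window idea, the sketch below computes a time-decayed expected support for an itemset over time-stamped uncertain transactions. It is not the PFP-growth algorithm and does not build a PFP-tree. The function name `damped_expected_support`, the `decay` parameter, the exponential weighting, and the assumption that item probabilities within a transaction are independent are all illustrative choices that the abstract does not specify.

```python
from math import prod


def damped_expected_support(stream, itemset, now, decay=0.9):
    """Time-decayed expected support of `itemset` (illustrative sketch).

    stream: iterable of (timestamp, {item: existence probability}) pairs.
    Assumption: items in a transaction are independent, so a transaction
    contains the whole itemset with probability equal to the product of
    the item probabilities. Assumption: older transactions are weighted
    by decay ** (now - timestamp).
    """
    total = 0.0
    for timestamp, probs in stream:
        if timestamp > now:
            continue  # skip transactions that lie after the query time
        # Probability that every item of the itemset occurs in this transaction;
        # an item absent from the transaction has probability 0.
        p = prod(probs.get(item, 0.0) for item in itemset)
        total += (decay ** (now - timestamp)) * p
    return total


# Hypothetical toy stream: three time-stamped uncertain transactions.
stream = [
    (1, {"a": 0.9, "b": 0.5}),
    (2, {"a": 0.8, "c": 0.7}),
    (3, {"a": 1.0, "b": 0.6}),
]
support_ab = damped_expected_support(stream, {"a", "b"}, now=3, decay=0.5)
```

With `decay=0.5`, the transaction at time 1 contributes 0.45 × 0.25 and the one at time 3 contributes 0.6 × 1, so recent evidence dominates the result. An itemset would then be reported as frequent when this value reaches a chosen minimum support threshold.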
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liao, G., Wu, L., Wan, C., Xiong, N. (2011). A Practice Probability Frequent Pattern Mining Method over Transactional Uncertain Data Streams. In: Hsu, CH., Yang, L.T., Ma, J., Zhu, C. (eds) Ubiquitous Intelligence and Computing. UIC 2011. Lecture Notes in Computer Science, vol 6905. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23641-9_45
DOI: https://doi.org/10.1007/978-3-642-23641-9_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23640-2
Online ISBN: 978-3-642-23641-9
eBook Packages: Computer Science (R0)