Finding Top-k Fuzzy Frequent Itemsets from Databases

Li, Haifeng; Wang, Yue; Zhang, Ning; Zhang, Yuejin

doi:10.1007/978-3-319-61845-6_3

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10387)

Included in the following conference series:

International Conference on Data Mining and Big Data

Abstract

Frequent itemset mining is an important task in data mining. Fuzzy data mining can describe the results of frequent itemset mining more accurately. Nevertheless, the full set of frequent itemsets is redundant for users; a better approach is to present only the top-k results. In this paper, we define the score of a fuzzy frequent itemset and pose the problem of top-k fuzzy frequent itemset mining, which, to the best of our knowledge, has not been studied before. To address this problem, we employ a data structure named TopKFFITree to store a superset of the mining results, which is significantly smaller than the set of all fuzzy frequent itemsets. We then present an algorithm named TopK-FFI to build and maintain the data structure. The algorithm prunes most fuzzy frequent itemsets immediately, based on the monotonicity of the itemset score. Theoretical analysis and experimental studies over 4 datasets demonstrate that the proposed algorithm efficiently reduces runtime and memory cost and significantly outperforms the naive algorithm Top-k-FFI-Miner.
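
The pruning idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the score definition, so this sketch assumes the score is the common fuzzy support (the sum, over transactions, of the minimum membership degree of the itemset's items), which cannot increase when an itemset grows. It uses a plain min-heap and depth-first enumeration, not the TopKFFITree or the TopK-FFI algorithm itself; the names `fuzzy_support` and `top_k_fuzzy_itemsets` are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

# Assumption: a fuzzy transaction maps each fuzzy item to a membership degree in [0, 1].
Transaction = Dict[str, float]
Itemset = Tuple[str, ...]


def fuzzy_support(itemset: Itemset, db: List[Transaction]) -> float:
    """Assumed score: sum over transactions of the minimum membership degree."""
    total = 0.0
    for t in db:
        if all(item in t for item in itemset):
            total += min(t[item] for item in itemset)
    return total


def top_k_fuzzy_itemsets(db: List[Transaction], k: int) -> List[Tuple[float, Itemset]]:
    """Return the k highest-scoring itemsets as (score, itemset), best first."""
    if k <= 0:
        return []
    items = sorted({item for t in db for item in t})
    heap: List[Tuple[float, Itemset]] = []  # min-heap holding the current top k

    def threshold() -> float:
        # Lowest score that can still enter the result once k itemsets are held.
        return heap[0][0] if len(heap) == k else 0.0

    def offer(score: float, itemset: Itemset) -> None:
        entry = (score, itemset)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif entry > heap[0]:
            heapq.heapreplace(heap, entry)

    def expand(prefix: Itemset, start: int) -> None:
        for idx in range(start, len(items)):
            candidate = prefix + (items[idx],)
            score = fuzzy_support(candidate, db)
            # The score never grows when items are added, so a candidate below
            # the threshold rules out all of its supersets as well.
            if score <= 0.0 or score < threshold():
                continue
            offer(score, candidate)
            expand(candidate, idx + 1)

    expand((), 0)
    return sorted(heap, key=lambda e: (-e[0], e[1]))
```

Because the threshold only rises as better itemsets are found, a branch discarded early would also be discarded later, so the pruning does not change the result under the assumed score.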

This research is supported by the National Natural Science Foundation of China (61100112, 61309030), the Beijing Higher Education Young Elite Teacher Project (YETP0987), the State Key Program of the National Social Science Foundation of China (13AXW010), and the Discipline Construction Foundation of Central University of Finance and Economics (2016XX05).

Author information

Authors and Affiliations

School of Information, Central University of Finance and Economics, Beijing, China
Haifeng Li, Yue Wang, Ning Zhang & Yuejin Zhang

Corresponding author

Correspondence to Haifeng Li.

Editor information

Editors and Affiliations

Peking University, Beijing, China
Ying Tan
Kyushu University, Fukuoka, Japan
Hideyuki Takagi
Southern University of Science and Technology, Shenzhen, China
Yuhui Shi

About this paper

Cite this paper

Li, H., Wang, Y., Zhang, N., Zhang, Y. (2017). Finding Top-k Fuzzy Frequent Itemsets from Databases. In: Tan, Y., Takagi, H., Shi, Y. (eds) Data Mining and Big Data. DMBD 2017. Lecture Notes in Computer Science, vol 10387. Springer, Cham. https://doi.org/10.1007/978-3-319-61845-6_3

DOI: https://doi.org/10.1007/978-3-319-61845-6_3
Published: 24 June 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-61844-9
Online ISBN: 978-3-319-61845-6
eBook Packages: Computer Science (R0)
