Finding Interesting Rare Association Rules Using Rare Pattern Tree

Tsang, Sidney; Koh, Yun Sing; Dobbie, Gillian

doi:10.1007/978-3-642-37574-3_7

Part of the book series: Lecture Notes in Computer Science (TLDKS, volume 7790)

Abstract

Most association rule mining techniques concentrate on finding frequent rules. However, rare association rules are in some cases more interesting than frequent ones, since they represent unexpected or unknown associations. All current algorithms for rare association rule mining use an Apriori level-wise approach, which has computationally expensive candidate generation and pruning steps. We propose RP-Tree, a method for mining a subset of rare association rules using a tree structure, together with an information gain component that helps to identify the more interesting association rules. Empirical evaluation on a range of real-world datasets shows that RP-Tree itemset and rule generation is more time efficient than modified versions of FP-Growth and ARIMA, and that it discovers 92–100% of all the interesting rare association rules. Additional evaluation on synthetic datasets also shows that RP-Tree is more efficient, in addition to showing how its execution time changes with transaction length and rare-item itemset size.
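
To make the notion of a rare-item itemset concrete, the following is a minimal Python sketch. It is not the RP-Tree algorithm: it enumerates itemsets by brute force instead of building a tree, and it omits the information gain component. The function name `rare_item_itemsets`, the two support-count thresholds `min_rare_sup` and `min_freq_sup`, the `max_len` cap, and the toy transactions are all assumptions made for illustration; none of them is stated in the abstract.

```python
from collections import Counter
from itertools import combinations


def rare_item_itemsets(transactions, min_rare_sup, min_freq_sup, max_len=3):
    """Brute-force illustration (hypothetical helper, not RP-Tree itself).

    An item is treated as rare when its support count lies in
    [min_rare_sup, min_freq_sup). The function returns every itemset of
    up to max_len items that contains at least one rare item and occurs
    in at least min_rare_sup transactions.
    """
    # Count each item once per transaction.
    item_counts = Counter(item for t in transactions for item in set(t))
    rare_items = {
        item
        for item, count in item_counts.items()
        if min_rare_sup <= count < min_freq_sup
    }

    # Transactions without a rare item cannot support a rare-item itemset.
    kept = [set(t) for t in transactions if rare_items & set(t)]

    counts = Counter()
    for t in kept:
        items = sorted(t)
        for k in range(1, max_len + 1):
            for combo in combinations(items, k):
                if rare_items.intersection(combo):
                    counts[combo] += 1

    return {itemset: c for itemset, c in counts.items() if c >= min_rare_sup}


# Toy data, invented for this example.
transactions = [
    ["bread", "milk"],
    ["bread", "milk", "caviar"],
    ["bread", "caviar", "vodka"],
    ["milk", "bread"],
    ["bread", "milk", "vodka"],
]
result = rare_item_itemsets(transactions, min_rare_sup=2, min_freq_sup=4)
for itemset, count in sorted(result.items()):
    print(itemset, count)
```

With these assumed thresholds, "caviar" and "vodka" are the rare items, and the frequent item "bread" still appears in the output because it co-occurs with each of them twice. The brute-force enumeration grows exponentially with transaction length, which is the kind of cost a tree-based method is meant to avoid.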

Author information

Authors and Affiliations

The University of Auckland, New Zealand
Sidney Tsang, Yun Sing Koh & Gillian Dobbie

Editor information

Editors and Affiliations

IRIT, Paul Sabatier University, 118 route de Narbonne, 31062, Toulouse Cedex, France
Abdelkader Hameurlain
Institute for Application Oriented Knowledge Processing, 4020, Linz, Austria
Josef Küng
FAW, University of Linz, Altenbergerstraße 69, 4040, Linz, Austria
Roland Wagner
ICAR-CNR, University of Calabria, via P. Bucci 41C, 87036, Rende (CS), Italy
Alfredo Cuzzocrea
Hewlett-Packard Laboratories, 1501 Page Mill Road, 94304, Palo Alto, CA, USA
Umeshwar Dayal

About this chapter

Cite this chapter

Tsang, S., Koh, Y.S., Dobbie, G. (2013). Finding Interesting Rare Association Rules Using Rare Pattern Tree. In: Hameurlain, A., Küng, J., Wagner, R., Cuzzocrea, A., Dayal, U. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems VIII. Lecture Notes in Computer Science, vol 7790. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37574-3_7

DOI: https://doi.org/10.1007/978-3-642-37574-3_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37573-6
Online ISBN: 978-3-642-37574-3
eBook Packages: Computer Science, Computer Science (R0)
