Mining Frequent Closed Set Distinguishing One Dataset from Another from a Viewpoint of Structural Index

Okubo, Yoshiaki; Haraguchi, Makoto

doi:10.1007/978-3-319-62416-7_30


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10358)

Included in the following conference series:

International Conference on Machine Learning and Data Mining in Pattern Recognition


Abstract

The variety of a concept’s specializations can serve as an index of how significant the concept is. From this viewpoint, given two incidence relations as datasets, we consider formal concepts that have many frequent subconcepts in one dataset but few frequent subconcepts in the other. Instead of directly calculating the number of frequent subconcepts, we introduce a structural index that approximates the depth complexity of the join semilattice of frequent concepts, and consider an anti-monotonic constraint for one dataset and a monotonic constraint for the other. Based on these two constraints, we develop a procedure to search for “emerging concepts” with respect to the structural index. Although computing the structural index is generally a hard task, the index we choose is known to be efficient to compute for large sparse data. The experimental results show the effectiveness of the proposed method, including some interesting output concepts that contrast the two datasets.
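To make the quantity behind the abstract concrete, the following minimal Python sketch counts, by brute force, the frequent subconcepts of a concept in each of two small datasets. It is an illustration only and not the procedure of the paper, which deliberately avoids this direct count by using a structural index. The toy datasets `d1` and `d2`, the function names, and the support threshold are all assumptions introduced here.

```python
from itertools import combinations


def extent(data, items):
    """Objects whose attribute sets contain every attribute in `items`."""
    return frozenset(o for o, attrs in data.items() if items <= attrs)


def frequent_subconcepts(data, items, min_support):
    """Intents of frequent subconcepts of the concept generated by `items`.

    Naive enumeration (exponential in the number of attributes): every
    superset of `items` is tried, and the closed attribute set (intent)
    of each candidate with at least `min_support` objects is collected.
    The concept generated by `items` itself is included when frequent.
    """
    if min_support < 1:
        raise ValueError("min_support must be at least 1")
    base = frozenset(items)
    all_attrs = frozenset().union(*data.values())
    rest = sorted(all_attrs - base)
    found = set()
    for r in range(len(rest) + 1):
        for extra in combinations(rest, r):
            objs = extent(data, base | frozenset(extra))
            if len(objs) >= min_support:
                # Intent = attributes shared by all objects in the extent.
                found.add(frozenset.intersection(
                    *(frozenset(data[o]) for o in objs)))
    return found


# Hypothetical toy datasets: object -> set of attributes.
d1 = {"o1": {"a", "b", "c"}, "o2": {"a", "b", "d"},
      "o3": {"a", "c", "d"}, "o4": {"a", "b"}}
d2 = {"p1": {"a"}, "p2": {"a"}, "p3": {"a", "b"}, "p4": {"e"}}

n1 = len(frequent_subconcepts(d1, {"a"}, min_support=2))
n2 = len(frequent_subconcepts(d2, {"a"}, min_support=2))
print(n1, n2)
```

In this toy setting the concept generated by `a` has more frequent specializations in `d1` than in `d2`, which is the kind of contrast the abstract describes. The exhaustive enumeration is only feasible for a handful of attributes.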


Notes

1. We have collected them from a CD-ROM edition of the newspapers.
2. “Hokkaido” is the northernmost prefecture in Japan.
3. The lattices have been drawn by Graphviz (http://www.graphviz.org) via FcaStone (http://fcastone.sourceforge.net).


Author information

Authors and Affiliations

Graduate School of Information Science and Technology, Hokkaido University, N-14 W-9, Sapporo, 060-0814, Japan
Yoshiaki Okubo & Makoto Haraguchi


Corresponding author

Correspondence to Makoto Haraguchi.

Editor information

Editors and Affiliations

Institute of Computer Vision and Applied Computer Sciences, Leipzig, Sachsen, Germany
Petra Perner


About this paper

Cite this paper

Okubo, Y., Haraguchi, M. (2017). Mining Frequent Closed Set Distinguishing One Dataset from Another from a Viewpoint of Structural Index. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2017. Lecture Notes in Computer Science, vol 10358. Springer, Cham. https://doi.org/10.1007/978-3-319-62416-7_30


DOI: https://doi.org/10.1007/978-3-319-62416-7_30
Published: 02 July 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-62415-0
Online ISBN: 978-3-319-62416-7
eBook Packages: Computer Science, Computer Science (R0)
