Mining Frequent Closed Set Distinguishing One Dataset from Another from a Viewpoint of Structural Index

  • Conference paper
Machine Learning and Data Mining in Pattern Recognition (MLDM 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10358)

Abstract

The variety of a concept's specializations can serve as an index of how significant the concept is. From this viewpoint, given two incidence relations as datasets, we consider formal concepts that have many frequent subconcepts in one dataset but few frequent subconcepts in the other. Instead of calculating the number of frequent subconcepts directly, we introduce a structural index that approximates the depth complexity of the join-semilattice of frequent concepts, and we impose an anti-monotonic constraint on one dataset and a monotonic constraint on the other. Based on these two constraints, we develop a procedure to search for "emerging concepts" with respect to the structural index. Although computing a structural index is a hard task in general, the index we choose is known to be efficiently computable for large sparse data. The experimental results show the effectiveness of the proposed method and include some interesting output concepts that contrast the two datasets.
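
To make the notion of "frequent subconcepts" concrete, the following is a minimal brute-force sketch in Python. It is not the procedure proposed in the paper: it performs the direct count that the abstract says the structural index is meant to avoid, and it is exponential in the number of attributes. The function names, the toy datasets, and the support threshold are all hypothetical choices made for illustration.

```python
from itertools import combinations


def closure(items, context):
    """Return the closed itemset (concept intent) generated by `items`.

    `context` maps each object to its set of attributes, i.e. it is a
    small incidence relation. (Hypothetical representation.)
    """
    extent = [obj for obj, attrs in context.items() if items <= attrs]
    if not extent:
        # No object has all the items: the intent is the full attribute set.
        return frozenset().union(*context.values())
    return frozenset.intersection(*(frozenset(context[o]) for o in extent))


def support(items, context):
    """Number of objects whose attribute set contains `items`."""
    return sum(1 for attrs in context.values() if items <= attrs)


def frequent_concepts(context, minsup):
    """Enumerate the intents of all concepts with at least `minsup` objects."""
    attributes = sorted(set().union(*context.values()))
    intents = set()
    for r in range(len(attributes) + 1):
        for combo in combinations(attributes, r):
            items = frozenset(combo)
            if support(items, context) >= minsup:
                intents.add(closure(items, context))
    return intents


def count_frequent_subconcepts(intent, context, minsup):
    """Count frequent concepts strictly more specific than `intent`.

    A subconcept has a smaller extent and therefore a strictly larger intent.
    """
    return sum(1 for other in frequent_concepts(context, minsup) if intent < other)


# Two toy datasets over a shared attribute vocabulary (illustrative only).
dataset_a = {
    "d1": {"a", "b", "c"},
    "d2": {"a", "b", "d"},
    "d3": {"a", "c", "d"},
    "d4": {"a", "b"},
}
dataset_b = {
    "e1": {"a"},
    "e2": {"a"},
    "e3": {"b"},
    "e4": {"a", "b"},
}

target = frozenset({"a"})
count_a = count_frequent_subconcepts(target, dataset_a, minsup=2)  # 3
count_b = count_frequent_subconcepts(target, dataset_b, minsup=2)  # 0
print(count_a, count_b)
```

In this toy setting the concept with intent {a} has three frequent specializations in the first dataset and none in the second, which is the kind of contrast the abstract describes. The paper replaces this direct count with a structural index, whose definition is not given in the abstract and is therefore not sketched here.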

Notes

  1. We have collected them from a CD-ROM edition of the newspapers.

  2. “Hokkaido” is the northernmost prefecture in Japan.

  3. The lattices have been drawn by Graphviz (http://www.graphviz.org) via FcaStone (http://fcastone.sourceforge.net).

Corresponding author

Correspondence to Makoto Haraguchi.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Okubo, Y., Haraguchi, M. (2017). Mining Frequent Closed Set Distinguishing One Dataset from Another from a Viewpoint of Structural Index. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2017. Lecture Notes in Computer Science, vol 10358. Springer, Cham. https://doi.org/10.1007/978-3-319-62416-7_30

  • DOI: https://doi.org/10.1007/978-3-319-62416-7_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-62415-0

  • Online ISBN: 978-3-319-62416-7

  • eBook Packages: Computer Science, Computer Science (R0)
