Parallel Computation of Closed Itemsets and Implication Rule Bases

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4742)

Abstract

Formal concept analysis has been successfully applied as a data mining framework in which target patterns take the form of intent families and implication bases. Since their extraction is a challenging task, especially for large datasets, parallel techniques should help reduce the computational effort and increase the scalability of the approach. In this paper we describe a way to parallelize a recent divide-and-conquer method that computes both the intents and the Duquenne-Guigues implication basis of a dataset. While intents admit a straightforward computation, adding the basis, whose definition is recursive, poses harder problems, in particular for parallel design. A first, and by no means final, solution relies on a partition of the basis that allows the crucial and inherently sequential step of redundancy removal to be split nevertheless into parallel subtasks. A prototype implementation of our method, called ParCIM, shows a nearly linear speedup with respect to its sequential counterpart.
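
To make the two target structures concrete, the following is a minimal sequential sketch, not the ParCIM method or the divide-and-conquer algorithm the abstract refers to. It enumerates all attribute subsets by increasing size, keeps the closed ones as intents, and keeps a non-closed subset as a pseudo-intent (premise of a Duquenne-Guigues implication) only if it already contains the closure of every smaller pseudo-intent it includes, which is where the recursive definition shows up. The toy context and the names `CONTEXT`, `closure` and `intents_and_basis` are assumptions made for illustration only.

```python
from itertools import combinations

# Toy formal context (hypothetical data): object -> set of attributes.
CONTEXT = {
    "o1": frozenset("ab"),
    "o2": frozenset("abc"),
    "o3": frozenset("c"),
}
ATTRIBUTES = frozenset().union(*CONTEXT.values())


def closure(itemset, context=CONTEXT, attributes=ATTRIBUTES):
    """Return the attributes shared by every object that contains itemset."""
    rows = [row for row in context.values() if itemset <= row]
    if not rows:
        # No object supports the itemset: its closure is the full attribute set.
        return attributes
    return frozenset.intersection(*rows)


def intents_and_basis(context=CONTEXT, attributes=ATTRIBUTES):
    """Brute-force intents and Duquenne-Guigues basis (exponential, toy sizes only)."""
    intents, basis = [], []  # basis holds (pseudo_intent, closure) pairs
    items = sorted(attributes)
    # Increasing size guarantees that all smaller pseudo-intents are known
    # before a candidate is tested against the recursive condition.
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            closed = closure(candidate, context, attributes)
            if closed == candidate:
                intents.append(candidate)
            elif all(concl <= candidate
                     for prem, concl in basis if prem < candidate):
                basis.append((candidate, closed))
    return intents, basis


if __name__ == "__main__":
    found_intents, found_basis = intents_and_basis()
    print("intents:", [sorted(i) for i in found_intents])
    for premise, conclusion in found_basis:
        print(sorted(premise), "->", sorted(conclusion))
```

On this toy context, `{a, c}` is not closed but is rejected as a pseudo-intent because it contains the pseudo-intent `{a}` without containing its closure `{a, b}`. Deciding whether a candidate belongs to the basis therefore depends on previously accepted basis members, which illustrates why the abstract describes this step as inherently sequential.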

Editor information

Ivan Stojmenovic, Ruppa K. Thulasiram, Laurence T. Yang, Weijia Jia, Minyi Guo, Rodrigo Fernandes de Mello

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Djoufak Kengue, J.F., Valtchev, P., Tayou Djamegni, C. (2007). Parallel Computation of Closed Itemsets and Implication Rule Bases. In: Stojmenovic, I., Thulasiram, R.K., Yang, L.T., Jia, W., Guo, M., de Mello, R.F. (eds) Parallel and Distributed Processing and Applications. ISPA 2007. Lecture Notes in Computer Science, vol 4742. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74742-0_34

  • DOI: https://doi.org/10.1007/978-3-540-74742-0_34

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74741-3

  • Online ISBN: 978-3-540-74742-0

  • eBook Packages: Computer Science, Computer Science (R0)
