Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 247)

Abstract

Publishing data about individuals without revealing sensitive information about them is an important problem. Distributed data mining applications use sensitive data from distributed databases held by different parties, which conflicts directly with an individual's need for and right to privacy. It is therefore important to develop adequate security techniques that protect the privacy of the individual values used for data mining. Here, we study how to maintain privacy in distributed data mining; that is, how two (or more) parties can find frequent itemsets in a distributed database without revealing each party's portion of the data to the other. In this paper, we consider a privacy-preserving naïve-Bayes classifier for horizontally partitioned distributed data and propose a data mining privacy by decomposition (DMPD) method that uses a genetic algorithm to search for the optimal feature set partitioning, subject to classification accuracy and k-anonymity constraints.
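
As a rough illustration of the k-anonymity constraint mentioned in the abstract, the following Python sketch checks whether the projection of a table onto each feature subset of a candidate partition is k-anonymous, meaning that every combination of values in that projection is shared by at least k records. This is not the paper's DMPD implementation; the function names, the tuple-based row layout, and the toy records are assumptions made for illustration only. A genetic algorithm of the kind described could use such a check to reject or penalise candidate partitions.

```python
from collections import Counter
from typing import Hashable, Sequence


def is_k_anonymous(rows: Sequence[Sequence[Hashable]],
                   columns: Sequence[int], k: int) -> bool:
    """Return True if every value combination over `columns` occurs in >= k rows.

    Hypothetical helper: `rows` is a list of tuples and `columns` holds the
    indices of the features that form one projection of the table.
    """
    if not rows:
        return True  # an empty table reveals nothing
    counts = Counter(tuple(row[c] for c in columns) for row in rows)
    return min(counts.values()) >= k


def partition_is_k_anonymous(rows: Sequence[Sequence[Hashable]],
                             partition: Sequence[Sequence[int]],
                             k: int) -> bool:
    """Return True if each feature subset in `partition` is k-anonymous on its own."""
    return all(is_k_anonymous(rows, subset, k) for subset in partition)


# Toy records (age band, sex, diagnosis); invented for illustration.
rows = [
    ("30-39", "F", "flu"),
    ("30-39", "F", "cold"),
    ("40-49", "M", "flu"),
    ("40-49", "M", "flu"),
]

print(is_k_anonymous(rows, [0, 1], 2))
print(is_k_anonymous(rows, [0, 1, 2], 2))
print(partition_is_k_anonymous(rows, [[0], [1]], 2))
```

In this toy table the projection onto age band and sex is 2-anonymous, while the full three-column table is not, because the record ("30-39", "F", "cold") is unique. Splitting the features into smaller subsets can therefore satisfy a constraint that the undivided table violates.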

Author information

Corresponding author

Correspondence to Lambodar Jena.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Jena, L., Kamila, N.K., Mishra, S. (2014). Privacy Preserving Distributed Data Mining with Evolutionary Computing. In: Satapathy, S., Udgata, S., Biswal, B. (eds) Proceedings of the International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA) 2013. Advances in Intelligent Systems and Computing, vol 247. Springer, Cham. https://doi.org/10.1007/978-3-319-02931-3_29

  • DOI: https://doi.org/10.1007/978-3-319-02931-3_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-02930-6

  • Online ISBN: 978-3-319-02931-3

  • eBook Packages: Engineering, Engineering (R0)
