Privacy Preserving Distributed Data Mining with Evolutionary Computing

Jena, Lambodar; Kamila, Narendra Ku.; Mishra, Sushruta

doi:10.1007/978-3-319-02931-3_29


Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 247)


Abstract

Publishing data about individuals without revealing sensitive information about them is an important problem. Distributed data mining applications use sensitive data from distributed databases held by different parties, which conflicts directly with an individual’s need for and right to privacy. It is therefore important to develop adequate security techniques that protect the privacy of the individual values used for data mining. Here, we study how to maintain privacy in distributed data mining; that is, how two or more parties can find frequent itemsets in a distributed database without revealing each party’s portion of the data to the others. In this paper, we consider a privacy-preserving naïve Bayes classifier for horizontally partitioned distributed data and propose a data mining privacy by decomposition (DMPD) method that uses a genetic algorithm to search for the optimal feature set partitioning, guided by classification accuracy and k-anonymity constraints.
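To make the k-anonymity constraint on a feature set partitioning concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that a candidate partitioning is acceptable when the projection of the data onto every feature subset is k-anonymous, meaning each combination of values in that projection occurs in at least k records. The function names, the toy records, and the attribute names are hypothetical and chosen only for illustration; the genetic search and the accuracy evaluation are not shown.

```python
from collections import Counter

def is_k_anonymous(rows, features, k):
    """Return True if every value combination over `features` occurs in >= k rows."""
    counts = Counter(tuple(row[f] for f in features) for row in rows)
    return all(count >= k for count in counts.values())

def partition_satisfies_k(rows, partition, k):
    """Return True if the projection onto every feature subset is k-anonymous."""
    return all(is_k_anonymous(rows, subset, k) for subset in partition)

# Hypothetical toy records (illustrative values only).
ROWS = [
    {"age": "30-39", "zip": "751*", "sex": "F"},
    {"age": "30-39", "zip": "751*", "sex": "M"},
    {"age": "30-39", "zip": "752*", "sex": "F"},
    {"age": "30-39", "zip": "752*", "sex": "M"},
]

# All three attributes together: every combination is unique, so k=2 fails.
full_set_ok = partition_satisfies_k(ROWS, [["age", "zip", "sex"]], k=2)

# Split into two subsets: each projection has groups of size 2, so k=2 holds.
split_ok = partition_satisfies_k(ROWS, [["age", "zip"], ["sex"]], k=2)
```

In this toy case the undivided feature set violates 2-anonymity while the two-subset partitioning satisfies it, which is the kind of feasibility check a search over partitionings could apply to each candidate before scoring its classification accuracy.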



Author information

Authors and Affiliations

Department of Computer Science & Engineering, Gandhi Engineering College, Bhubaneswar, India
Lambodar Jena & Sushruta Mishra
Department of Computer Science & Engineering, Utkal University, Bhubaneswar, India
Lambodar Jena
Department of Computer Science & Engineering, C.V.Raman College of Engineering, Bhubaneswar, India
Narendra Ku. Kamila


Corresponding author

Correspondence to Lambodar Jena.

Editor information

Editors and Affiliations

Dept. of Computer Science Engineering, Anil Neerukonda Institute of Technology and Sciences, Vishakapatnam, Andhra Pradesh, India
Suresh Chandra Satapathy
University of Hyderabad, Hyderabad, Andhra Pradesh, India
Siba K Udgata
Bhubaneswar Engineering College, Bhubaneswar, India
Bhabendra Narayan Biswal


About this paper

Cite this paper

Jena, L., Kamila, N.K., Mishra, S. (2014). Privacy Preserving Distributed Data Mining with Evolutionary Computing. In: Satapathy, S., Udgata, S., Biswal, B. (eds) Proceedings of the International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA) 2013. Advances in Intelligent Systems and Computing, vol 247. Springer, Cham. https://doi.org/10.1007/978-3-319-02931-3_29


DOI: https://doi.org/10.1007/978-3-319-02931-3_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-02930-6
Online ISBN: 978-3-319-02931-3
eBook Packages: Engineering, Engineering (R0)
