In cloud computing environments, data owners worry that private information in their data may be disclosed without authorization. When privacy-preserving methods are applied, data owners also want to retain the knowledge contained in the data. One approach to this problem is the distributed database, in which different parties hold horizontal or vertical partitions of the data. Cluster analysis is a frequently used data mining task that partitions a usually multivariate data set into groups such that the data objects in one group are more similar to each other than to objects in other groups. With the encryption-based kernel k-means algorithm, however, large data sets cannot be encrypted in a distributed environment. To extend the privacy concept, a novel method based on privacy-preserving distributed data mining is proposed. A sanitization approach is developed to improve the privacy of user data. In the sanitization process, a privacy-based objective function is defined and an optimal key is generated according to it. The artificial bee colony algorithm is used for optimal key generation, so that large amounts of data can be encrypted. Once sanitization is complete, the helper user of each cluster sends the sanitized information to the service provider. Finally, experiments are carried out on an existing database to demonstrate the efficiency of the proposed algorithm. The implementation is done in Java using a cloud simulator. Extensive performance evaluation and security analysis demonstrate the validity and efficiency of the proposed technique.
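The key-generation step described above can be illustrated with a minimal sketch of a basic artificial bee colony search. The abstract does not give the privacy-based objective function or the sanitization rule, so both are placeholders here: `placeholderObjective` (a penalty on key components whose magnitude differs from a target distortion) and `sanitize` (an additive per-column shift) are hypothetical stand-ins, as are the class name and all parameter values. It is not the authors' implementation.

```java
import java.util.Random;
import java.util.function.ToDoubleFunction;

class AbcKeySketch {

    // Hypothetical stand-in for the privacy objective (NOT the paper's function):
    // penalizes key components whose magnitude differs from a target distortion.
    static double placeholderObjective(double[] key, double target) {
        double sum = 0;
        for (double k : key) sum += Math.pow(Math.abs(k) - target, 2);
        return sum;
    }

    // Hypothetical sanitization rule: shifts every column c of the data by key[c].
    static double[][] sanitize(double[][] data, double[] key) {
        double[][] out = new double[data.length][];
        for (int r = 0; r < data.length; r++) {
            out[r] = new double[data[r].length];
            for (int c = 0; c < data[r].length; c++) out[r][c] = data[r][c] + key[c];
        }
        return out;
    }

    // Basic ABC minimization of a non-negative objective over [lo, hi]^dim (needs sources >= 2).
    static double[] optimize(ToDoubleFunction<double[]> objective, int dim, double lo, double hi,
                             int sources, int limit, int cycles, long seed) {
        Random rng = new Random(seed);
        double[][] food = new double[sources][dim];
        double[] cost = new double[sources];
        int[] trials = new int[sources];
        for (int i = 0; i < sources; i++) {
            for (int d = 0; d < dim; d++) food[i][d] = lo + rng.nextDouble() * (hi - lo);
            cost[i] = objective.applyAsDouble(food[i]);
        }
        double[] best = food[0].clone();
        double bestCost = cost[0];
        for (int cycle = 0; cycle < cycles; cycle++) {
            // Employed bees: one neighbourhood move per food source.
            for (int i = 0; i < sources; i++) explore(i, food, cost, trials, objective, lo, hi, rng);
            // Onlooker bees: roulette-wheel selection favouring low-cost sources.
            double[] fitness = new double[sources];
            double total = 0;
            for (int i = 0; i < sources; i++) {
                fitness[i] = 1.0 / (1.0 + cost[i]);
                total += fitness[i];
            }
            for (int n = 0; n < sources; n++) {
                double r = rng.nextDouble() * total;
                int i = 0;
                while (i < sources - 1 && (r -= fitness[i]) > 0) i++;
                explore(i, food, cost, trials, objective, lo, hi, rng);
            }
            for (int i = 0; i < sources; i++) {
                // Remember the best key seen so far.
                if (cost[i] < bestCost) { bestCost = cost[i]; best = food[i].clone(); }
                // Scout bees: abandon sources that failed to improve for too long.
                if (trials[i] > limit) {
                    for (int d = 0; d < dim; d++) food[i][d] = lo + rng.nextDouble() * (hi - lo);
                    cost[i] = objective.applyAsDouble(food[i]);
                    trials[i] = 0;
                }
            }
        }
        return best;
    }

    // Perturbs one coordinate of source i relative to a random partner; keeps the move if it is better.
    private static void explore(int i, double[][] food, double[] cost, int[] trials,
                                ToDoubleFunction<double[]> objective, double lo, double hi, Random rng) {
        int partner = rng.nextInt(food.length - 1);
        if (partner >= i) partner++; // guarantees partner != i
        int d = rng.nextInt(food[i].length);
        double[] candidate = food[i].clone();
        double phi = rng.nextDouble() * 2 - 1; // uniform in [-1, 1)
        double moved = candidate[d] + phi * (candidate[d] - food[partner][d]);
        candidate[d] = Math.max(lo, Math.min(hi, moved));
        double c = objective.applyAsDouble(candidate);
        if (c < cost[i]) { food[i] = candidate; cost[i] = c; trials[i] = 0; } else { trials[i]++; }
    }

    public static void main(String[] args) {
        double[] key = optimize(k -> placeholderObjective(k, 3.0), 2, -10, 10, 20, 30, 200, 42L);
        double[][] sanitized = sanitize(new double[][] {{1.0, 2.0}, {3.0, 4.0}}, key);
        System.out.println(java.util.Arrays.toString(key));
        System.out.println(java.util.Arrays.deepToString(sanitized));
    }
}
```

Passing the objective as a function keeps the search loop independent of how privacy is scored, so a different objective could be substituted without changing the colony logic.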
Cite this article
Lekshmy, P.L., Rahiman, M.A. A sanitization approach for privacy preserving data mining on social distributed environment. J Ambient Intell Human Comput 11, 2761–2777 (2020). https://doi.org/10.1007/s12652-019-01335-w
- Service provider
- Helper user
- Optimal key
- User data
- Kernel k-means
- Artificial bee colony