Privacy Preserving Data Mining: A New Methodology for Data Transformation

  • A. K. Upadhayay
  • Abhijat Agarwal
  • Rachita Masand
  • Rajeev Gupta


Privacy preservation is one of the major concerns in data mining. Although research on privacy-preserving techniques is ongoing, a concrete solution is still awaited. We address this issue with a novel privacy-preserving data mining technique: an inverse cosine based transformation (ICT) that transforms the data before it is subjected to clustering or any other kind of analysis. Embedding the ICT method into the existing K-means clustering algorithm yields a novel ‘privacy preserved k-clustering algorithm’ (PrivClust), which is designed explicitly with conversion to a privacy-preserving version in mind. The challenge was to meet the privacy requirements while still guaranteeing valid clustering results. Simulations were carried out in Matlab. Our analysis and simulation show that the algorithm efficiently protects the intended information while still yielding valid clustering results.
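
The abstract does not give the ICT formula or the internals of PrivClust, so the following is only a minimal sketch of the general idea (transform first, cluster afterwards), not the authors' method. It assumes a hypothetical transform that min-max scales each attribute to [0, 1] and then applies the inverse cosine, and it uses a plain Lloyd-style K-means with a deterministic start. The names `ict_transform`, `transform_records`, and `kmeans`, as well as the sample records, are illustrative assumptions.

```python
import math


def ict_transform(column):
    """Hypothetical inverse-cosine transform of one attribute (an assumption,
    not the paper's formula): min-max scale to [0, 1], then apply acos."""
    lo, hi = min(column), max(column)
    span = hi - lo
    scaled = [(v - lo) / span if span else 0.0 for v in column]
    return [math.acos(s) for s in scaled]


def transform_records(records):
    """Apply the assumed transform attribute by attribute."""
    columns = [ict_transform(list(col)) for col in zip(*records)]
    return [tuple(row) for row in zip(*columns)]


def kmeans(points, k, iterations=20):
    """Plain Lloyd-style K-means; the first k points seed the centroids
    so that the run is reproducible (a simplification for this sketch)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Illustrative records with two well-separated groups (made-up values).
records = [(1.0, 2.0), (9.0, 8.0), (1.5, 1.8), (8.5, 9.0), (1.2, 2.2), (9.2, 8.7)]
private_records = transform_records(records)

raw_labels = kmeans(records, k=2)
private_labels = kmeans(private_records, k=2)
print(raw_labels, private_labels)
```

In this sketch the clustering step only ever sees `private_records`; comparing `private_labels` with `raw_labels` is one simple way to check whether a transform keeps the cluster structure. Because the assumed transform is a fixed monotone map, it is easy to invert once the attribute ranges are known, so it should not be read as offering the privacy guarantees claimed for the published method.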


Data Mining; Privacy Preservation; National Basketball Association; Secure Multiparty Computation; Privacy Preserve Data Mining
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Indian Institute of Information Technology, India 2009

Authors and Affiliations

  • A. K. Upadhayay (1)
  • Abhijat Agarwal (1)
  • Rachita Masand (1)
  • Rajeev Gupta (2)

  1. Amity School of Engineering and Technology, Noida, U.P., India
  2. Rajasthan Technical University, Kota, Rajasthan, India
