Reconstruction Attack through Classifier Analysis

  • Sébastien Gambs
  • Ahmed Gmati
  • Michel Hurfin
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7371)

Abstract

In this paper, we introduce a novel inference attack, which we call the reconstruction attack, whose objective is to reconstruct a probabilistic version of the original dataset on which a classifier was learnt, using the description of this classifier and possibly some auxiliary information. In a nutshell, the reconstruction attack exploits the structure of the classifier to derive a probabilistic version of the dataset on which the model was trained. Moreover, we propose a general framework for assessing the success of a reconstruction attack in terms of a novel distance between the reconstructed and original datasets. When several classifiers are released, we also give a strategy for merging the different reconstructed datasets into a single coherent one that is closer to the original dataset than any of the individual reconstructed datasets. Finally, we instantiate this reconstruction attack on a decision tree classifier learnt with the C4.5 algorithm and experimentally evaluate its efficiency. The results of these experiments demonstrate that the proposed attack is able to reconstruct a significant part of the original dataset, thus highlighting the need to develop new learning algorithms whose output is specifically tailored to mitigate the success of this type of attack.
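
To make the idea of a "probabilistic version of the dataset" more concrete, the following Python sketch shows one way the structure of a decision tree could be turned into probabilistic records. It is an illustration under stated assumptions, not the authors' algorithm: the attribute domains, the leaf descriptions, and the function `reconstruct` are all hypothetical, every record reaching a leaf is assumed to carry that leaf's label, and attributes not tested on a path are assumed to be uniformly distributed over their domain.

```python
from fractions import Fraction

# Hypothetical attribute domains (illustrative, not taken from the paper).
DOMAINS = {
    "outlook": ["sunny", "overcast", "rain"],
    "windy": [True, False],
}

# Hypothetical description of a released tree: one entry per leaf, giving the
# tests on the path from the root, the predicted class, and the number of
# training records that reached the leaf.
LEAVES = [
    {"path": {"outlook": "sunny"}, "label": "no", "count": 3},
    {"path": {"outlook": "overcast"}, "label": "yes", "count": 4},
    {"path": {"outlook": "rain", "windy": True}, "label": "no", "count": 2},
]


def reconstruct(leaves, domains):
    """Build one probabilistic record per training record implied by the leaves.

    Each record maps an attribute name to a distribution over its values.
    Attributes tested on the leaf's path are known with probability 1;
    all others get a uniform distribution (a simplifying assumption).
    """
    dataset = []
    for leaf in leaves:
        for _ in range(leaf["count"]):
            # Assumption: every record in the leaf has the leaf's label.
            record = {"class": {leaf["label"]: Fraction(1)}}
            for attr, values in domains.items():
                if attr in leaf["path"]:
                    record[attr] = {leaf["path"][attr]: Fraction(1)}
                else:
                    record[attr] = {v: Fraction(1, len(values)) for v in values}
            dataset.append(record)
    return dataset


reconstructed = reconstruct(LEAVES, DOMAINS)
```

In this toy setting, the nine reconstructed records fix the value of `outlook` exactly, while `windy` stays uncertain for every record except those in the third leaf. Auxiliary information or additional released classifiers would be used to narrow the remaining distributions.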

Keywords

Privacy, Data Mining, Inference Attacks, Decision Trees

Copyright information

© IFIP International Federation for Information Processing 2012

Authors and Affiliations

  • Sébastien Gambs (1, 2)
  • Ahmed Gmati (1)
  • Michel Hurfin (2)
  1. Institut de Recherche en Informatique et Systèmes Aléatoires, Université de Rennes 1, Rennes Cedex, France
  2. Institut National de Recherche en Informatique et en Automatique, INRIA Rennes - Bretagne Atlantique, France
