A Differentially Private Random Decision Forest Using Reliable Signal-to-Noise Ratios

  • Conference paper
AI 2015: Advances in Artificial Intelligence (AI 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9457)

Abstract

When dealing with personal data, it is important for data miners to have algorithms available for discovering trends and patterns in the data without exposing people’s private information. Differential privacy offers an enforceable definition of privacy that can provide each individual in a dataset a guarantee that their personal information is no more at risk than it would be if their data was not in the dataset at all. By using mechanisms that achieve differential privacy, we propose a decision forest algorithm that uses the theory of Signal-to-Noise Ratios to automatically tune the algorithm’s parameters, and to make sure that any differentially private noise added to the results does not outweigh the true results. Our experiments demonstrate that our differentially private algorithm can achieve high prediction accuracy.
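To make the signal-to-noise idea concrete, here is a minimal sketch. It is not the authors' algorithm, which this page does not describe. It assumes the standard Laplace mechanism for a counting query (sensitivity 1, noise scale 1/epsilon) and takes the signal-to-noise ratio to be the true count divided by the standard deviation of the added noise, which is one common convention and an assumption here. The function names and the value of epsilon are hypothetical.

```python
import math
import random


def laplace_noise(scale, rng):
    """Draw one sample from a zero-mean Laplace distribution with the given scale."""
    # The difference of two independent Exp(1) draws follows Laplace(0, 1).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def noisy_count(true_count, epsilon, rng):
    """Laplace mechanism for a counting query, assuming sensitivity 1."""
    return true_count + laplace_noise(1.0 / epsilon, rng)


def signal_to_noise(true_count, epsilon):
    """True count divided by the standard deviation of the added noise."""
    noise_std = math.sqrt(2.0) / epsilon  # std of Laplace with scale 1/epsilon
    return true_count / noise_std


rng = random.Random(0)
epsilon = 0.5  # hypothetical privacy budget spent on one query
for count in (2, 20, 200):
    released = noisy_count(count, epsilon, rng)
    print(count, round(released, 2), round(signal_to_noise(count, epsilon), 2))
```

Under these assumptions, a small count (such as a sparsely populated leaf) has a ratio below 1, so the noise is expected to dominate the released value, while larger counts are comparatively reliable.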

Notes

  1. The class attribute is the attribute whose value the user wishes to predict accurately for future records, where that value is not known.

  2. Our code can be found at http://csusap.csu.edu.au/zislam/, or you can email us.

Author information

Corresponding author

Correspondence to Sam Fletcher.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Fletcher, S., Islam, M.Z. (2015). A Differentially Private Random Decision Forest Using Reliable Signal-to-Noise Ratios. In: Pfahringer, B., Renz, J. (eds) AI 2015: Advances in Artificial Intelligence. AI 2015. Lecture Notes in Computer Science, vol 9457. Springer, Cham. https://doi.org/10.1007/978-3-319-26350-2_17

  • DOI: https://doi.org/10.1007/978-3-319-26350-2_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-26349-6

  • Online ISBN: 978-3-319-26350-2

  • eBook Packages: Computer Science, Computer Science (R0)
