Skip to main content

Privacy in Data Mining Using Formal Methods

  • Conference paper
Typed Lambda Calculi and Applications (TLCA 2005)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 3461))

Included in the following conference series:

Abstract

There is growing public concern about personal data collected by both private and public sectors. People have very little control over what kinds of data are stored and how such data is used. Moreover, the ability to infer new knowledge from existing data is increasing rapidly with advances in database and data mining technologies. We describe a solution which allows people to take control by specifying constraints on the ways in which their data can be used. User constraints are represented in formal logic, and organizations that want to use this data provide formal proofs that the software they use to process data meets these constraints. Checking the proof by an independent verifier demonstrates that user constraints are (or are not) respected by this software. Our notion of “privacy correctness” differs from general software correctness in two ways. First, properties of interest are simpler and thus their proofs should be easier to automate. Second, this kind of correctness is stricter; in addition to showing a certain relation between input and output is realized, we must also show that only operations that respect privacy constraints are applied during execution. We have therefore an intensional notion of correctness, rather that the usual extensional one. We discuss how our mechanism can be put into practice, and we present the technical aspects via an example. Our example shows how users can exercise control when their data is to be used as input to a decision tree learning algorithm. We have formalized the example and the proof of preservation of privacy constraints in Coq.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Agrawal, R., Srikant, R.: Privacy-preserving data mining. In: Chen, W., Naughton, J.F., Bernstein, P.A. (eds.) 2000 ACM SIGMOD International Conference on Management of Data, pp. 439–450. ACM, New York (2000)

    Chapter  Google Scholar 

  2. Bove, A., Capretta, V.: Nested general recursion and partiality in type theory. In: Boulton, R.J., Jackson, P.B. (eds.) TPHOLs 2001. LNCS, vol. 2152, pp. 121–135. Springer, Heidelberg (2001)

    Chapter  Google Scholar 

  3. Bove, A., Capretta, V.: Modelling general recursion in type theory. To appear in Mathematical Structures in Computer Science (2004), Available at http://www.science.uottawa.ca/~vcapr396/

  4. Capretta, V.: Abstraction and Computation. PhD thesis, Computing Science Institute, University of Nijmegen (2002)

    Google Scholar 

  5. Coquand, T.: Infinite objects in type theory. In: Barendregt, H., Nipkow, T. (eds.) TYPES 1993. LNCS, vol. 806, pp. 62–78. Springer, Heidelberg (1994)

    Google Scholar 

  6. Coq Development Team. The Coq Proof Assistant reference manual: Version 7.4. Technical report, INRIA (2003)

    Google Scholar 

  7. Coquand, T., Paulin, C.: Inductively defined types. In: Martin-Löf, P., Mints, G. (eds.) COLOG 1988. LNCS, vol. 417. Springer, Heidelberg (1990)

    Google Scholar 

  8. EPIC. Electronic Privacy Information Center (2005), http://www.epic.org/

  9. Evfimievski, A., Srikant, R., Agrawal, R., Gehrke, J.: Privacy preserving mining of association rules. In: Eighth ACM SIGKDD International Conference on Knowledge Discovery in Databases and Data Mining (July 2002)

    Google Scholar 

  10. Felty, A., Matwin, S.: Privacy-oriented data mining by proof checking. In: Elomaa, T., Mannila, H., Toivonen, H. (eds.) PKDD 2002. LNCS (LNAI), vol. 2431, pp. 138–149. Springer, Heidelberg (2002)

    Chapter  Google Scholar 

  11. Giménez, E.: Codifying guarded definitions with recursive schemes. In: Smith, J., Dybjer, P., Nordström, B. (eds.) TYPES 1994. LNCS, vol. 996, pp. 39–59. Springer, Heidelberg (1995)

    Google Scholar 

  12. Giménez, E.: A Tutorial on Recursive Types in Coq. Technical Report 0221, Unité de recherche INRIA Rocquencourt (May 1998)

    Google Scholar 

  13. Hayati, K., Abadi, M.: Language-based enforcement of privacy policies. In: Martin, D., Serjantov, A. (eds.) PET 2004. LNCS, vol. 3424, pp. 302–313. Springer, Heidelberg (2005)

    Chapter  Google Scholar 

  14. Hagino, T.: A typed lambda calculus with categorical type constructors. In: Pitt, D.H., Rydeheard, D.E., Poigné, A. (eds.) Category Theory and Computer Science. LNCS, vol. 283, pp. 140–157. Springer, Heidelberg (1987)

    Google Scholar 

  15. IPCO. Data mining: Staking a claim on your privacy, Information and Privacy Commissioner/Ontario (January 1998), http://www.ipc.on.ca/scripts/index.asp?action=31&P_ID=11387&N_ID=1&PT_ID=11351&U_ID=0

  16. Iyengar, V.S.: Transforming data to satisfy privacy constraints. In: Eighth ACM SIGKDD International Conference on Knowledge Discovery in Databases and Data Mining, July 2002, pp. 279–287 (2002)

    Google Scholar 

  17. Kantarcioglu, M., Clifton, C.: Privacy-preserving distributed mining of association rules on horizontally partitioned data. In: The ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD 2002) (June 2002)

    Google Scholar 

  18. Mitchell, T.M.: Machine Learning. McGraw Hill, New York (1997)

    MATH  Google Scholar 

  19. McBride, C., McKinna, J.: The view from the left. Journal of Functional Programming 14(1), 69–111 (2004)

    Article  MATH  MathSciNet  Google Scholar 

  20. Marché, C., Paulin-Mohring, C., Urbain, X.: The Krakatoa tool for certification of Java/JavaCard programs annotated in JML. Journal of Logic and Algebraic Programming 58(1–2), 89–106 (2004)

    Article  MATH  Google Scholar 

  21. Paulin-Mohring, C.: Inductive Definitions in the System Coq - Rules and Properties. In: Bezem, M., Groote, J.F. (eds.) TLCA 1993. LNCS, vol. 664. Springer, Heidelberg (1993)

    Google Scholar 

  22. Ries, D.G.: Protecting consumer online privacy – an overview (May 2001), http://www.pbi.org/Goodies/privacy/privacy_ries.htm

  23. USCM. Mayors attack urban redlining, mortgage discrimination, The US Conference of Mayors (1998), http://www.usmayors.org/uscm/news/press_releases/press_archive.asp?doc_id=98

  24. Vaidya, J., Clifton, C.: Privacy preserving association rule mining in vertically partitioned data. In: Eighth ACM SIGKDD International Conference on Knowledge Discovery in Databases and Data Mining (July 2002)

    Google Scholar 

  25. Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, San Francisco (1999)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Matwin, S., Felty, A., Hernádvölgyi, I., Capretta, V. (2005). Privacy in Data Mining Using Formal Methods. In: Urzyczyn, P. (eds) Typed Lambda Calculi and Applications. TLCA 2005. Lecture Notes in Computer Science, vol 3461. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11417170_21

Download citation

  • DOI: https://doi.org/10.1007/11417170_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25593-2

  • Online ISBN: 978-3-540-32014-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics