Protecting Against Data Mining through Samples

Clifton, Chris

doi:10.1007/978-0-387-35508-5_13

Part of the book series: IFIP — The International Federation for Information Processing (IFIPAICT, volume 43)

Abstract

Data mining introduces new problems in database security. The basic problem of using non-sensitive data to infer sensitive data is made more difficult by the “probabilistic” inferences possible with data mining. This paper shows how lower bounds from pattern recognition theory can be used to determine sample sizes where data mining tools cannot obtain reliable results.

The original version of this chapter was revised: The copyright line was incorrect. This has been corrected. The Erratum to this chapter is available at DOI: 10.1007/978-0-387-35508-5_22


Authors

Chris Clifton

Editor information

Editors and Affiliations

Vijay Atluri, Rutgers University, USA
John Hale, University of Tulsa, USA

About this chapter

Cite this chapter

Clifton, C. (2000). Protecting Against Data Mining through Samples. In: Atluri, V., Hale, J. (eds) Research Advances in Database and Information Systems Security. IFIP — The International Federation for Information Processing, vol 43. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-35508-5_13

DOI: https://doi.org/10.1007/978-0-387-35508-5_13
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4757-6411-6
Online ISBN: 978-0-387-35508-5
eBook Packages: Springer Book Archive
