Disclosure Analysis for Two-Way Contingency Tables

  • Conference paper
Privacy in Statistical Databases (PSD 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4302)

Abstract

Disclosure analysis of two-way contingency tables is important in categorical data analysis. It concerns whether a data snooper can infer any protected cell values, which contain privacy-sensitive information, from the available marginal totals (i.e., row sums and column sums) of a two-way contingency table. Previous research has addressed this problem from various perspectives. However, systematic definitions of the disclosure of cell values are lacking, and no previous study has focused on the distribution of the cells that are subject to various types of disclosure. In this paper, we define four types of possible disclosure based on the exact upper bound and/or lower bound of each cell that can be computed from the marginal totals. For each type of disclosure, we discover the distribution pattern of the cells subject to disclosure. Based on these distribution patterns, we can speed up the search for all cells subject to disclosure.
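To make the notion of bounds computed from marginal totals concrete, the following is a minimal sketch in Python. It assumes non-negative cell counts and uses the classical Fréchet bounds for a two-way table: a cell in row i and column j lies between max(0, r_i + c_j - n) and min(r_i, c_j), where n is the grand total. The function names `cell_bounds` and `exactly_determined` are hypothetical, and the check for coinciding bounds is only one illustrative condition; the abstract does not spell out the paper's four disclosure types, so they are not reproduced here.

```python
def cell_bounds(row_sums, col_sums):
    """Return a matrix of (lower, upper) bounds for every cell.

    Assumes non-negative cell counts and consistent marginals
    (row sums and column sums add up to the same grand total).
    """
    total = sum(row_sums)
    if total != sum(col_sums):
        raise ValueError("row sums and column sums must have the same total")
    return [
        [(max(0, r + c - total), min(r, c)) for c in col_sums]
        for r in row_sums
    ]


def exactly_determined(bounds):
    """List the (row, column) indices whose lower and upper bounds coincide."""
    return [
        (i, j)
        for i, row in enumerate(bounds)
        for j, (lower, upper) in enumerate(row)
        if lower == upper
    ]


if __name__ == "__main__":
    # Hypothetical marginals for a 2 x 2 table.
    bounds = cell_bounds([5, 5], [4, 6])
    for row in bounds:
        print(row)
    print("exactly determined cells:", exactly_determined(bounds))
```

With these hypothetical marginals no cell is pinned to a single value, whereas marginals such as row sums [3, 0] and column sums [1, 2] determine every cell exactly.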

Work of Lu and Li was supported in part by SMU Research Office. Work of Wu was supported in part by USA National Science Foundation Grant IIS-0546027.

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lu, H., Li, Y., Wu, X. (2006). Disclosure Analysis for Two-Way Contingency Tables. In: Domingo-Ferrer, J., Franconi, L. (eds) Privacy in Statistical Databases. PSD 2006. Lecture Notes in Computer Science, vol 4302. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11930242_6

  • DOI: https://doi.org/10.1007/11930242_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-49330-3

  • Online ISBN: 978-3-540-49332-7

  • eBook Packages: Computer Science, Computer Science (R0)
