A Generalized Negative Binomial Smoothing Model for Sample Disclosure Risk Estimation

  • Conference paper
Privacy in Statistical Databases (PSD 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4302)

Abstract

We deal with the issue of risk estimation in a sample frequency table to be released by an agency. Risk arises from non-empty sample cells which represent small population cells and from population uniques in particular. Therefore risk estimation requires assessing which of the relevant population cells are indeed small. Various methods have been proposed for this task, and we present a new method in which estimation of a population cell frequency is based on smoothing using a local neighborhood of this cell, that is, cells having similar or close values in all attributes.
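The idea of borrowing strength from a local neighborhood can be illustrated with a minimal sketch. It is not the authors' model: it uses a plain unweighted average over cells whose index differs by at most `radius` in every attribute, on a two-attribute table with ordinal categories. The function names, the `radius` parameter, the toy table, and the division by a known sampling fraction are all assumptions made for illustration.

```python
from itertools import product


def neighborhood_mean(table, cell, radius=1):
    """Average the sample counts of all cells within `radius` of `cell`
    in every attribute (illustrative moving average, not the paper's model)."""
    dims = (len(table), len(table[0]))
    # Clip the neighborhood at the table borders.
    ranges = [
        range(max(0, c - radius), min(d, c + radius + 1))
        for c, d in zip(cell, dims)
    ]
    neighbors = list(product(*ranges))
    return sum(table[i][j] for i, j in neighbors) / len(neighbors)


def estimate_population_count(table, cell, sampling_fraction, radius=1):
    """Scale the smoothed sample count up by an assumed known sampling fraction."""
    return neighborhood_mean(table, cell, radius) / sampling_fraction


# Hypothetical 3x3 sample frequency table (rows and columns are ordinal attributes).
sample_table = [
    [1, 0, 2],
    [0, 1, 3],
    [4, 2, 5],
]

smoothed = neighborhood_mean(sample_table, (1, 1))
estimate = estimate_population_count(sample_table, (1, 1), sampling_fraction=0.1)
```

A sample unique whose neighbors are also sparse gets a small smoothed estimate and is flagged as risky, whereas a sample unique surrounded by well-populated cells is less likely to be a population unique.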

The statistical model we use is a generalized Negative Binomial model which subsumes the Poisson and Negative Binomial models. We provide some preliminary results and experiments with this method.
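For orientation, the standard Negative Binomial parameterization with mean and dispersion (the NB2 form in Cameron and Trivedi) shows how the Poisson model arises as a limiting case. The symbols $\mu$ and $\alpha$ are assumptions introduced here; the abstract does not give the paper's generalized form, which may be parameterized differently.

```latex
% Count Y with mean \mu > 0 and dispersion \alpha > 0 (standard NB2 form):
\[
  \Pr(Y = y) \;=\;
  \frac{\Gamma(y + \alpha^{-1})}{\Gamma(\alpha^{-1})\, y!}
  \left(\frac{\alpha^{-1}}{\alpha^{-1} + \mu}\right)^{\alpha^{-1}}
  \left(\frac{\mu}{\alpha^{-1} + \mu}\right)^{y},
  \qquad y = 0, 1, 2, \dots
\]
\[
  \mathbb{E}[Y] = \mu, \qquad \operatorname{Var}(Y) = \mu + \alpha \mu^{2}.
\]
% As \alpha \to 0 the variance tends to \mu and the distribution
% converges to Poisson(\mu):
\[
  \lim_{\alpha \to 0} \Pr(Y = y) \;=\; \frac{e^{-\mu} \mu^{y}}{y!}.
\]
```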

We compare the new approach with a method based on a Poisson regression log-linear hierarchical model, in which inference on a given cell is based on classical models of contingency tables. Such models connect each cell to a ‘neighborhood’ of cells with one or several common attributes, but some other attributes may differ significantly. We also compare it with the Argus Negative Binomial method, in which inference on a given cell is based only on sampling weights, without learning from any type of ‘neighborhood’ of the given cell and without making use of the structure of the table.

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rinott, Y., Shlomo, N. (2006). A Generalized Negative Binomial Smoothing Model for Sample Disclosure Risk Estimation. In: Domingo-Ferrer, J., Franconi, L. (eds) Privacy in Statistical Databases. PSD 2006. Lecture Notes in Computer Science, vol 4302. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11930242_8

  • DOI: https://doi.org/10.1007/11930242_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-49330-3

  • Online ISBN: 978-3-540-49332-7

  • eBook Packages: Computer Science, Computer Science (R0)
