A Two-Stage Imputation Procedure to Balance the Risk–Utility Trade-Off

Synthetic Datasets for Statistical Disclosure Control

Part of the book series: Lecture Notes in Statistics (LNS, volume 201)

Abstract

There has been little discussion in the literature on how many multiply imputed datasets an agency should release. From the perspective of the secondary data analyst, a large number of datasets is desirable, because the additional variance introduced by the imputation decreases with the number of released datasets. For example, Reiter (2003) finds nearly a 100% increase in the variance of regression coefficients when going from 50 to two partially synthetic datasets. From the perspective of the agency, a small number of datasets is desirable, because the information available to ill-intentioned users seeking to identify individuals in the released datasets increases with the number of released datasets. Thus, agencies considering the release of partially synthetic data are generally confronted with a trade-off between disclosure risk and data utility.
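The analyst's side of this trade-off can be illustrated with a small sketch. It assumes the combining rule commonly used for partially synthetic data, in which the variance of the combined estimate is estimated as T_p = u_bar + b/m, where u_bar is the average within-dataset variance, b is the between-dataset variance of the point estimates, and m is the number of released datasets. The abstract does not state this rule, so treat it as an assumption here; the function names and all numbers are hypothetical and do not reproduce any published result.

```python
from statistics import mean, variance


def total_variance(u_bar, b, m):
    """Assumed variance estimate T_p = u_bar + b / m for m synthetic datasets."""
    if m < 1:
        raise ValueError("m must be at least 1")
    return u_bar + b / m


def combine_partially_synthetic(estimates, within_variances):
    """Combine per-dataset point estimates and their variance estimates.

    Returns (q_bar, T_p) under the assumed rule T_p = u_bar + b / m.
    """
    m = len(estimates)
    if m < 2:
        raise ValueError("need at least two datasets to estimate b")
    if len(within_variances) != m:
        raise ValueError("one variance estimate is required per dataset")
    q_bar = mean(estimates)          # combined point estimate
    b = variance(estimates)          # between-dataset variance (divisor m - 1)
    u_bar = mean(within_variances)   # average within-dataset variance
    return q_bar, total_variance(u_bar, b, m)


if __name__ == "__main__":
    # Made-up values: the b / m term shrinks as more datasets are released.
    for m in (2, 5, 10, 50):
        print(m, total_variance(u_bar=4.0, b=2.0, m=m))
```

With u_bar and b held fixed, only the b/m term depends on m, which is why releasing more datasets reduces the variance added by the imputation while also giving an intruder more material to work with.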

Most of this chapter is taken from Drechsler and Reiter (2009) and Reiter and Drechsler (2010).

Author information

Correspondence to Jörg Drechsler.

Copyright information

© 2011 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Drechsler, J. (2011). A Two-Stage Imputation Procedure to Balance the Risk–Utility Trade-Off. In: Synthetic Datasets for Statistical Disclosure Control. Lecture Notes in Statistics, vol 201. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-0326-5_9
