Background on Multiply Imputed Synthetic Datasets

Drechsler, Jörg

doi:10.1007/978-1-4614-0326-5_2

Part of the book series: Lecture Notes in Statistics ((LNS,volume 201))

Abstract

In 1993, the Journal of Official Statistics published a special issue on data confidentiality. Two articles in that issue laid the foundation for the development of multiply imputed synthetic datasets (MISDs). In his discussion “Statistical Disclosure Limitation,” Rubin (1993) was the first to suggest generating synthetic datasets based on his ideas of multiple imputation for missing values (Rubin, 1987). He proposed treating all units in the sampling frame that are not part of the sample as missing data and imputing them according to the multiple imputation framework. Simple random samples from these fully imputed datasets would then be released to the public. Because the released datasets do not contain any real data, disclosure of sensitive information is very difficult. On the other hand, if the imputation models are selected carefully and their predictive power is high, most of the information contained in the original data will be preserved. In the literature, this approach is now known as generating fully synthetic datasets.
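
The procedure described in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: it assumes a single covariate known for every unit in the sampling frame, a single survey variable observed only in the sample, and a normal linear regression as the imputation model with the usual noninformative prior. The function name `fully_synthetic_datasets` and its parameters are hypothetical. As a further simplification, the released samples are drawn only from the non-sampled (imputed) units, so that no observed survey values are released.

```python
import numpy as np

def fully_synthetic_datasets(x_frame, sample_idx, y_sample, m=5, n_syn=100, seed=0):
    """Return m synthetic datasets, each an (n_syn, 2) array of [x, synthetic y].

    x_frame    : 1-D array, covariate known for all N units in the frame
    sample_idx : indices of the n sampled units
    y_sample   : survey variable observed for the sampled units only
    """
    rng = np.random.default_rng(seed)
    N = x_frame.shape[0]
    X = np.column_stack([np.ones(N), x_frame])   # design matrix with intercept
    X_obs = X[sample_idx]
    n, p = X_obs.shape

    # Fit the imputation model on the observed sample (ordinary least squares).
    XtX_inv = np.linalg.inv(X_obs.T @ X_obs)
    beta_hat = XtX_inv @ X_obs.T @ y_sample
    resid = y_sample - X_obs @ beta_hat
    ss = resid @ resid

    # Units outside the sample are treated as missing data.
    nonsampled = np.setdiff1d(np.arange(N), sample_idx)

    released = []
    for _ in range(m):
        # Draw model parameters from their approximate posterior so that
        # each synthetic dataset reflects parameter uncertainty.
        sigma2 = ss / rng.chisquare(n - p)
        beta = rng.multivariate_normal(beta_hat, sigma2 * XtX_inv)

        # Impute y for every non-sampled unit from the predictive distribution.
        y_imp = X[nonsampled] @ beta + rng.normal(0.0, np.sqrt(sigma2), nonsampled.size)

        # Release a simple random sample of the imputed units.
        pick = rng.choice(nonsampled.size, size=n_syn, replace=False)
        released.append(np.column_stack([x_frame[nonsampled][pick], y_imp[pick]]))
    return released


# Illustrative usage with simulated data (all values are made up).
rng = np.random.default_rng(1)
x = rng.normal(size=1000)                          # frame of 1000 units
idx = rng.choice(1000, size=200, replace=False)    # original sample
y = 2.0 + 3.0 * x[idx] + rng.normal(size=200)      # observed survey variable
synthetic = fully_synthetic_datasets(x, idx, y, m=3, n_syn=50)
```

Drawing new parameter values for each of the `m` datasets, rather than reusing the point estimates, is what makes this a multiple imputation procedure in the sense of Rubin (1987). A real application would need a much richer imputation model than the single regression assumed here.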

Author information

Authors and Affiliations

Department for Statistical Methods, Institute for Employment Research, Regensburger Straße 104, 90478, Nürnberg, Germany
Jörg Drechsler

Corresponding author

Correspondence to Jörg Drechsler.

About this chapter

Cite this chapter

Drechsler, J. (2011). Background on Multiply Imputed Synthetic Datasets. In: Synthetic Datasets for Statistical Disclosure Control. Lecture Notes in Statistics, vol 201. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-0326-5_2

DOI: https://doi.org/10.1007/978-1-4614-0326-5_2
Published: 08 June 2011
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-0325-8
Online ISBN: 978-1-4614-0326-5
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
