An Evaluation of Two Synthetic Small-Area Microdata Simulation Methodologies: Synthetic Reconstruction and Combinatorial Optimisation

Chapter
Part of the Understanding Population Trends and Processes book series (UPTA, volume 6)

Abstract

An essential requirement for all microsimulation models is an initial set of population microdata. If processes are to be modelled at the subregional level, these microdata must be spatially detailed. Unfortunately, confidentiality and sample size restrictions preclude the provision of such microdata via either census or social survey. As a result, a number of techniques have been developed to synthetically generate the requisite microdata.

This chapter reports on two techniques developed in the UK: ‘synthetic reconstruction’ and ‘combinatorial optimisation’. The basic methodology of each approach is outlined, following which recent innovations in their detailed implementation are introduced. The performance of both approaches is necessarily affected by the nature of between-place differences, which are shown to be surprisingly resistant to conventional data reduction techniques. Having considered this spatial variability, a new framework is introduced for the evaluation and validation of synthetic microdata, which includes multiple realisations and the use of measures of fit based around the Z-score and two derivations: ΣZ² and RSSZ. Finally, an evaluation of the output from each approach is presented. Combinatorial optimisation is shown to produce estimates with less bias, and significantly less variance, than those produced via synthetic reconstruction. The resulting synthetic microdata are shown to have a very high degree of fit to estimation constraints and to produce good estimates for margin-constrained distributions.
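The Z-score based measures of fit named above can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything here is an assumption: the Z-score is taken to be the usual standardised difference between a synthetic and a target cell proportion, ΣZ² its sum of squares over the cells of a table, and RSSZ that sum divided by a caller-supplied critical value. The function names and the example counts are hypothetical, and the chapter's own definitions may include refinements not shown here.

```python
import math


def z_scores(synthetic, target):
    """Cell-wise Z-scores for synthetic counts against target counts.

    Assumes both tables share the same total and that every target
    cell is non-zero and smaller than the total (assumptions of this
    sketch, not statements about the chapter's method).
    """
    n = sum(target)
    scores = []
    for s, t in zip(synthetic, target):
        p = t / n                        # target (expected) proportion
        r = s / n                        # synthetic (estimated) proportion
        se = math.sqrt(p * (1 - p) / n)  # binomial standard error of p
        scores.append((r - p) / se)
    return scores


def sum_z_squared(synthetic, target):
    """Sum of squared Z-scores over all cells of one table."""
    return sum(z * z for z in z_scores(synthetic, target))


def rssz(synthetic, target, critical_value):
    """Sum of squared Z-scores relative to a caller-supplied critical value.

    Under this assumed definition, values below 1 suggest that the
    synthetic table fits the target table acceptably.
    """
    return sum_z_squared(synthetic, target) / critical_value


# Hypothetical three-cell table: census target versus one synthetic realisation.
target_counts = [50, 30, 20]
synthetic_counts = [55, 25, 20]

table_z = z_scores(synthetic_counts, target_counts)
table_ssz = sum_z_squared(synthetic_counts, target_counts)
# 5.991 is the 5% chi-square critical value for 2 degrees of freedom; the
# appropriate degrees of freedom for a given table are an assumption here.
table_rssz = rssz(synthetic_counts, target_counts, 5.991)
```

Because the abstract also stresses multiple realisations, a measure of this kind would in practice be computed for each realisation and then summarised, for example by its mean and variance across runs.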

Notes

Acknowledgements

All census data are Crown copyright. The Census Small-Area Statistics were provided through the Census Dissemination Unit and the Census Sample of Anonymised Records via the Census Microdata Unit, both at the University of Manchester and both funded by ESRC/JISC/DENI. Some of the work reported in this chapter was funded by the ESRC (R000237744). Thanks are due to Zengyi Huang and David Voas for their contributions to many elements of this work and to the chapter’s referee for comments which led to significant improvements.

Copyright information

© Springer Science+Business Media Dordrecht 2012

Authors and Affiliations

  1. Department of Geography, School of Environmental Sciences, University of Liverpool, Liverpool, UK
