An Evaluation of Two Synthetic Small-Area Microdata Simulation Methodologies: Synthetic Reconstruction and Combinatorial Optimisation
An essential requirement for all microsimulation models is an initial set of population microdata. If processes are to be modelled at the subregional level, these microdata must be spatially detailed. Unfortunately, confidentiality and sample size restrictions preclude the provision of such microdata via either census or social survey. As a result, a number of techniques have been developed to synthetically generate the requisite microdata.
This chapter reports on two techniques developed in the UK: ‘synthetic reconstruction’ and ‘combinatorial optimisation’. The basic methodology of each approach is outlined, after which recent innovations in their detailed implementation are introduced. The performance of both approaches is necessarily affected by the nature of between-place differences, which are shown to be surprisingly resistant to conventional data reduction techniques. With this spatial variability in mind, a new framework is introduced for the evaluation and validation of synthetic microdata, which includes multiple realisations and the use of measures of fit based around the Z-score and two measures derived from it: ΣZ² and RSSZ. Finally, an evaluation of the output from each approach is presented. Combinatorial optimisation is shown to produce estimates with less bias, and significantly less variance, than those produced via synthetic reconstruction. The resulting synthetic microdata are shown to have a very high degree of fit to estimation constraints and to produce good estimates for margin-constrained distributions.
All census data are Crown copyright. The Census Small-Area Statistics were provided through the Census Dissemination Unit and the Census Sample of Anonymised Records via the Census Microdata Unit, both at the University of Manchester and both funded by ESRC/JISC/DENI. Some of the work reported in this chapter was funded by the ESRC (R000237744). Thanks are due to Zengyi Huang and David Voas for their contributions to many elements of this work and to the chapter’s referee for comments which led to significant improvements.