In surveys on a sensitive characteristic X, such as income or tax evasion, direct questioning leads to answer refusal and untruthful answers. To increase respondents’ cooperation, nonrandomized response survey techniques are currently emerging. In this field of research, the diagonal model (DM) survey technique was recently published to gather data on a multicategorical X. However, estimation of the distribution of X from DM data has so far been derived only for simple random sampling with replacement. To overcome this limitation, in this article we develop DM estimates for complex sampling designs that are often applied in practice, including stratified, cluster, multistage, and unequal probability sampling. We apply quadratic programming to obtain admissible estimates, that is, estimates in the unit simplex, for the probability masses of X. We describe bootstrap variance estimates for the admissible estimators and establish a simulation-based method to investigate the connection between estimation efficiency in complex sample surveys and the degree of privacy protection. Our simulations show that higher efficiency corresponds to lower privacy protection and indicate optimal parameters for the DM. Such optimality results are rare in the existing literature on privacy-protecting survey models for multicategorical sensitive variables, especially for complex sampling designs.
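The abstract does not state the objective function of the quadratic program, so the following is only a minimal sketch under an assumption: the admissible estimate is taken to be the point of the unit simplex closest in Euclidean distance to an unconstrained estimate. That special case of a quadratic program has a well-known sort-based solution. The function name `project_to_simplex` and the numerical values are hypothetical and are not taken from the article.

```python
import numpy as np

def project_to_simplex(p_raw):
    """Euclidean projection of p_raw onto the unit simplex.

    Solves the quadratic program
        min ||p - p_raw||^2  subject to  p >= 0, sum(p) = 1
    with the standard sort-based solution. Assumption: the article's
    actual objective (e.g. a weighted distance) may differ.
    """
    p_raw = np.asarray(p_raw, dtype=float)
    k = p_raw.size
    u = np.sort(p_raw)[::-1]           # components in descending order
    css = np.cumsum(u) - 1.0           # cumulative sums minus the target total of 1
    idx = np.arange(1, k + 1)
    # Last position (in sorted order) whose component stays positive after shifting
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    tau = css[rho] / (rho + 1)         # common shift subtracted from every component
    return np.maximum(p_raw - tau, 0.0)

# Hypothetical inadmissible estimate with one negative probability mass
p_raw = np.array([0.55, 0.50, -0.05])
p_adm = project_to_simplex(p_raw)      # nonnegative and sums to 1
```

An estimate that already lies in the simplex is returned unchanged, so the projection only alters inadmissible estimates.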
Keywords: Untruthful answers; Answer refusal; Randomized response; Inadmissible estimates for multinomial proportions in complex sampling; Bootstrap variance estimates; Degree of privacy protection