Abstract
Clustering techniques partition the units of analysis, or study subjects, into similar groups according to a chosen variable, so that a model can be run on cases with related attributes as a control for sociodemographic differences. Large-scale national surveys often provide a raw weight variable, but applying it without transformation yields no change in statistical results and can therefore lead to spurious conclusions about relationships between predictors and outcomes. For example, most research studies that use components of the National Longitudinal Survey of Youth (NLSY) data sets to test hypotheses do not employ a weighting technique or post-stratification procedure to normalize the sample against the population from which it is drawn. This chapter therefore illustrates how an algebraic weight formula introduced by Oh and Scheuren (Weighting adjustment for unit non-response. In: Incomplete Data in Sample Surveys, Chap. 3. Academic Press, New York, 1983) can be used in path analysis to elucidate the relationship between underlying psychosocial mechanisms and health risk behaviors among adolescents in the 1998 NLSY Young Adult cohort. Using the NLSY sample originally drawn from the US population, the chapter investigates the association between self-assessed risk perception, or risk proneness, and the likelihood that an adolescent engages in substance use and sexual behavior, with the sample separated into clusters by mother’s race/ethnicity and educational attainment. To control for oversampling of underrepresented racial/ethnic groups, mathematically adjusted design weights are then applied in calculating the covariance matrices for each cluster group by race and educational attainment, and the non-normalized and normalized path analysis results are compared.
The impact of ignoring weights, which leads to serious bias in parameter estimates and underestimation of standard errors, is presented to illustrate the distinction between weighted and non-weighted data. As an innovative statistical approach, this application uses a weighted case approach, testing the model on discrete cluster samples of youth by race/ethnicity and mother’s educational attainment. Determining public health policy initiatives and objectives requires that the data be representative of the population, which is ensured by transforming and applying the weight formula to the sample.
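The weighting step described in the abstract can be sketched in Python. This is only an illustration under stated assumptions: it uses the common relative-weight rescaling (each raw weight multiplied by n and divided by the sum of the raw weights), which is not necessarily the Oh and Scheuren formula applied in the chapter. The function names and the per-cluster normalization are hypothetical choices made for this example.

```python
import numpy as np


def normalize_weights(raw_weights):
    """Rescale raw weights so they sum to the number of cases (mean weight 1).

    Assumption: relative-weight normalization, not necessarily the
    Oh and Scheuren adjustment used in the chapter.
    """
    w = np.asarray(raw_weights, dtype=float)
    return w * w.size / w.sum()


def weighted_covariance(data, weights):
    """Weighted covariance matrix of the columns of `data` (rows are cases)."""
    x = np.asarray(data, dtype=float)
    w = np.asarray(weights, dtype=float)
    mean = np.average(x, axis=0, weights=w)
    centered = x - mean
    # Divide by the weight total; with normalized weights this total equals n.
    return (centered * w[:, None]).T @ centered / w.sum()


def covariance_by_cluster(data, raw_weights, cluster_labels):
    """Normalize weights within each cluster, then compute one weighted
    covariance matrix per cluster (hypothetical helper for illustration)."""
    x = np.asarray(data, dtype=float)
    w = np.asarray(raw_weights, dtype=float)
    labels = np.asarray(cluster_labels)
    result = {}
    for label in np.unique(labels):
        mask = labels == label
        result[label] = weighted_covariance(x[mask], normalize_weights(w[mask]))
    return result
```

With this estimator, rescaling the weights leaves the covariance matrix itself unchanged. What the rescaling changes is the weight total that software may treat as the sample size, which is one way unadjusted raw weights can understate standard errors.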
References
Cohen, J., Cohen, P.: Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. Lawrence Erlbaum Associates, Hillsdale (1983)
Crockett, L.J., Raffaelli, M., Shen, Y.-L.: Linking self-regulation and risk proneness to risky sexual behavior: pathways through peer pressure and early substance use. J. Res. Adolesc. 16, 503–525 (2006)
Green, B.F.: Parameter sensitivity in multivariate methods. Multivar. Behav. Res. 12, 263–287 (1977)
Hahs-Vaughn, D.L., Lomax, R.G.: Utilization of sample weights in single-level structural equation modeling. J. Exp. Educ. 74, 163–190 (2006)
Horowitz, J.L., Manski, C.F.: Censoring of outcomes and regressors due to survey nonresponse: identification and estimation using weights and imputations. J. Econ. 84, 37–58 (1998)
Lang, K., Zagorsky, J.L.: Does growing up with a parent absent really hurt? J. Hum. Resour. 36, 253–273 (2001)
Little, R.J.A., Rubin, D.B.: Statistical Analysis with Missing Data. Wiley, New York (1987)
MaCurdy, T., Mroz, T., Gritz, R.M.: An evaluation of the national longitudinal survey on youth. J. Hum. Resour. 33, 345–436 (1998)
McDonald, R.P.: Path analysis with composite variables. Multivar. Behav. Res. 31, 239–270 (1996)
Oh, H.L., Scheuren, F.J.: Weighting adjustment for unit nonresponse. In: Incomplete Data in Sample Surveys, Chap. 3. Academic Press, New York (1983)
Ohio State University, Center for Human Resource Research: NLSY 79 Child & Young Adult Data Users Guide. Ohio State University, Center for Human Resource Research, Columbus (2006). Retrieved from http://www.nlsinfo.org/pub/usersvc/Child-Young-Adult/2004ChildYA-DataUsersGuide.pdf
Olinsky, A., Chen, S., Harlow, L.: The comparative efficacy of imputation methods for missing data in structural equation modeling. Eur. J. Oper. Res. 151, 53–79 (2003)
Olobatuyi, R.: A User’s Guide to Path Analysis. University Press of America, Lanham (2006)
Pachter, L.M., Auinger, P., Palmer, R., Weitzman, M.: Do parenting and home environment, maternal depression, neighborhood and chronic poverty affect child behavioral problems differently in different age groups? Pediatrics 117, 1329–1338 (2006)
Pearl, J.: Causality: Models, Reasoning and Inference. Cambridge University Press, Cambridge (2000)
Rothman, K.J., Greenland, S.: Causation and causal inference in epidemiology. Am. J. Public Health 95, S144–S150 (2005)
Rubin, D.B., Olkin, I., Madow, W.G.: Incomplete Data in Sample Surveys. Academic Press, Inc., New York (1983)
Schafer, J.L., Olsen, M.K.: Multiple imputation for multivariate missing-data problems: a data analyst’s perspective. Multivar. Behav. Res. 33, 545–571 (1998)
Stapleton, L.M.: The incorporation of sample weights into multilevel structural equation models. Struct. Equ. Model. 9, 475–502 (2002)
Stapleton, L.M.: An assessment of practical solutions for structural equation modeling with complex data. Struct. Equ. Model. 13, 28–58 (2006)
Steinley, D., Brusco, M.J.: A new variable weighting and selection procedure for K-means cluster analysis. Multivar. Behav. Res. 43, 77–108 (2008)
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Agre, L.A., Peterson, N.A., Brady, J. (2015). Controlling for Population Density Using Clustering and Data Weighting Techniques When Examining Social Health and Welfare Problems. In: Chen, DG., Wilson, J. (eds) Innovative Statistical Methods for Public Health Data. ICSA Book Series in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-18536-1_2
DOI: https://doi.org/10.1007/978-3-319-18536-1_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18535-4
Online ISBN: 978-3-319-18536-1
eBook Packages: Mathematics and Statistics (R0)