Controlling for Population Density Using Clustering and Data Weighting Techniques When Examining Social Health and Welfare Problems

Part of the book series: ICSA Book Series in Statistics (ICSABSS)

Abstract

Clustering techniques partition the units of analysis or study subjects into similar groups according to a chosen variable, permitting a model to be run on cases with related attributes as a control for sociodemographic differences. Although large-scale national surveys often provide a raw weight variable, applying it without transformation yields no change in statistical results and can therefore lead to spurious conclusions about relationships between predictors and outcomes. For example, most research studies that use components of the National Longitudinal Survey of Youth (NLSY) data sets to test hypotheses do not employ a weighting technique or post-stratification procedure to normalize the sample against the population from which it is drawn. This chapter therefore illustrates how an algebraic weight formula introduced by Oh and Scheuren (Weighting adjustment for unit nonresponse. In: Incomplete Data in Sample Surveys, Chap. 3. Academic Press, New York, 1983) can be used in path analysis to elucidate the relationship between underlying psychosocial mechanisms and health risk behaviors among adolescents in the 1998 NLSY Young Adult cohort. Using the NLSY sample, originally drawn from the US population, the chapter investigates the association between self-assessed risk perception, or risk proneness, and the likelihood that an adolescent engages in substance use and sexual behavior, with the sample separated into clusters by mother’s race/ethnicity and educational attainment. To control for oversampling of underrepresented racial/ethnic groups, mathematically adjusted design weights are then applied in calculating the covariance matrices for each cluster group by race and educational attainment, and non-normalized and normalized path analysis results are compared.
The impact of ignoring weights, which leads to serious bias in parameter estimates and underestimation of standard errors, is presented to illustrate the distinction between weighted and non-weighted data. As an innovative statistical approach, this application uses a weighted case approach, testing the model on discrete cluster samples of youth by race/ethnicity and mother’s educational attainment. Determining public health policy initiatives and objectives requires data that are representative of the population, which is ensured by transforming the raw weights with the formula and applying them to the sample.
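
The weighting step described above can be sketched in Python. This is only an illustration under stated assumptions: the abstract does not reproduce the Oh and Scheuren formula, so the sketch substitutes a common relative-weight normalization (raw weights rescaled to sum to the number of cases in each cluster). The function names `normalize_weights`, `weighted_cov`, and `cluster_covariances` are hypothetical and do not come from the chapter.

```python
import numpy as np


def normalize_weights(raw_weights):
    """Rescale raw weights so they sum to the number of cases (mean weight 1).

    Assumption: a generic relative-weight normalization, not necessarily
    the exact formula used in the chapter.
    """
    w = np.asarray(raw_weights, dtype=float)
    if w.ndim != 1 or w.size == 0 or np.any(w <= 0):
        raise ValueError("weights must be a non-empty 1-D array of positive values")
    return w * w.size / w.sum()


def weighted_cov(X, weights):
    """Weighted covariance matrix of the columns of X (divides by the weight total)."""
    X = np.asarray(X, dtype=float)
    w = np.asarray(weights, dtype=float)
    mean = np.average(X, axis=0, weights=w)   # weighted column means
    centered = X - mean
    return (centered * w[:, None]).T @ centered / w.sum()


def cluster_covariances(X, raw_weights, cluster_labels):
    """Normalize weights within each cluster and return one covariance matrix per cluster."""
    X = np.asarray(X, dtype=float)
    raw = np.asarray(raw_weights, dtype=float)
    labels = np.asarray(cluster_labels)
    result = {}
    for label in np.unique(labels):
        mask = labels == label
        w = normalize_weights(raw[mask])      # weights now sum to the cluster size
        result[label.item()] = weighted_cov(X[mask], w)
    return result
```

Each matrix returned by `cluster_covariances` could then be passed to path analysis software as the input covariance matrix for that cluster. In this sketch, rescaling leaves the weighted covariance itself unchanged; what changes is the weight total, which matters for software that treats the sum of the weights as the sample size when computing standard errors.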

References

  • Cohen, J., Cohen, P.: Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. Lawrence Erlbaum Associates, Hillsdale (1983)

  • Crockett, L.J., Raffaelli, M., Shen, Y.-L.: Linking self-regulation and risk proneness to risky sexual behavior: pathways through peer pressure and early substance use. J. Res. Adolesc. 16, 503–525 (2006)

  • Green, B.F.: Parameter sensitivity in multivariate methods. Multivar. Behav. Res. 12, 263–287 (1977)

  • Hahs-Vaughn, D.L., Lomax, R.G.: Utilization of sample weights in single-level structural equation modeling. J. Exp. Educ. 74, 163–190 (2006)

  • Horowitz, J.L., Manski, C.F.: Censoring of outcomes and regressors due to survey nonresponse: identification and estimation using weights and imputations. J. Econ. 84, 37–58 (1998)

  • Lang, K., Zagorsky, J.L.: Does growing up with a parent absent really hurt? J. Hum. Resour. 36, 253–273 (2001)

  • Little, R.J.A., Rubin, D.B.: Statistical Analysis with Missing Data. Wiley, New York (1987)

  • MaCurdy, T., Mroz, T., Gritz, R.M.: An evaluation of the national longitudinal survey on youth. J. Hum. Resour. 33, 345–436 (1998)

  • McDonald, R.P.: Path analysis with composite variables. Multivar. Behav. Res. 31, 239–270 (1996)

  • Oh, H.L., Scheuren, F.J.: Weighting adjustment for unit nonresponse. In: Incomplete Data in Sample Surveys, Chap. 3. Academic Press, New York (1983)

  • Ohio State University, Center for Human Resource Research: NLSY 79 Child & Young Adult Data Users Guide. Ohio State University, Center for Human Resource Research, Columbus (2006). Retrieved from http://www.nlsinfo.org/pub/usersvc/Child-Young-Adult/2004ChildYA-DataUsersGuide.pdf

  • Olinsky, A., Chen, S., Harlow, L.: The comparative efficacy of imputation methods for missing data in structural equation modeling. Eur. J. Oper. Res. 151, 53–79 (2003)

  • Olobatuyi, R.: A User’s Guide to Path Analysis. University Press of America, Lanham (2006)

  • Pachter, L.M., Auinger, P., Palmer, R., Weitzman, M.: Do parenting and home environment, maternal depression, neighborhood and chronic poverty affect child behavioral problems differently in different age groups? Pediatrics 117, 1329–1338 (2006)

  • Pearl, J.: Causality: Models, Reasoning and Inference. Cambridge University Press, Cambridge (2000)

  • Rothman, K.J., Greenland, S.: Causation and causal inference in epidemiology. Am. J. Public Health 95, S144–S150 (2005)

  • Rubin, D.B., Olkin, I., Madow, W.G.: Incomplete Data in Sample Surveys. Academic Press, Inc., New York (1983)

  • Schafer, J.L., Olsen, M.K.: Multiple imputation for multivariate missing-data problems: a data analyst’s perspective. Multivar. Behav. Res. 33, 545–571 (1998)

  • Stapleton, L.M.: The incorporation of sample weights into multilevel structural equation models. Struct. Equ. Model. 9, 475–502 (2002)

  • Stapleton, L.M.: An assessment of practical solutions for structural equation modeling with complex data. Struct. Equ. Model. 13, 28–58 (2006)

  • Steinley, D., Brusco, M.J.: A new variable weighting and selection procedure for K-means cluster analysis. Multivar. Behav. Res. 43, 77–108 (2008)

Author information

Corresponding author

Correspondence to Lynn A. Agre.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Agre, L.A., Peterson, N.A., Brady, J. (2015). Controlling for Population Density Using Clustering and Data Weighting Techniques When Examining Social Health and Welfare Problems. In: Chen, DG., Wilson, J. (eds) Innovative Statistical Methods for Public Health Data. ICSA Book Series in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-18536-1_2
