Controlling for Population Density Using Clustering and Data Weighting Techniques When Examining Social Health and Welfare Problems

Agre, Lynn A.; Peterson, N. Andrew; Brady, James

doi:10.1007/978-3-319-18536-1_2

Part of the book series: ICSA Book Series in Statistics (ICSABSS)

Abstract

Clustering techniques partition the units of analysis or study subjects into similar groups by a chosen variable, so that a model can be run on cases with related attributes as a control for sociodemographic differences. Large-scale national surveys often provide a raw weight variable, but when that weight is applied without transformation it yields no change in statistical results, which can lead to spurious conclusions about relationships between predictors and outcomes. For example, most research studies that use components of the National Longitudinal Survey on Youth (NLSY) data sets to test hypotheses do not employ a weighting technique or post-stratification procedure to normalize the sample against the population from which it is drawn. This chapter therefore illustrates how an algebraic weight formula introduced by Oh and Scheuren (Weighting adjustment for unit nonresponse. In: Incomplete Data in Sample Surveys, Chap. 3. Academic Press, New York, 1983) can be used in path analysis to elucidate the relationship between underlying psychosocial mechanisms and health risk behaviors among adolescents in the 1998 NLSY Young Adult cohort. Using the NLSY sample, originally drawn from the US population, the chapter investigates the association between self-assessed risk perception, or risk proneness, and the likelihood that an adolescent engages in substance use and sexual behavior, with the sample separated into clusters by mother’s race/ethnicity and educational attainment. To control for oversampling of underrepresented racial/ethnic groups, mathematically adjusted design weights are then applied in calculating the covariance matrices for each cluster group by race and educational attainment, and non-normalized and normalized path analysis results are compared.
The chapter shows how ignoring weights leads to serious bias in parameter estimates and to underestimation of standard errors, illustrating the distinction between weighted and non-weighted data. As an innovative statistical approach, this application uses a weighted case approach, testing the model on discrete cluster samples of youth by race/ethnicity and mother’s educational attainment. Determining public health policy initiatives and objectives requires data that are representative of the population, which is ensured by transforming the weight and applying the weight formula to the sample.
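
As a rough illustration of the weight normalization and per-cluster weighted covariance step described above, the following Python sketch rescales raw weights so that they sum to the number of cases in each cluster and then computes a weighted covariance matrix for each cluster. The rescaling shown (each weight divided by the mean weight) is a common convention assumed here for illustration only; it is not taken from the chapter and may differ from the Oh and Scheuren formula. All function names, variable names, and the toy data are hypothetical.

```python
import numpy as np


def normalize_weights(raw_weights):
    """Rescale raw weights so they sum to the number of cases (mean weight 1)."""
    w = np.asarray(raw_weights, dtype=float)
    return w * (w.size / w.sum())


def weighted_covariance(data, weights):
    """Weighted covariance of the columns of `data`, dividing by the sum of weights."""
    x = np.asarray(data, dtype=float)
    w = np.asarray(weights, dtype=float)
    mean = np.average(x, axis=0, weights=w)   # weighted column means
    centered = x - mean
    return (centered * w[:, None]).T @ centered / w.sum()


def covariance_by_cluster(data, raw_weights, clusters):
    """Map each cluster label to its weighted covariance matrix.

    Weights are normalized within each cluster before use (an assumption
    of this sketch, not a documented step of the chapter).
    """
    x = np.asarray(data, dtype=float)
    w = np.asarray(raw_weights, dtype=float)
    labels = np.asarray(clusters)
    result = {}
    for label in np.unique(labels):
        mask = labels == label
        result[label] = weighted_covariance(x[mask], normalize_weights(w[mask]))
    return result


# Hypothetical toy data: two measured variables, raw survey weights, two clusters.
scores = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 5.0],
                   [4.0, 3.0], [5.0, 6.0], [6.0, 4.0]])
raw = np.array([1200.0, 800.0, 400.0, 1500.0, 500.0, 1000.0])
groups = np.array(["a", "a", "a", "b", "b", "b"])

cov_by_group = covariance_by_cluster(scores, raw, groups)
```

In this sketch the covariance estimate itself does not change when the weights are rescaled, because it divides by the sum of the weights. The normalization matters mainly for software that treats the sum of the weights as the sample size when computing standard errors and fit statistics.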

References

Cohen, J., Cohen, P.: Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. Lawrence Erlbaum Associates, Hillsdale (1983)
Crockett, L.J., Raffaelli, M., Shen, Y.-L.: Linking self-regulation and risk proneness to risky sexual behavior: pathways through peer pressure and early substance use. J. Res. Adolesc. 16, 503–525 (2006)
Green, B.F.: Parameter sensitivity in multivariate methods. Multivar. Behav. Res. 12, 263–287 (1977)
Hahs-Vaughn, D.L., Lomax, R.G.: Utilization of sample weights in single-level structural equation modeling. J. Exp. Educ. 74, 163–190 (2006)
Horowitz, J.L., Manski, C.F.: Censoring of outcomes and regressors due to survey nonresponse: identification and estimation using weights and imputations. J. Econ. 84, 37–58 (1998)
Lang, K., Zagorsky, J.L.: Does growing up with a parent absent really hurt? J. Hum. Resour. 36, 253–273 (2001)
Little, R.J.A., Rubin, D.B.: Statistical Analysis with Missing Data. Wiley, New York (1987)
MaCurdy, T., Mroz, T., Gritz, R.M.: An evaluation of the National Longitudinal Survey on Youth. J. Hum. Resour. 33, 345–436 (1998)
McDonald, R.P.: Path analysis with composite variables. Multivar. Behav. Res. 31, 239–270 (1996)
Oh, H.L., Scheuren, F.J.: Weighting adjustment for unit nonresponse. In: Incomplete Data in Sample Surveys, Chap. 3. Academic Press, New York (1983)
Ohio State University, Center for Human Resource Research: NLSY 79 Child & Young Adult Data Users Guide. Ohio State University, Center for Human Resource Research, Columbus (2006). Retrieved from http://www.nlsinfo.org/pub/usersvc/Child-Young-Adult/2004ChildYA-DataUsersGuide.pdf
Olinsky, A., Chen, S., Harlow, L.: The comparative efficacy of imputation methods for missing data in structural equation modeling. Eur. J. Oper. Res. 151, 53–79 (2003)
Olobatuyi, R.: A User’s Guide to Path Analysis. University Press of America, Lanham (2006)
Pachter, L.M., Auinger, P., Palmer, R., Weitzman, M.: Do parenting and home environment, maternal depression, neighborhood and chronic poverty affect child behavioral problems differently in different age groups? Pediatrics 117, 1329–1338 (2006)
Pearl, J.: Causality: Models, Reasoning and Inference. Cambridge University Press, Cambridge (2000)
Rothman, K.J., Greenland, S.: Causation and causal inference in epidemiology. Am. J. Public Health 95, S144–S150 (2005)
Rubin, D.B., Olkin, I., Madow, W.G.: Incomplete Data in Sample Surveys. Academic Press, Inc., New York (1983)
Schafer, J.L., Olsen, M.K.: Multiple imputation for multivariate missing-data problems: a data analyst’s perspective. Multivar. Behav. Res. 33, 545–571 (1998)
Stapleton, L.M.: The incorporation of sample weights into multilevel structural equation models. Struct. Equ. Model. 9, 475–502 (2002)
Stapleton, L.M.: An assessment of practical solutions for structural equation modeling with complex data. Struct. Equ. Model. 13, 28–58 (2006)
Steinley, D., Brusco, M.J.: A new variable weighting and selection procedure for K-means cluster analysis. Multivar. Behav. Res. 43, 77–108 (2008)

Author information

Authors and Affiliations

Department of Statistics and Biostatistics, Rutgers University, 110 Frelinghuysen Road, Hill Center, Busch Campus, New Brunswick, NJ, 08854, USA
Lynn A. Agre
School of Social Work, Rutgers University, New Brunswick, NJ, USA
N. Andrew Peterson
Rutgers University, New Brunswick, NJ, USA
James Brady

Corresponding author

Correspondence to Lynn A. Agre.

Editor information

Editors and Affiliations

Wallace H. Kuralt Distinguished Professor, Director of Statistical Development and Consultation, School of Social Work, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
Ding-Geng (Din) Chen
Arizona State University, Tempe, Arizona, USA
Jeffrey Wilson

Cite this chapter

Agre, L.A., Peterson, N.A., Brady, J. (2015). Controlling for Population Density Using Clustering and Data Weighting Techniques When Examining Social Health and Welfare Problems. In: Chen, DG., Wilson, J. (eds) Innovative Statistical Methods for Public Health Data. ICSA Book Series in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-18536-1_2

DOI: https://doi.org/10.1007/978-3-319-18536-1_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18535-4
Online ISBN: 978-3-319-18536-1
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
