Towards Systematic Methods in an Era of Big Data: Neighborhood Wide Association Studies

Geospatial Approaches to Energy Balance and Breast Cancer

Part of the book series: Energy Balance and Cancer ((EBAC,volume 15))

Abstract

Neighborhood studies face methodologic challenges in variable selection. In the era of “Big Data”, these challenges will only grow as neighborhood data become increasingly complex and integrated with multilevel data. Systematic approaches to variable selection are needed so that neighborhood variables are consistent and comparable across studies. Two such approaches, the neighborhood-wide association study (NWAS) and the neighborhood environment-wide association study (NE-WAS), were recently developed by borrowing concepts from empiric methods in biology. This chapter introduces key concepts of the NWAS/NE-WAS designs, provides criteria for evaluating these systematic approaches, and discusses the potential impact of these empiric methods on future multilevel interventions.

Author information

Corresponding author

Correspondence to Shannon M. Lynch, Ph.D., M.P.H.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Lynch, S.M. (2019). Towards Systematic Methods in an Era of Big Data: Neighborhood Wide Association Studies. In: Berrigan, D., Berger, N. (eds) Geospatial Approaches to Energy Balance and Breast Cancer. Energy Balance and Cancer, vol 15. Springer, Cham. https://doi.org/10.1007/978-3-030-18408-7_5

  • DOI: https://doi.org/10.1007/978-3-030-18408-7_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-18407-0

  • Online ISBN: 978-3-030-18408-7

  • eBook Packages: Medicine, Medicine (R0)
