Organizing and Analyzing the Activity Data in NHANES
The NHANES study contains objectively measured physical activity data collected using hip-worn accelerometers from multiple cohorts. However, using the accelerometry data has proven daunting because (1) currently, there are no agreed-upon standard protocols for data storage and analysis; (2) the data exhibit heterogeneous patterns of missingness due to varying degrees of adherence to wear-time protocols; (3) sampling weights need to be carefully adjusted and accounted for in individual analyses; (4) there is a lack of reproducible software that transforms the data from its published format into analytic form; and (5) the high-dimensional nature of accelerometry data complicates analyses. Here, we provide a framework for processing, storing, and analyzing the NHANES accelerometry data for the 2003–2004 and 2005–2006 surveys. We also provide an NHANES data package in R to help disseminate high-quality, processed activity data combined with mortality and demographic information. Thus, we provide the tools to transition from “available data online” to “easily accessible and usable data”, which substantially reduces the large upfront costs of initiating studies of association between physical activity and human health outcomes using NHANES. We apply these tools in an analysis showing that accelerometry features have the potential to predict 5-year all-cause mortality better than known risk factors such as age, cigarette smoking, and various comorbidities.
Keywords: Accelerometry, Physical activity, NHANES, Prediction
We would like to thank the CDC, specifically the National Center for Health Statistics, for collecting, organizing, and making public this unique data resource. We would also like to thank them for the permission to repost the publicly available NHANES and NDI data in analytic format. Also, we would like to thank the thousands of anonymous participants in the NHANES, whose data led to the exciting findings in this paper.
This research was supported by the National Heart, Lung, and Blood Institute (R01 HL123407), the National Institute of Neurological Disorders and Stroke (R01 NS060910), and a National Institute on Aging Training Grant (T32 AG000247).