Abstract
As demand grows for collecting and analyzing large volumes of data from different sources, there is a need for computational models and methods that integrate data sets so that a phenomenon can be understood from a broad, comprehensive systems perspective. Two features are commonly observed in large and complex systems. First, a system is made up of multiple subsystems. Second, the data are fragmented. The methodological challenge is to reconcile the potential parametric inconsistency across individually calibrated subsystems. This study explores a novel approach, called the system-subsystem dependency network, which is capable of integrating subsystems that may have been individually calibrated using separate data sets. From a health informatics perspective, the method can be seen as a way to integrate heterogeneous data sources, especially relatively well-structured clinical study data. In this paper, we compare several techniques for addressing this methodological challenge. We also use data from a large-scale epidemiologic study and from two large clinical trials to illustrate how the inconsistency of overlapping subsystems can be resolved and how the data sets can be integrated.
Acknowledgements
The study is supported by NIH grants 1R21AG042761-01 and 1U01HL101066-01 (PI: Ip). We also acknowledge support from the Wake Forest Pepper Center grant P30AG021332 for Drs. Ip and Rejeski. We also thank Drs. Stephen Kritchevsky, Mike Miller, and Bob Byington for their consultation on study data-related issues.
Cite this article
Ip, E.H., Chen, SH. & Rejeski, W.J. System-Subsystem Dependency Network for Integrating Multicomponent Data and Its Application to Health Sciences. J Healthc Inform Res 1, 139–156 (2017). https://doi.org/10.1007/s41666-017-0006-5
DOI: https://doi.org/10.1007/s41666-017-0006-5