Abstract
Physiological and genetic information has been critical to the successful diagnosis and prognosis of complex diseases. In this paper, we introduce a support-confidence-correlation framework for discovering meaningful and interesting association rules between complex physiological and genetic data, with application to factor analysis of diseases such as type II diabetes mellitus (T2DM). We propose a novel Multivariate and Multidimensional Association Rule mining system based on Change Detection (MMARCD). Given a complex data set u_i (e.g. u_1 numerical data streams, u_2 images, u_3 videos, u_4 DNA/RNA sequences) observed at each time tick t, MMARCD incrementally finds correlations and hidden variables that summarise the key relationships across the entire system. Based on MMARCD, we are able to construct a correlation network for human diseases.
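To make the three measures in a support-confidence-correlation framework concrete, the following is a minimal sketch in Python. It is not the MMARCD implementation: the abstract does not say which correlation measure the authors use, so lift stands in for it here as an assumption, and the function names (`support`, `rule_metrics`) and the toy patient records are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def rule_metrics(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> Dict[str, float]:
    """Support, confidence and lift for the rule antecedent -> consequent.

    Lift is used as the correlation measure purely for illustration;
    the paper may use a different one.
    """
    a = frozenset(antecedent)
    c = frozenset(consequent)
    s_a = support(transactions, a)
    s_c = support(transactions, c)
    s_ac = support(transactions, a | c)
    confidence = s_ac / s_a if s_a else 0.0
    lift = confidence / s_c if s_c else 0.0
    return {"support": s_ac, "confidence": confidence, "lift": lift}


# Hypothetical records: each set mixes physiological and genetic observations.
records = [
    frozenset({"high_glucose", "variant_A", "T2DM"}),
    frozenset({"high_glucose", "T2DM"}),
    frozenset({"variant_A"}),
    frozenset({"high_glucose", "variant_A", "T2DM"}),
    frozenset({"normal_glucose"}),
]

metrics = rule_metrics(records, {"high_glucose", "variant_A"}, {"T2DM"})
print(metrics)
```

A lift above 1 indicates that the antecedent and consequent co-occur more often than they would if independent, which is the kind of check a correlation term adds on top of support and confidence alone.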
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
He, J. et al. (2012). An Association Rule Analysis Framework for Complex Physiological and Genetic Data. In: He, J., Liu, X., Krupinski, E.A., Xu, G. (eds) Health Information Science. HIS 2012. Lecture Notes in Computer Science, vol 7231. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29361-0_17
DOI: https://doi.org/10.1007/978-3-642-29361-0_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29360-3
Online ISBN: 978-3-642-29361-0
eBook Packages: Computer Science, Computer Science (R0)