Abstract
High-throughput genomic profiling technology provides detailed information about biological systems. However, it also increases the dimensionality of the data, which makes it harder to identify key features and their relations to other features hidden in the feature space. In this paper we propose a new approach based on structure learning for Gaussian Markov random fields, which provides an efficient way to represent a feature space as a collection of small graphs, where nodes represent features and edges represent conditional dependencies between features. In our approach a collection of small graphs is created for each subgroup of a cohort, and our interest lies in finding characteristic patterns in each subgroup's graphs compared to those of the other subgroups. We propose a simple but effective method using polarized adjacency matrices to find topological differences between collections of graphs.
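The pipeline sketched in the abstract can be illustrated in a few lines. The sketch below is not the paper's implementation: it replaces sparse inverse covariance (graphical lasso) estimation with a simpler stand-in — inverting a ridge-regularized sample covariance and thresholding the resulting partial correlations — and it assumes one natural reading of "polarized adjacency matrix", namely the signed difference of two subgroup adjacency matrices, where +1 marks an edge present only in subgroup 1 and -1 an edge present only in subgroup 2. The threshold `tau` and helper names are illustrative choices, not from the paper.

```python
import numpy as np

def subgroup_graph(X, tau=0.1):
    """Estimate a conditional-dependency graph for one subgroup.

    Simplified stand-in for sparse inverse covariance estimation:
    invert the regularized sample covariance, convert to partial
    correlations, and threshold their magnitudes to get edges.
    """
    S = np.cov(X, rowvar=False) + 1e-3 * np.eye(X.shape[1])  # ridge for stability
    P = np.linalg.inv(S)                                     # precision matrix
    d = np.sqrt(np.diag(P))
    partial = -P / np.outer(d, d)      # partial correlations from precision
    np.fill_diagonal(partial, 0.0)
    return (np.abs(partial) > tau).astype(int)               # adjacency matrix

def polarized_difference(A1, A2):
    """+1: edge only in subgroup 1, -1: edge only in subgroup 2, 0: shared or absent."""
    return A1 - A2

# Toy cohort: in subgroup 1 features 0 and 1 share a latent driver,
# so they are conditionally dependent; subgroup 2 has no structure.
rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=(n, 1))
X1 = np.hstack([z + 0.1 * rng.normal(size=(n, 1)),
                z + 0.1 * rng.normal(size=(n, 1)),
                rng.normal(size=(n, 1))])
X2 = rng.normal(size=(n, 3))

D = polarized_difference(subgroup_graph(X1), subgroup_graph(X2))
```

Here `D[0, 1]` flags the 0–1 edge as characteristic of subgroup 1, which is the kind of topological difference the polarized-matrix comparison is meant to surface.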
© 2014 Springer International Publishing Switzerland
Cite this paper
Lee, S. (2014). Characterization of Subgroup Patterns from Graphical Representation of Genomic Data. In: Ślȩzak, D., Tan, A.H., Peters, J.F., Schwabe, L. (eds) Brain Informatics and Health. BIH 2014. Lecture Notes in Computer Science, vol 8609. Springer, Cham. https://doi.org/10.1007/978-3-319-09891-3_47
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09890-6
Online ISBN: 978-3-319-09891-3