Abstract
High-throughput genomic profiling technology provides detailed information about biological systems. However, it also increases the dimensionality of the data, which makes it harder to identify key features and their relations to other features hidden in the feature space. In this paper we propose a new approach based on structure learning for Gaussian Markov random fields, which provides an efficient way to represent a feature space as a collection of small graphs, where nodes represent features and edges represent conditional dependencies between features. In our approach a collection of small graphs is created for each subgroup of a cohort, and our interest lies in finding characteristic patterns in each subgroup's graphs compared to those of the other subgroups. We propose a simple but effective method using polarized adjacency matrices to find topological differences between collections of graphs.
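The pipeline sketched in the abstract can be illustrated in a few lines. The sketch below is not the paper's implementation: it replaces sparse inverse covariance (graphical lasso) estimation with a simpler stand-in — inverting a ridge-regularized sample covariance and thresholding the resulting partial correlations — and it assumes one natural reading of "polarized adjacency matrix", namely the signed difference of two subgroup adjacency matrices, where +1 marks an edge present only in subgroup 1 and -1 an edge present only in subgroup 2. The threshold `tau` and helper names are illustrative choices, not from the paper.

```python
import numpy as np

def subgroup_graph(X, tau=0.1):
    """Estimate a conditional-dependency graph for one subgroup.

    Simplified stand-in for sparse inverse covariance estimation:
    invert the regularized sample covariance, convert to partial
    correlations, and threshold their magnitudes to get edges.
    """
    S = np.cov(X, rowvar=False) + 1e-3 * np.eye(X.shape[1])  # ridge for stability
    P = np.linalg.inv(S)                                     # precision matrix
    d = np.sqrt(np.diag(P))
    partial = -P / np.outer(d, d)      # partial correlations from precision
    np.fill_diagonal(partial, 0.0)
    return (np.abs(partial) > tau).astype(int)               # adjacency matrix

def polarized_difference(A1, A2):
    """+1: edge only in subgroup 1, -1: edge only in subgroup 2, 0: shared or absent."""
    return A1 - A2

# Toy cohort: in subgroup 1 features 0 and 1 share a latent driver,
# so they are conditionally dependent; subgroup 2 has no structure.
rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=(n, 1))
X1 = np.hstack([z + 0.1 * rng.normal(size=(n, 1)),
                z + 0.1 * rng.normal(size=(n, 1)),
                rng.normal(size=(n, 1))])
X2 = rng.normal(size=(n, 3))

D = polarized_difference(subgroup_graph(X1), subgroup_graph(X2))
```

Here `D[0, 1]` flags the 0–1 edge as characteristic of subgroup 1, which is the kind of topological difference the polarized-matrix comparison is meant to surface.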
© 2014 Springer International Publishing Switzerland
Cite this paper
Lee, S. (2014). Characterization of Subgroup Patterns from Graphical Representation of Genomic Data. In: Ślȩzak, D., Tan, A.H., Peters, J.F., Schwabe, L. (eds) Brain Informatics and Health. BIH 2014. Lecture Notes in Computer Science, vol 8609. Springer, Cham. https://doi.org/10.1007/978-3-319-09891-3_47
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09890-6
Online ISBN: 978-3-319-09891-3