
Characterization of Subgroup Patterns from Graphical Representation of Genomic Data

  • Conference paper
Brain Informatics and Health (BIH 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8609)


Abstract

High-throughput genomic profiling technology provides detailed information about biological systems. However, it also increases the dimensionality of the data, making it harder to identify key features and their relations to other features hidden in the feature space. In this paper we propose a new idea based on structure learning for Gaussian Markov random fields, which provides an efficient way to represent a feature space as a collection of small graphs, where nodes represent features and edges represent conditional dependencies between features. In our approach a collection of small graphs is created for each subgroup of a cohort, and our interest lies in finding characteristic patterns in each subgroup graph compared to the other subgroup graphs. A simple but effective method using polarized adjacency matrices is proposed to find topological differences between collections of graphs.
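
The polarized-adjacency comparison is detailed in the paper itself; as a rough, non-authoritative sketch of the general workflow the abstract suggests, the Python snippet below estimates a sparse precision matrix per subgroup with the graphical lasso (one common structure-learning method for Gaussian Markov random fields) and compares the resulting signed adjacency matrices across subgroups. The regularization value alpha, the polarized_adjacency and characteristic_edges helpers, and the edge-comparison rule are illustrative assumptions, not the authors' exact procedure.

    import numpy as np
    from sklearn.covariance import GraphicalLasso  # sparse inverse-covariance (GMRF) estimation

    def polarized_adjacency(X, alpha=0.1):
        """Signed adjacency matrix for one subgroup (entries in {-1, 0, +1}).

        X: (n_samples, n_features) expression matrix for that subgroup.
        A non-zero entry (i, j) indicates an estimated conditional dependency
        between features i and j; its sign is the sign of the partial correlation.
        """
        precision = GraphicalLasso(alpha=alpha).fit(X).precision_
        adj = -np.sign(precision)      # partial correlation has the opposite sign of the precision entry
        np.fill_diagonal(adj, 0)       # no self-loops
        return adj

    def characteristic_edges(target_adj, other_adjs):
        """Edges present in the target subgroup's graph whose signed value
        differs from every other subgroup's graph (a hypothetical comparison rule)."""
        mask = target_adj != 0
        for adj in other_adjs:
            mask &= (adj != target_adj)
        return np.argwhere(np.triu(mask, k=1))   # upper triangle -> each edge reported once

In this reading, an edge is "characteristic" of a subgroup when its polarized (signed) entry appears in that subgroup's matrix but in none of the others, which is one plausible way to surface topological differences between the subgroup graphs.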




Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Lee, S. (2014). Characterization of Subgroup Patterns from Graphical Representation of Genomic Data. In: Ślȩzak, D., Tan, AH., Peters, J.F., Schwabe, L. (eds) Brain Informatics and Health. BIH 2014. Lecture Notes in Computer Science(), vol 8609. Springer, Cham. https://doi.org/10.1007/978-3-319-09891-3_47

  • DOI: https://doi.org/10.1007/978-3-319-09891-3_47

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09890-6

  • Online ISBN: 978-3-319-09891-3

  • eBook Packages: Computer Science (R0)
