Classification of Autism Genes Using Network Science and Linear Genetic Programming

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12101)

Abstract

Understanding the genetic background of complex diseases and disorders plays an essential role in realizing the promise of precision medicine. Deciphering which genes are associated with a specific disease or disorder helps to better diagnose and treat it, and may even help prevent it if the association is predicted accurately and acted on effectively at an early stage. The evaluation of candidate disease-associated genes, however, requires time-consuming and expensive experiments given the large number of possibilities. Because of these challenges, computational methods are increasingly applied to predicting gene-disease associations. Given the intertwined relationships of molecules in human cells, genes and their products can be considered to form a complex molecular interaction network. Such a network can be used to find candidate genes that share similar network properties with known disease-associated genes. In this research, we investigate autism spectrum disorders and propose a linear genetic programming (LGP) algorithm for autism gene prediction, using a human molecular interaction network and known autism genes for training. We select an initial set of network properties as features, and our LGP algorithm is able to find the most relevant features while evolving accurate predictive models. Our research demonstrates the powerful and flexible learning abilities of genetic programming (GP) in tackling a significant biomedical problem, and is expected to inspire further exploration of wider GP applications.
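The general idea in the abstract (network properties of a gene used as input features to a linear genetic program) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the toy network and gene names are hypothetical, the two features (degree and local clustering coefficient) are common network properties and not necessarily those used in the paper, and the instruction set, register layout, and threshold are invented for the example. The evolutionary search itself is omitted; only feature extraction and the execution of one hand-written linear program are shown.

```python
# Toy interaction network: hypothetical gene names, undirected edges.
EDGES = [("G1", "G2"), ("G1", "G3"), ("G2", "G3"), ("G3", "G4"), ("G4", "G5")]


def build_adjacency(edges):
    """Build a node -> set-of-neighbours map from an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def node_features(adj, node):
    """Return [degree, local clustering coefficient] for one node."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return [float(k), 0.0]
    # Count edges that connect two neighbours of `node` (each pair once).
    links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
    return [float(k), 2.0 * links / (k * (k - 1))]


# Assumed instruction set for the sketch.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}


def run_program(program, features, n_calc=2):
    """Execute a linear program of (op, dest, src1, src2) instructions.

    Registers 0..n_calc-1 are calculation registers initialised to 0.
    The remaining registers hold the input features.
    Register 0 is read as the program output.
    """
    regs = [0.0] * n_calc + list(features)
    for op, dest, s1, s2 in program:
        regs[dest] = OPS[op](regs[s1], regs[s2])
    return regs[0]


def classify(program, features, threshold=0.0):
    """Predict the positive class when the program output exceeds threshold."""
    return run_program(program, features) > threshold


# Hand-written program (not evolved): r0 = degree - degree * clustering.
PROGRAM = [
    ("mul", 1, 2, 3),  # r1 = r2 (degree) * r3 (clustering)
    ("sub", 0, 2, 1),  # r0 = r2 (degree) - r1
]

ADJ = build_adjacency(EDGES)
PREDICTIONS = {
    gene: classify(PROGRAM, node_features(ADJ, gene), threshold=1.5)
    for gene in sorted(ADJ)
}
```

In an actual LGP run, a population of such instruction lists would be mutated and recombined, with fitness measured on genes of known status. A feature whose register is never read by an instruction that influences the output has no effect on the prediction, which is one way a linear program can be said to select features implicitly.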

Acknowledgments

This research was supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada Discovery Grant RGPIN-2016-04699 to TH.

Author information

Correspondence to Ting Hu.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, Y., Chen, Y., Hu, T. (2020). Classification of Autism Genes Using Network Science and Linear Genetic Programming. In: Hu, T., Lourenço, N., Medvet, E., Divina, F. (eds) Genetic Programming. EuroGP 2020. Lecture Notes in Computer Science, vol 12101. Springer, Cham. https://doi.org/10.1007/978-3-030-44094-7_18

  • DOI: https://doi.org/10.1007/978-3-030-44094-7_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-44093-0

  • Online ISBN: 978-3-030-44094-7

  • eBook Packages: Computer Science, Computer Science (R0)
