Towards a Graph-Theoretic Approach to Hybrid Performance Prediction from Large-Scale Phenotypic Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9303)

Abstract

High-throughput biological data analysis has attracted considerable interest over the last decade, driven by pioneering technologies that automatically generate large-scale datasets by performing millions of analytical tests every day. Here we present a new network-based approach to analyzing a high-throughput phenomic dataset collected on maize inbreds and hybrids by an automated phenotyping facility. The dataset consists of 1600 biological samples from 600 genotypes (200 inbred and 400 hybrid lines). On each sample, 141 phenotypic traits were observed for 33 days. We apply a graph-theoretic approach to two important problems: (i) discovering meaningful patterns in the dataset and (ii) predicting hybrid performance, measured as biomass, from automatically collected phenotypic traits. We propose a modelling framework in which the prediction problem is transformed into finding the shortest path in a correlation-based network. Preliminary results show small but encouraging correlations between predicted and observed biomass. Extensions of the algorithm and applications of the modelling framework to other types of biological data are discussed.
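
The abstract does not specify how the correlation-based network is built or how a path is turned into a biomass prediction. The following is therefore only a minimal sketch of the general idea, not the authors' method. It assumes that nodes are traits, that two traits are connected when their absolute Pearson correlation reaches a hypothetical `threshold`, and that edge length is `1 - |r|`, so that strongly correlated traits are close together. The trait names and values are invented toy data, and the function names are illustrative.

```python
import heapq
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def correlation_network(traits, threshold=0.5):
    """Undirected graph over traits; an edge joins traits with |r| >= threshold.

    Assumption: edge weight is 1 - |r| (clamped at 0 to absorb rounding),
    so strong correlations correspond to short edges.
    """
    names = list(traits)
    graph = {name: {} for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(traits[a], traits[b])
            if abs(r) >= threshold:
                graph[a][b] = graph[b][a] = max(0.0, 1.0 - abs(r))
    return graph


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns (distance, path), or (inf, []) if unreachable."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == target:
            break
        for nbr, w in graph[node].items():
            nd = d + w
            if nd < dist.get(nbr, math.inf):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if target not in dist:
        return math.inf, []
    # Walk the predecessor links back from the target to the source.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]


# Invented toy measurements: one value per sample for each trait.
toy_traits = {
    "height": [1, 2, 3, 4, 5],
    "leaf_area": [2, 4, 6, 8, 10],
    "biomass": [1, 3, 2, 5, 4],
    "color": [3, 1, 4, 1, 5],
}
network = correlation_network(toy_traits, threshold=0.5)
distance, path = shortest_path(network, "height", "biomass")
print(distance, path)
```

With this weighting, a short path from an early-observed trait to biomass indicates a chain of strongly correlated traits. Other choices, such as `-log|r|` as edge length or a different threshold, would be equally plausible under the same assumptions.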

Alberto Castellini and Christian Edlich-Muth contributed equally to this work.

Author information

Corresponding author

Correspondence to Alberto Castellini.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Castellini, A., Edlich-Muth, C., Muraya, M., Klukas, C., Altmann, T., Selbig, J. (2015). Towards a Graph-Theoretic Approach to Hybrid Performance Prediction from Large-Scale Phenotypic Data. In: Lones, M., Tyrrell, A., Smith, S., Fogel, G. (eds) Information Processing in Cells and Tissues. IPCAT 2015. Lecture Notes in Computer Science, vol 9303. Springer, Cham. https://doi.org/10.1007/978-3-319-23108-2_15

  • DOI: https://doi.org/10.1007/978-3-319-23108-2_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23107-5

  • Online ISBN: 978-3-319-23108-2

  • eBook Packages: Computer Science, Computer Science (R0)
