Compressive Sensing and Hierarchical Clustering for Microarray Data with Missing Values

  • Conference paper
Computational Intelligence Methods for Bioinformatics and Biostatistics (CIBB 2018)

Abstract

Gene expression microarray measurements commonly contain multiple missing expression values, and handling them properly is a critical task. To address this issue, this paper proposes a novel methodology, based on a compressive sensing mechanism, for analyzing gene expression data on the basis of the topological characteristics of gene expression time series. Once the data are recovered, the approach processes them with a non-linear PCA for dimensionality reduction and a Hierarchical Clustering Algorithm for agglomeration and visualization. Experiments were performed on the yeast Saccharomyces cerevisiae dataset, considering different percentages of information loss. The approach shows robust performance when a high percentage of information is lost and when few samples are available.
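
The recovery step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which sparsifying basis or solver is used, so the DCT basis, the Orthogonal Matching Pursuit solver, the synthetic signal, and all names (dct_basis, omp, recover_missing) are assumptions chosen only for illustration. The idea shown is that an expression time series with missing samples may be recoverable if it is sparse in some known basis.

```python
import numpy as np


def dct_basis(n):
    """Return an n x n orthonormal DCT-II synthesis matrix (columns are atoms)."""
    t = np.arange(n)[:, None]   # time index (rows)
    k = np.arange(n)[None, :]   # frequency index (columns)
    psi = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    psi[:, 0] /= np.sqrt(2.0)   # rescale the constant atom to unit norm
    return psi


def omp(A, y, max_atoms, tol=1e-6):
    """Greedy sparse solve of A @ s ~= y by Orthogonal Matching Pursuit."""
    norms = np.linalg.norm(A, axis=0)
    norms[norms == 0] = 1.0     # avoid division by zero for empty columns
    support = []
    coef = np.zeros(0)
    residual = y.copy()
    target = tol * np.linalg.norm(y)
    while len(support) < max_atoms and np.linalg.norm(residual) > target:
        corr = np.abs(A.T @ residual) / norms
        if support:
            corr[support] = 0.0  # never pick the same atom twice
        support.append(int(np.argmax(corr)))
        # Refit all selected atoms jointly (the "orthogonal" step of OMP).
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    s = np.zeros(A.shape[1])
    s[np.array(support, dtype=int)] = coef
    return s


def recover_missing(x_obs, observed, max_atoms=None):
    """Estimate a full series from the entries flagged in the boolean mask."""
    psi = dct_basis(x_obs.size)
    A = psi[observed, :]        # keep only the rows that were measured
    y = x_obs[observed]
    if max_atoms is None:
        max_atoms = int(observed.sum())
    return psi @ omp(A, y, max_atoms)


# Synthetic example (assumed data): a series that is 3-sparse in the DCT basis.
rng = np.random.default_rng(0)
n = 64
s_true = np.zeros(n)
s_true[[2, 7, 15]] = [3.0, -2.0, 1.5]
x_true = dct_basis(n) @ s_true

observed = np.ones(n, dtype=bool)
observed[rng.choice(n, size=24, replace=False)] = False  # drop 24 of 64 samples
x_obs = np.where(observed, x_true, np.nan)

x_hat = recover_missing(x_obs, observed)
print("max abs error on missing entries:",
      np.max(np.abs(x_hat[~observed] - x_true[~observed])))
```

In this sketch the solver stops once the measured samples are reproduced to within a relative tolerance. How well the missing entries are estimated depends on how sparse the series actually is in the chosen basis and on how many samples survive, so the printed error should be read as a diagnostic for this synthetic case only.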

Notes

  1. http://www.mas.ncl.ac.uk/~ntmwn/compare2trees/.

Acknowledgments

The research was carried out while Davide Nardone was an M.Sc. student in Applied Computer Science at the University of Naples Parthenope.

This work was partially funded by the University of Naples Parthenope (Sostegno alla ricerca individuale per il triennio 2016–2018 project) and supported by the Gruppo Nazionale per il Calcolo Scientifico (GNCS-INdAM).

Author information

Corresponding author

Correspondence to Antonino Staiano.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Ciaramella, A., Nardone, D., Staiano, A. (2020). Compressive Sensing and Hierarchical Clustering for Microarray Data with Missing Values. In: Raposo, M., Ribeiro, P., Sério, S., Staiano, A., Ciaramella, A. (eds) Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2018. Lecture Notes in Computer Science, vol 11925. Springer, Cham. https://doi.org/10.1007/978-3-030-34585-3_1

  • DOI: https://doi.org/10.1007/978-3-030-34585-3_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34584-6

  • Online ISBN: 978-3-030-34585-3

  • eBook Packages: Computer Science, Computer Science (R0)
