Applications of Multilevel Thresholding Algorithms to Transcriptomics Data

  • Luis Rueda
  • Iman Rezaeian
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7042)


Microarrays are one method for analyzing the expression levels of many genes in parallel. Errors in the early stages of the analysis propagate to later stages and can lead to erroneous biological conclusions, so finding the correct location of the spots in the images is extremely important for the subsequent steps, which include segmentation, quantification, normalization and clustering. On the other hand, genome-wide profiling of DNA-binding proteins using ChIP-seq and RNA-seq has emerged as an alternative to ChIP-chip methods. Owing to the large amounts of data produced by next-generation sequencing technology, ChIP-seq and RNA-seq offer much higher resolution, less noise and greater coverage than their predecessor, the ChIP-chip array.

Multilevel thresholding algorithms have been applied to many problems in image and signal processing. We show that these algorithms can be used for transcriptomics and genomics data analysis, such as sub-grid and spot detection in DNA microarrays, and for detecting significant regions in next-generation sequencing data. We discuss the advantages and disadvantages of multilevel thresholding and other algorithms in these two applications, and give an overview of the numerical and visual results, based on previously published data, that validate the thresholding methods.
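As a rough illustration of what multilevel thresholding computes, the sketch below splits a one-dimensional histogram into k+1 contiguous classes by dynamic programming. It assumes a within-class variance criterion (equivalent to the Otsu between-class criterion); the function name `multilevel_thresholds` and this choice of criterion are illustrative assumptions and are not taken from the chapter, whose algorithms may differ.

```python
from typing import List, Sequence


def multilevel_thresholds(hist: Sequence[float], k: int) -> List[int]:
    """Return k thresholds that split bins 0..n-1 into k+1 contiguous classes.

    Hypothetical helper: minimises the total within-class sum of squared
    deviations by dynamic programming. A threshold t means that bins <= t
    belong to the class on the left of the cut.
    """
    n = len(hist)
    if not 0 < k < n:
        raise ValueError("need 0 < k < number of bins")

    # Prefix sums of weight, weight*x and weight*x^2 give O(1) class costs.
    w = [0.0] * (n + 1)
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, h in enumerate(hist):
        w[i + 1] = w[i] + h
        s1[i + 1] = s1[i] + h * i
        s2[i + 1] = s2[i] + h * i * i

    def cost(a: int, b: int) -> float:
        """Weighted squared deviation of bins a..b-1 about their mean."""
        weight = w[b] - w[a]
        if weight <= 0:
            return 0.0
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / weight

    inf = float("inf")
    classes = k + 1
    # best[c][j]: minimal cost of splitting bins 0..j-1 into c classes.
    best = [[inf] * (n + 1) for _ in range(classes + 1)]
    cut = [[0] * (n + 1) for _ in range(classes + 1)]
    best[0][0] = 0.0
    for c in range(1, classes + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = best[c - 1][i] + cost(i, j)
                if candidate < best[c][j]:
                    best[c][j] = candidate
                    cut[c][j] = i

    # Walk back through the stored cut points to recover the thresholds.
    thresholds = []
    j = n
    for c in range(classes, 1, -1):
        j = cut[c][j]
        thresholds.append(j - 1)
    return sorted(thresholds)
```

The input histogram could be, for instance, a profile of summed pixel intensities along one image axis or of read counts along a genomic interval; both readings are assumptions made here for illustration. This version runs in O(k n²) time, which is adequate for a few hundred bins but not tuned for large inputs.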


Keywords: microarray image gridding, image analysis, multilevel thresholding, transcriptomics



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Luis Rueda (1)
  • Iman Rezaeian (1)
  1. School of Computer Science, University of Windsor, Windsor, Canada
