BIOTHINGS: A Pipeline Creation Tool for PAR-CLIP Sequence Analysis

  • Oier Echaniz
  • Manuel Graña
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11486)


Bioinformatics pipelines for the analysis of amino acid sequences are tricky to build. It is not easy to match the inputs and outputs of stand-alone applications, some of which were developed for quite different kinds of sequences. In this paper we propose a tool for the guided and safe composition of pipelines that process a specific kind of sequence. The tool can easily be extended to more general bioinformatics settings. Cross-linking immunoprecipitation coupled with high-throughput sequencing (CLIP-seq) was recently developed to uncover RNA-protein interactions genome-wide. Specifically, Photoactivatable-Ribonucleoside-Enhanced CLIP (PAR-CLIP) has been proposed to achieve single-nucleotide resolution. A critical step in the analysis of PAR-CLIP sequences is peak calling. Methods specific to PAR-CLIP use probabilistic models based on its substitution properties, allowing for a more accurate detection of RNA-protein interaction sites. The pipeline construction tool proposed here can be used for a systematic comparison of the effect of the choice of peak calling method.
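
The idea of matching the inputs and outputs of stand-alone applications can be illustrated with a minimal sketch. It is not the BIOTHINGS implementation or its API: the `Step` class, the `compose` function, the step names, and the file formats below are all hypothetical and chosen only for illustration. Each step declares the format it reads and the format it writes, and composition is refused when two adjacent steps do not agree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One stand-alone tool, described only by the formats it reads and writes."""
    name: str
    consumes: str  # input format, e.g. "fastq" (illustrative)
    produces: str  # output format, e.g. "bam" (illustrative)


def compose(steps):
    """Return the step names in order, or raise ValueError on a format mismatch."""
    for upstream, downstream in zip(steps, steps[1:]):
        if upstream.produces != downstream.consumes:
            raise ValueError(
                f"{upstream.name} produces {upstream.produces!r} "
                f"but {downstream.name} expects {downstream.consumes!r}"
            )
    return [step.name for step in steps]


# Hypothetical step descriptions; names and formats are assumptions.
trim = Step("trim_reads", consumes="fastq", produces="fastq")
align = Step("align_reads", consumes="fastq", produces="bam")
call_a = Step("peak_caller_a", consumes="bam", produces="bed")
call_b = Step("peak_caller_b", consumes="bam", produces="bed")

# Swapping only the last step yields two pipelines that differ in the
# peak caller alone, which is the kind of comparison the abstract describes.
pipeline_a = compose([trim, align, call_a])
pipeline_b = compose([trim, align, call_b])
```

Under these assumptions, an attempt such as `compose([trim, call_a])` raises a `ValueError` before anything is run, because the trimming step writes a format the peak caller does not read.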



This work has been partially supported by FEDER funds through MINECO project TIN2017-85827-P.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Grupo de Inteligencia Computacional (GIC), Universidad del País Vasco (UPV/EHU), San Sebastián, Spain
  2. Asociación de Ciencias de la programación Python San Sebastian (ACPYSS), San Sebastián, Spain
