TrueSight: Self-training Algorithm for Splice Junction Detection Using RNA-seq

  • Yang Li
  • Hong-Mei Li
  • Paul Burns
  • Mark Borodovsky
  • Gene E. Robinson
  • Jian Ma
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7262)

Abstract

RNA-seq has proven to be a powerful technique for transcriptome profiling based on next-generation sequencing (NGS) technologies. However, because NGS reads are short, it is extremely challenging to map RNA-seq reads accurately to splice junctions, a step that is critically important for the analysis of alternative splicing and isoform construction. Several tools have been developed to find splice junctions de novo from RNA-seq data, without the aid of gene annotations [1-3]. However, the sensitivity and specificity of these tools need to be improved. In this paper, we describe a novel method, called TrueSight, that combines information from (i) RNA-seq read mapping quality and (ii) coding potential of the reference genome sequence into a unified model that uses semi-supervised learning to precisely identify splice junctions.
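The abstract does not give the details of the model, so the following is only a generic sketch of what a self-training (semi-supervised) loop over candidate junctions could look like, not the TrueSight implementation. It assumes each candidate junction is summarised by two hypothetical scores in [0, 1], one for read mapping quality and one for coding potential, and that a small set of confidently labelled seed junctions is available. The logistic regression classifier, the function names, the toy data, and the 0.9/0.1 confidence thresholds are all illustrative assumptions.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, x):
    # weights = [bias, w_mapping, w_coding]; returns P(true junction | x).
    return sigmoid(weights[0] + sum(w * xi for w, xi in zip(weights[1:], x)))


def fit_logistic(xs, ys, epochs=2000, lr=1.0):
    # Plain batch gradient descent on the logistic loss (illustrative only).
    dim = len(xs[0])
    weights = [0.0] * (dim + 1)
    n = len(xs)
    for _ in range(epochs):
        grad = [0.0] * (dim + 1)
        for x, y in zip(xs, ys):
            err = predict(weights, x) - y
            grad[0] += err
            for j in range(dim):
                grad[j + 1] += err * x[j]
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights


def self_train(features, seed_labels, rounds=10, hi=0.9, lo=0.1):
    # features: list of (mapping_score, coding_score), both hypothetical.
    # seed_labels: dict mapping candidate index -> 1 (true) or 0 (false).
    labels = dict(seed_labels)
    for _ in range(rounds):
        idx = sorted(labels)
        weights = fit_logistic([features[i] for i in idx],
                               [labels[i] for i in idx])
        newly = {}
        for i, x in enumerate(features):
            if i in labels:
                continue
            p = predict(weights, x)
            if p >= hi:
                newly[i] = 1      # confident positive: adopt as a label
            elif p <= lo:
                newly[i] = 0      # confident negative: adopt as a label
        if not newly:
            break                 # no confident predictions left
        labels.update(newly)
    # Final fit on every label collected so far.
    idx = sorted(labels)
    weights = fit_logistic([features[i] for i in idx],
                           [labels[i] for i in idx])
    return weights, labels


# Toy data (made up): four seed junctions and three unlabelled candidates.
candidates = [
    (0.95, 0.90), (0.90, 0.85),   # seeds treated as true junctions
    (0.10, 0.05), (0.05, 0.15),   # seeds treated as false junctions
    (0.85, 0.80), (0.15, 0.10),   # unlabelled, clear-cut
    (0.50, 0.45),                 # unlabelled, ambiguous
]
seeds = {0: 1, 1: 1, 2: 0, 3: 0}
weights, labels = self_train(candidates, seeds)
```

In this sketch the classifier is refitted after each round of newly adopted labels, and candidates whose predicted probability stays between the two thresholds are left unlabelled rather than forced into a class.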

References

  1. Trapnell, C., Pachter, L., Salzberg, S.L.: TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25(9), 1105–1111 (2009)
  2. Wang, K., et al.: MapSplice: accurate mapping of RNA-seq reads for splice junction discovery. Nucleic Acids Res. 38(18), e178 (2010)
  3. Au, K.F., et al.: Detection of splice junctions from paired-end RNA-seq data by SpliceMap. Nucleic Acids Res. 38(14), 4570–4578 (2010)
  4. Celeux, G., Govaert, G.: A classification EM algorithm for clustering and two stochastic versions. Computational Statistics & Data Analysis 14(3), 315–332 (1992)

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Yang Li (1, 2)
  • Hong-Mei Li (2, 3)
  • Paul Burns (4)
  • Mark Borodovsky (4, 5)
  • Gene E. Robinson (2, 3)
  • Jian Ma (1, 2)

  1. Department of Bioengineering, University of Illinois, Urbana-Champaign, USA
  2. Institute for Genomic Biology, University of Illinois, Urbana-Champaign, USA
  3. Department of Entomology, University of Illinois, Urbana-Champaign, USA
  4. Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology, USA
  5. School of Computational Science & Engineering, Georgia Institute of Technology, USA