Computational Identification of Short Initial Exons
Despite the existence of many gene prediction programs and their increasing accuracy over the last few years, the accurate identification of short exons remains a challenging problem. In this paper we concentrate on short initial exons and present a method to improve the detection of these short coding regions. The proposed algorithm is based on the Weight Array Method (WAM) and CpG islands. The algorithm was evaluated on a total of 158 sequences containing short initial exons and achieves an accuracy of up to 73%. Compared with GENSCAN, the proposed WAM-CpG island algorithm shows an improvement of up to 22%. Furthermore, the WAM-CpG island approach can be used to complement existing gene prediction packages and substantially improve the correct detection of short initial exons.
Keywords: Donor Site; Gene Prediction; Computational Identification; Translation Initiation Site; Internal Exon
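The abstract names two ingredients, the Weight Array Method and CpG islands, without giving details. The following is a minimal sketch of what each typically involves, not the authors' implementation. It assumes the widely used CpG island criteria (length of at least 200 bp, GC fraction above 0.5, observed/expected CpG ratio above 0.6) and models a weight array as a position-dependent first-order Markov model trained on aligned signal sites. All function names, thresholds, the pseudocount, and the toy training sites are illustrative assumptions, and sequences are assumed to contain only A, C, G, and T.

```python
import math

BASES = "ACGT"

def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n if n else 0.0
    # Observed/expected CpG = (#CpG * N) / (#C * #G); 0.0 if undefined.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp

def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_ratio=0.6):
    """Apply the assumed CpG island thresholds to one window."""
    gc, ratio = cpg_stats(seq)
    return len(seq) >= min_len and gc > min_gc and ratio > min_ratio

def train_wam(sites, pseudocount=1.0):
    """Estimate a weight array from equal-length aligned sites.

    Returns (first_p, trans_p): first_p[b] is P(b) at position 0, and
    trans_p[i][a][b] is P(b at position i+1 | a at position i).
    """
    length = len(sites[0])
    first = {b: pseudocount for b in BASES}
    trans = [{a: {b: pseudocount for b in BASES} for a in BASES}
             for _ in range(length - 1)]
    for s in sites:
        s = s.upper()
        first[s[0]] += 1
        for i in range(1, length):
            trans[i - 1][s[i - 1]][s[i]] += 1
    total = sum(first.values())
    first_p = {b: first[b] / total for b in BASES}
    trans_p = [{a: {b: row[a][b] / sum(row[a].values()) for b in BASES}
                for a in BASES} for row in trans]
    return first_p, trans_p

def wam_log_score(model, seq):
    """Log-probability of seq under the weight array model."""
    first_p, trans_p = model
    seq = seq.upper()
    score = math.log(first_p[seq[0]])
    for i in range(1, len(seq)):
        score += math.log(trans_p[i - 1][seq[i - 1]][seq[i]])
    return score

# Toy usage with hypothetical 3-base "sites"; real signals are longer.
model = train_wam(["ACG", "ACG", "ATG"])
site_like = wam_log_score(model, "ACG")
background_like = wam_log_score(model, "TTT")
```

In a detector of this kind, a candidate signal would usually be scored as a log-odds ratio against a background model, and the CpG island test would be applied to a window upstream of the candidate initial exon. How the two scores are combined in the paper is not stated in the abstract.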