Computational Prediction of Driver Missense Mutations in Melanoma

  • Haiyang Sun
  • Zhenyu Yue
  • Le Zhao
  • Junfeng Xia
  • Yannan Bin
  • Di Zhang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10955)


Discovering driver mutations that can serve as diagnostic and prognostic biomarkers is important for the treatment of cancer, including melanoma. Although several computational methods have been developed over the last decade to predict the effect of missense mutations in cancer, only a few have been specifically designed to identify driver mutations in a specific disease context. To take disease-specific factors into consideration, we set out to prioritize missense mutations present in melanoma. We collected 385 pathogenic mutations from the Database of Curated Mutations (DoCM) and 392 benign mutations filtered from a benchmark neutral database (VariSNP). To evaluate the model's performance, we also selected 45 mutations from other databases. A random forest classifier was then constructed to prioritize melanoma pathogenic mutations based on conservation, functional region annotation, protein secondary structure, protein domain, physicochemical features, and splicing information. The proposed method achieved an AUC of 0.94 on both the training and test sets. Compared with previously developed algorithms, our method achieved higher accuracy in identifying driver missense mutations in melanoma, along with a more balanced sensitivity and specificity than the other prediction methods.
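The evaluation measures named in the abstract (AUC, sensitivity, specificity) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the scores stand in for the pathogenicity probabilities that a classifier such as a random forest would output.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Labels are 1 (pathogenic) or 0 (benign). A tie between a positive and a
    negative counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Sensitivity (true positive rate) and specificity (true negative rate).

    A mutation is called pathogenic when its score is >= threshold
    (the 0.5 default is an illustrative assumption).
    """
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical toy data: four pathogenic (1) and four benign (0) mutations.
    toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
    toy_scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
    print("AUC:", roc_auc(toy_labels, toy_scores))
    print("sensitivity, specificity:",
          sensitivity_specificity(toy_labels, toy_scores))
```

A method with "balanced" sensitivity and specificity, in the sense used above, is one where the two values returned by `sensitivity_specificity` are close to each other rather than one being high at the expense of the other.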


Keywords: Melanoma · Missense mutation · Pathogenicity prediction



The authors thank the members of our laboratory for their valuable discussions. This work was supported by grants from the National Natural Science Foundation of China (61672037 and 21601001), the Anhui Provincial Outstanding Young Talent Support Plan (gxyqZD2017005), and the Young Wanjiang Scholar Program of Anhui Province, China.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Haiyang Sun (1)
  • Zhenyu Yue (2)
  • Le Zhao (2)
  • Junfeng Xia (1)
  • Yannan Bin (1)
  • Di Zhang (2)

  1. Institute of Physical Science and Information Technology, Anhui University, Hefei, China
  2. School of Computer Science and Technology, Anhui University, Hefei, China
