Discovering driver mutations that can serve as diagnostic and prognostic biomarkers is important for the treatment of cancer, including melanoma. Although several computational methods have been developed over the last decade to predict the effect of missense mutations in cancer, only a few have been designed specifically to identify driver mutations in a particular disease context. To take disease-specific factors into account, we set out to prioritize missense mutations present in melanoma. We collected 385 pathogenic mutations from the Database of Curated Mutations (DoCM) and 392 benign mutations filtered from a benchmark neutral database (VariSNP). To evaluate the model, we also selected 45 mutations from other databases. We then constructed a random forest classifier to prioritize pathogenic melanoma mutations based on conservation, functional region annotation, protein secondary structure, protein domain, physicochemical features, and splicing information. The proposed method achieved an AUC of 0.94 on both the training and test sets. Compared with previously developed algorithms, our method achieved higher accuracy in identifying driver missense mutations in melanoma, along with a better balance between sensitivity and specificity.
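As a reading aid for the metrics reported above, the following minimal sketch shows how AUC, sensitivity, and specificity can be computed from a classifier's scores. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the numbers bear no relation to the study's data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Return (sensitivity, specificity) when scores >= threshold are called pathogenic."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = pathogenic, 0 = benign; scores are made-up
# classifier outputs, not results from the paper.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

auc = roc_auc(toy_labels, toy_scores)
sens, spec = sensitivity_specificity(toy_labels, toy_scores)
print(f"AUC={auc:.4f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this sketch a classifier is "balanced" when sensitivity and specificity are close to each other at the chosen threshold, which is the property the abstract contrasts with other prediction methods.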
The authors thank the members of our laboratory for their valuable discussions. This work was supported by grants from the National Natural Science Foundation of China (61672037 and 21601001), the Anhui Provincial Outstanding Young Talent Support Plan (gxyqZD2017005), and the Young Wanjiang Scholar Program of Anhui Province, China.