Classification of Samples with Order-Restricted Discriminant Rules

Statistical Analysis in Proteomics

Part of the book series: Methods in Molecular Biology (MIMB, volume 1362)

Abstract

In recent years, mass spectrometry techniques have helped proteomics become a powerful tool for the early diagnosis of cancer by revealing protein profiles specific to each pathological state. One question on which proteomics is yielding useful practical results is how to classify patients into one of the possible severity levels of an illness, based on features measured on the patient. This classification is usually made using one of the many discrimination procedures available in the statistical literature. In this chapter we present recently developed restricted discriminant rules that use additional information in the form of orderings on the means, and we illustrate how to apply them to mass spectrometry data using the R package dawai. Specifically, we use proteomic prostate cancer data and describe all the steps needed, including data preprocessing and feature extraction, to build a discriminant rule that classifies samples into one of several disease stages, thus helping diagnosis. The restricted discriminant rules are compared with some standard classifiers that do not take the additional information into account, and they show better performance in terms of error rates.
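The idea of using an ordering on the means can be illustrated with a small sketch. It is not the chapter's method and does not use the dawai package: it is a simplified, single-feature illustration in Python, assuming equal variances and equal priors across stages, in which the stage means are estimated under a non-decreasing order restriction by weighted isotonic regression (pool-adjacent-violators) and a new sample is assigned to the stage with the nearest restricted mean. The function names `pava`, `fit_restricted_means`, and `classify`, and the toy data, are hypothetical.

```python
import numpy as np


def pava(values, weights):
    """Weighted non-decreasing isotonic regression (pool-adjacent-violators)."""
    blocks = []  # each block holds [pooled mean, total weight, number of classes]
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Pool the last two blocks while they violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            w_tot = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / w_tot, w_tot, c1 + c2])
    return np.concatenate([np.full(c, m) for m, _, c in blocks])


def fit_restricted_means(x, y, n_classes):
    """Return (unrestricted, order-restricted) class means for one feature."""
    sample_means = np.array([x[y == k].mean() for k in range(n_classes)])
    counts = np.array([(y == k).sum() for k in range(n_classes)])
    # Assumption: common variance, so class sizes act as the weights.
    return sample_means, pava(sample_means, counts)


def classify(x_new, means):
    """Assign each value to the class with the nearest mean (first on ties)."""
    x_new = np.asarray(x_new, dtype=float)
    return np.argmin(np.abs(x_new[:, None] - means[None, :]), axis=1)


# Toy data (made up): one feature, three ordered stages 0 <= 1 <= 2.
x = np.array([1.0, 2.0, 4.0, 5.0, 6.0, 3.0, 4.0])
y = np.array([0, 0, 1, 1, 1, 2, 2])
sample_means, restricted_means = fit_restricted_means(x, y, n_classes=3)
labels = classify([1.0, 6.0], restricted_means)
```

In this toy example the sample means of stages 1 and 2 violate the assumed ordering, so the restricted estimate pools them into a single weighted mean. With one feature the two pooled stages can no longer be told apart, which is one reason a realistic rule would use several features and a covariance estimate, as restricted versions of linear or quadratic discriminant analysis do.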

Author information

Corresponding author

Correspondence to David Conde.

Copyright information

© 2016 Springer Science+Business Media New York

About this protocol

Cite this protocol

Conde, D., Fernández, M.A., Salvador, B., Rueda, C. (2016). Classification of Samples with Order-Restricted Discriminant Rules. In: Jung, K. (eds) Statistical Analysis in Proteomics. Methods in Molecular Biology, vol 1362. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-3106-4_10

  • DOI: https://doi.org/10.1007/978-1-4939-3106-4_10

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-3105-7

  • Online ISBN: 978-1-4939-3106-4

  • eBook Packages: Springer Protocols
