Classification of Samples with Order-Restricted Discriminant Rules

Conde, David; Fernández, Miguel A.; Salvador, Bonifacio; Rueda, Cristina

doi:10.1007/978-1-4939-3106-4_10

Part of the book series: Methods in Molecular Biology (MIMB, volume 1362)

Abstract

In recent years, mass spectrometry techniques have helped proteomics become a powerful tool for the early diagnosis of cancer by revealing protein profiles specific to each pathological state. One problem where proteomics is yielding useful practical results is classifying patients into one of several possible severity levels of an illness, based on features measured on the patient. This classification is usually made with one of the many discrimination procedures available in the statistical literature. In this chapter we present recently developed restricted discriminant rules that use additional information in the form of orderings on the means, and we illustrate how to apply them to mass spectrometry data using the R package dawai. Specifically, we use proteomic prostate cancer data and describe all the steps needed, including data preprocessing and feature extraction, to build a discriminant rule that classifies samples into one of several disease stages, thus helping diagnosis. The restricted discriminant rules are compared with some standard classifiers that do not take the additional information into account and show better performance in terms of error rates.
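
The idea of using an ordering on the means can be illustrated with a minimal Python sketch. This is not the dawai implementation and not necessarily the authors' estimator: it assumes equal priors, an identity covariance matrix, and a simple increasing order across disease stages on selected features, in which case the order-restricted class means can be obtained with the pool-adjacent-violators algorithm and plugged into a nearest-mean rule. All function names and the toy data are hypothetical.

```python
from statistics import mean


def pava(values, weights):
    """Weighted pool-adjacent-violators: nondecreasing fit to `values`."""
    blocks = []  # each block holds [weighted mean, total weight, item count]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        # Merge backwards while the last two blocks violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            pooled = (m1 * w1 + m2 * w2) / (w1 + w2)
            blocks.append([pooled, w1 + w2, c1 + c2])
    fitted = []
    for m, _, c in blocks:
        fitted.extend([m] * c)
    return fitted


def restricted_means(groups, restricted_features):
    """Class means with an increasing order imposed on selected features.

    `groups` is a list of classes ordered by severity; each class is a
    list of feature tuples. Assumption: identity covariance, so each
    restricted feature can be handled separately.
    """
    sizes = [len(g) for g in groups]
    n_features = len(groups[0][0])
    # columns[j][k] is the sample mean of feature j in class k.
    columns = [[mean([row[j] for row in g]) for g in groups]
               for j in range(n_features)]
    for j in restricted_features:
        columns[j] = pava(columns[j], sizes)
    return [tuple(col[k] for col in columns) for k in range(len(groups))]


def classify(x, means):
    """Nearest-mean rule (squared Euclidean distance, equal priors)."""
    def sqdist(m):
        return sum((a - b) ** 2 for a, b in zip(x, m))
    return min(range(len(means)), key=lambda k: sqdist(means[k]))


# Hypothetical toy data: three ordered stages, two features per sample.
groups = [
    [(1.0, 0.2), (1.4, 0.0), (0.9, 0.1)],
    [(2.6, 1.0), (2.2, 1.2)],
    [(2.0, 2.0), (2.1, 2.2), (1.9, 1.8), (2.4, 2.0)],
]
# Feature 0 is assumed to increase with stage; feature 1 is unrestricted.
means = restricted_means(groups, restricted_features=[0])
print(means)
print(classify((2.3, 1.9), means))
```

In the toy data the sample means of the first feature violate the assumed order between the last two stages, so the sketch pools them, while the unrestricted second feature still separates the two classes.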

Author information

Authors and Affiliations

Departamento de Estadística e Investigación Operativa, Facultad de Ciencias, Universidad de Valladolid, Paseo de Belén 7, 47011, Valladolid, Spain
David Conde, Miguel A. Fernández, Bonifacio Salvador & Cristina Rueda

Corresponding author

Correspondence to David Conde.

Editor information

Editors and Affiliations

Department of Medical Statistics, University Medical Center Göttingen, Göttingen, Germany
Klaus Jung

About this protocol

Cite this protocol

Conde, D., Fernández, M.A., Salvador, B., Rueda, C. (2016). Classification of Samples with Order-Restricted Discriminant Rules. In: Jung, K. (ed) Statistical Analysis in Proteomics. Methods in Molecular Biology, vol 1362. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-3106-4_10

DOI: https://doi.org/10.1007/978-1-4939-3106-4_10
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-3105-7
Online ISBN: 978-1-4939-3106-4
eBook Packages: Springer Protocols
