Abstract
Although the discovery of proteomic biomarkers by mass spectrometry is promising, the rate at which new proteomic biomarkers gain approval from the US Food and Drug Administration has been falling every year and has averaged nearly one per year since 1998. Evidently, a large gap exists between biomarker discovery and biomarker validation. Here, we review the challenges that arise in the three key stages of the proteomic biomarker pipeline: blood sample preparation, bioinformatics algorithms for biomarker candidate discovery, and validation and clinical application of proteomic biomarkers. To analyze and explain the reasons for this gap, we cover topics ranging from the techniques and methods used in biomarker discovery, together with their biological background, to the existing problems in these techniques and methods.
Acknowledgements
This research is funded by the Bioinformatics Core Research Grant at The Methodist Research Institute, Cornell University. Dr. Zhou is partially funded by The Methodist Hospital Scholarship Award. He and Dr. Wong are also partially funded by NIH grants R01LM08696, R01LM009161, and R01AG028928. The authors have declared no conflict of interest.
Copyright information
© 2009 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Jin, G., Zhou, X., Wang, H., Wong, S.T.C. (2009). The Challenges in Blood Proteomic Biomarker Discovery. In: Pham, T. (eds) Computational Biology. Applied Bioinformatics and Biostatistics in Cancer Research. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-0811-7_12
DOI: https://doi.org/10.1007/978-1-4419-0811-7_12
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4419-0810-0
Online ISBN: 978-1-4419-0811-7
eBook Packages: Biomedical and Life Sciences (R0)