1 Introduction

The history of matrix-assisted laser desorption/ionization (MALDI) mass spectrometry (MS) dates back to 1985, when Karas et al. first reported the use of an organic molecule as a matrix to assist desorption/ionization of other small molecules under UV laser irradiation [1]. In 1987, Koichi Tanaka and his colleagues showed that coupling MALDI to a time-of-flight (TOF) mass analyzer allowed the detection of macromolecules, especially proteins [2]. Tanaka’s MALDI-TOF MS method for analysis of macromolecules was highly regarded and created new opportunities for the application of MS to biomedical research. Koichi Tanaka shared the Nobel Prize in Chemistry 2002 with two other chemists, in recognition of the development of soft desorption ionization methods for mass spectrometric analyses of biological macromolecules. Over the past 15 years, MALDI MS technologies have been investigated for qualitative and quantitative analyses of proteins and metabolites and have been widely employed in proteomic and metabolomic research, especially in biomarker discovery. Because this chapter aims to update readers on advances in MALDI MS technologies for clinical diagnostic applications, it does not provide a comprehensive review of all MALDI MS technologies. It covers only MALDI-TOF MS, MALDI-TOF/TOF MS, SALDI-TOF MS, MALDI-QqQ MS, and SELDI-TOF MS, which have great potential to influence clinical diagnostic practice. Basic principles and brief overviews of these technologies are provided to an extent that allows readers to understand their potential and limitations in diagnostic applications. The applications of these technologies to biomarker discovery and their potential uses in biomarker quantification are then reviewed. Practical concerns in applying MALDI-based MS technologies, especially SELDI, to the quantification and discovery of serum/plasma biomarkers, together with possible solutions, are also addressed.

2 MALDI-TOF MS and MALDI-TOF/TOF MS

2.1 Basic Principles

MALDI is regarded as a soft desorption ionization method because, at optimal laser irradiance, it forms ions without significantly breaking chemical bonds [1]. This is particularly important for obtaining the correct mass of a biomolecule, especially a protein, during MS analysis. Structure or sequence information can subsequently be obtained by tandem MS analysis. As the name MALDI indicates, a matrix is needed to assist desorption and ionization of an organic molecule under UV irradiation [1, 3]. The desorption/ionization efficiencies of different types of biomolecules depend on the chemical used as the matrix. For example, α-cyano-4-hydroxycinnamic acid (CHCA) is a very good matrix for peptides [4], sinapinic acid is good for intact proteins [4], Super-DHB (i.e., a mixture of 2,5-dihydroxybenzoic acid and 2-hydroxy-5-methoxybenzoic acid) is good for glycan analysis [5], and 3-hydroxypicolinic acid is commonly used for MALDI-TOF MS analysis of DNA molecules [6]. An analyte or a mixture of analytes is mixed with a chemical matrix in solution and applied to a conductive MS sample plate. After drying, analyte-matrix co-crystals are formed (Fig. 1). The co-crystals are then subjected to MALDI MS analysis, in which the analytes are desorbed and ionized under UV irradiation [3]. In the absence of alkali metal ions and/or halide ions in the co-crystals, singly charged protonated and deprotonated molecules are usually formed. The mass (or molecular weight) of the molecule is then approximately equal to “m/z value −1.0073” for a protonated molecule and “m/z value +1.0073” for a deprotonated molecule. In the presence of alkali metal ions and/or halide ions, such as Na+ and Cl−, metal ion adducts and/or halide ion adducts may be formed. Either positively or negatively charged molecules are transferred to the mass analyzer for separation according to their mass-to-charge (m/z) ratios.
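As a minimal sketch of the mass arithmetic described above, the following Python snippet converts the m/z value of a singly charged ion into an approximate neutral mass. The function name `neutral_mass` and its `mode` argument are illustrative assumptions; only the value 1.0073 is taken from the text, and adduct ions are deliberately not handled.

```python
PROTON_MASS = 1.0073  # Da, the value quoted in the text


def neutral_mass(mz: float, mode: str) -> float:
    """Approximate neutral mass (Da) from the m/z of a singly charged ion.

    mode="positive": protonated molecule [M+H]+, so M = m/z - 1.0073
    mode="negative": deprotonated molecule [M-H]-, so M = m/z + 1.0073
    Metal ion or halide ion adducts are outside the scope of this sketch.
    """
    if mode == "positive":
        return mz - PROTON_MASS
    if mode == "negative":
        return mz + PROTON_MASS
    raise ValueError("mode must be 'positive' or 'negative'")


# Hypothetical peak at m/z 1046.5418 acquired in positive ion mode
print(round(neutral_mass(1046.5418, "positive"), 4))
```

The same peak list can be passed through the function peak by peak before a mass-based database search, provided that the ions are known to be singly charged.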

Fig. 1
A typical workflow of sample spot preparations for analyses by MALDI-TOF MS and SALDI-TOF MS. In conventional MALDI-TOF MS, the analytes and chemical matrix are mixed and dried to form co-crystals, whereas analytes are coated homogeneously and distributed evenly on a layer of solid matrix in SALDI-TOF MS

There are various types of mass analyzers, of which TOF and TOF/TOF analyzers are the most commonly coupled to a MALDI source. When kinetic energy is given to a group of charged molecules in direct proportion to their charge states under vacuum, the charged molecules travel along a flight tube at velocities inversely proportional to the square root of their m/z values [7]. In other words, charged molecules with larger m/z values have longer flight times, and the ions are separated efficiently enough to generate a mass spectrum within 1 s. The resolving power of a TOF mass analyzer depends on the length of the flight path. In contrast, other commonly used mass analyzers, such as ion trap, orbitrap, and Fourier transform ion cyclotron resonance (FTICR) mass analyzers, have resolving powers directly proportional to the time that the charged molecules stay inside the mass analyzer. The high-throughput nature of a TOF mass analyzer makes it a good match for a MALDI source. A single TOF mass analyzer does not allow efficient structure/sequence elucidation of a targeted analyte in a mixture of analytes. This limitation can be overcome by linking two TOF mass analyzers in series, i.e., TOF/TOF. The first TOF mass analyzer is used to resolve and select the precursor ion of a targeted analyte for later fragmentation, whereas the second TOF mass analyzer is used to separate the fragment ions for generating a tandem MS spectrum [8, 9].
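The relationship between flight time and m/z can be illustrated with a short Python sketch based on the energy balance zeV = ½mv² for an idealized linear, field-free drift tube. The 1 m tube length and 20 kV accelerating voltage are assumed values chosen only for illustration, not the settings of any particular instrument, and effects such as delayed extraction or reflectron geometry are ignored.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # C
DALTON = 1.66053906660e-27           # kg


def flight_time(mz: float, tube_length_m: float = 1.0,
                accel_voltage_v: float = 20_000.0) -> float:
    """Flight time (s) of an ion through an idealized field-free drift tube.

    From z*e*V = (1/2)*m*v**2, the velocity is v = sqrt(2*V / (m/q)),
    where m/q is the mass per unit charge in kg/C, and t = L / v.
    """
    mass_per_charge = mz * DALTON / ELEMENTARY_CHARGE  # kg per coulomb
    velocity = math.sqrt(2.0 * accel_voltage_v / mass_per_charge)
    return tube_length_m / velocity


# Quadrupling m/z doubles the flight time (square-root dependence)
t_1000 = flight_time(1000.0)
t_4000 = flight_time(4000.0)
print(t_4000 / t_1000)
```

Under these assumptions an ion at m/z 1,000 reaches the detector in the order of tens of microseconds, which is consistent with the short acquisition times described above.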

2.2 A High-Throughput Technology for Discovery and Quantification of Biomarkers

Discovery of disease-specific biomarkers for assisting diagnoses is still a difficult but important task all over the world. One commonly used approach for identification of disease-specific biomarkers is to compare the quantitative biomolecule profiles of plasma/serum specimens from the patients with the target disease and control subjects without the disease. Because of high heterogeneity in the baseline concentrations of various circulating biomolecules among both the patients and control subjects, it is important to obtain and compare quantitative plasma/serum biomolecule profiles from patient and control groups with a reasonable sample size. In such a case, high-throughput technologies are required in order to complete the analyses of the specimens within an acceptable period of time.

The major advantage of MALDI-TOF MS is that it is high-throughput in nature. After preprocessing, samples for MS analyses are applied to a MALDI sample plate as individual spots. MALDI-TOF MS analysis of each sample spot takes less than 1 min, so 100 samples can be automatically analyzed within 1 h. In contrast, with electrospray ionization (ESI), another commonly used soft ionization technology, a single preprocessed sample is usually subjected to liquid chromatography (LC) before ESI MS analysis [10], resulting in a turn-around time of 20 min to 1 h per sample. Analyzing 100 samples by ESI MS therefore takes roughly 30–100 h. With a shotgun proteomic profiling approach, it takes a day to obtain a coarse proteomic profile of a single specimen, or at least a week to obtain a comprehensive proteomic profile.

2.3 Common Use in Analyses of Large Biomolecules, But Not Small Biomolecules

MALDI-TOF MS and MALDI-TOF/TOF MS have been widely applied to protein identification in proteomics laboratories. When subjected to MALDI-TOF MS, peptides and proteins are predominantly detected as singly charged protonated molecules at high sensitivity. The detection sensitivity of MALDI-TOF MS depends on the chemical composition of a molecule and, in general, is inversely proportional to its mass. For large proteins, e.g., albumin, at least 100 fmol to 1 pmol is usually required for a reliable MS signal, whereas in a clean preparation 0.25 fmol of a peptide can be readily detected. MALDI-TOF MS can efficiently obtain the masses of the majority of peptides in a protein tryptic digest in the range of m/z 1,000–2,500 with high accuracy (<40 ppm for external m/z calibration; <5 ppm for internal m/z calibration). The resulting list of tryptic peptide peak intensities and masses can then be subjected to a database search to obtain the protein identity by using tryptic peptide mass fingerprinting algorithms, e.g., Mascot [11]. If a MALDI-TOF/TOF MS instrument is available, individual tryptic peptides can be further subjected to tandem MS to obtain a series of a-, b-, and y-ions. The resulting list of fragment ion peak intensities and masses can be further subjected to a database search to obtain the protein identity by using MS/MS ion search algorithms [12].
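The mass accuracy figures above are expressed in parts per million (ppm). The Python sketch below shows how a ppm error is computed and how observed peaks might be matched to theoretical peptide masses within a tolerance. It is only a simplified illustration: the function names and m/z values are hypothetical, and search engines such as Mascot use probabilistic scoring rather than this simple tolerance filter.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def match_peaks(observed, theoretical, tol_ppm=40.0):
    """Return (observed, theoretical) pairs that agree within tol_ppm.

    The default of 40 ppm mirrors the external-calibration accuracy
    quoted in the text; 5 ppm would mirror internal calibration.
    """
    matches = []
    for obs in observed:
        for theo in theoretical:
            if abs(ppm_error(obs, theo)) <= tol_ppm:
                matches.append((obs, theo))
    return matches


# Hypothetical observed peaks and theoretical tryptic peptide m/z values
observed_peaks = [1000.01, 1500.50, 2211.10]
theoretical_peaks = [1000.00, 1500.00, 2211.08]
print(match_peaks(observed_peaks, theoretical_peaks))
```

Tightening `tol_ppm` reduces random matches but requires better calibration, which is why internal calibration is preferred when confident identification is needed.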

As described in the previous section, analytes are embedded in analyte-matrix co-crystals before MS analysis. The majority of the matrices are derivatives of benzoic acid, cinnamic acid, and carboxylic acids [3]. However, in MALDI-TOF MS, these small molecules themselves form protonated ions, fragment ions, and cluster ions, and cause intensive chemical noises in the mass range below m/z 800 [13, 14]. These noises cause significant interference during the analyses of small molecules. This explains why there have been only a few reports on using MALDI-TOF MS for small molecule analysis in the past 27 years [14]. Although analyses of small molecules are technically difficult, all these reports have provided concrete evidence that MALDI-TOF MS is a feasible tool for analysis of small biomolecules, including amino acids [15], lipids [16], O-linked glycans [17], steroid hormone [18], etc.

2.4 Quantitative Issues in the MALDI-TOF Mass Spectra

In addition to matrix chemical noises in the low mass region, the prerequisite formation of analyte-matrix co-crystals causes uneven distribution of analytes on a sample spot. MS signals of various analytes vary significantly among the co-crystals. In common practice a representative mass spectrum of a single sample spot is generated from the summation of mass spectra obtained at different positions within a sample spot. As a result, conventional MALDI-TOF MS methods are usually considered not quantitative, or at most semi-quantitative [4]. However, the reproducibility of the peak intensities of biomolecules in a MALDI-TOF MS spectrum can be improved by using an unbiased automatic mass spectrum acquisition protocol across a sample spot and by providing a fine network (e.g., nitrocellulose coating) for formation of a layer of homogeneous small analyte-matrix co-crystals [4]. With an unbiased MS acquisition protocol and the use of nitrocellulose film, intra-assay and inter-assay coefficients of variation (CVs) of the normalized peak intensities of peptide/protein standards were found to be <15% [4], suggesting that MALDI-TOF MS is a feasible tool for profiling and quantifying peptides and proteins in biological samples.
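The coefficients of variation quoted above can be computed as in the following Python sketch. Normalizing each spectrum to its summed peak intensity is an assumption made here for illustration; the normalization scheme used in the cited work may differ, and the replicate values are hypothetical.

```python
from statistics import mean, stdev


def normalize(intensities):
    """Scale peak intensities of one spectrum so that they sum to 1.

    Total-intensity normalization is an assumed scheme for this sketch.
    """
    total = sum(intensities)
    return [i / total for i in intensities]


def cv_percent(values):
    """Coefficient of variation (%) = sample standard deviation / mean * 100."""
    return stdev(values) / mean(values) * 100.0


# Hypothetical replicate spectra, each listing the same three peaks
replicates = [
    [1200.0, 800.0, 2000.0],
    [1100.0, 900.0, 2000.0],
    [1300.0, 700.0, 2000.0],
]
normalized = [normalize(spectrum) for spectrum in replicates]
first_peak = [spectrum[0] for spectrum in normalized]
print(cv_percent(first_peak))
```

Replicates acquired on the same plate in one run give an intra-assay CV, whereas replicates from different runs or days give an inter-assay CV.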

2.5 Coupled with Functionalized Magnetic Beads for Peptide/Protein Biomarker Discovery

Plasma/serum samples are highly complex, and cannot be directly subjected to MALDI-TOF MS analysis because of a signal suppression problem. This can be solved by using chromatographic techniques to enrich a subgroup of proteins with matched physicochemical properties. This concept was first introduced by Bruker Daltonics Inc. (Bremen, Germany) as a commercially available system called ClinProt for semi-quantitative profiling of proteins/peptides in serum/plasma. In this system, various types of functionalized magnetic beads with different chromatographic properties are available, including hydrophobic interaction (C3, C8, C18), weak cation exchange, weak anion exchange, metal ion affinity (Cu2+, Fe3+), and lectin affinity (Concanavalin A). The ClinProt magnetic bead technology is only licensed to be performed with MALDI-TOF MS instruments from the same manufacturer. The ClinProt system users are supplied with a kit of standard protocol, together with specific buffers. The compositions of the binding and washing reagents are not disclosed. Because MALDI-TOF MS is a sensitive technology for detection of proteins, only 5 μL of plasma/serum is required for the ClinProt system, according to the supplier’s instructions. The eluted proteins/peptides are added on a thin-layer of CHCA, and subjected to MALDI-TOF MS for obtaining a semi-quantitative mass spectrum. The ClinProt technology was first reported to be highly quantitative. The CVs for the normalized protein/peptide peak intensities were ≤7% [19]. However, later studies showed that the CVs were between 20% and 30% for both manual and robotic assays [20, 21]. There are about 40 reports on using the ClinProt system in discovery of potential biomarkers of human diseases, such as oral cancer [19], head and neck cancer [20], and nephrotic syndrome [22].

In 2007, Jimenez et al. reported an automated method comparable to the ClinProt system by using C18 hydrophobic magnetic beads for profiling of serum peptides with masses in the range of m/z 800–4,000 [23]. The intra-assay and inter-assay CVs were 2–38% and 10–53%, respectively. Later our group developed a strategy for quantitative profiling of both serum peptides/proteins and micropreparative purification of the corresponding peptides/proteins in parallel using C18 hydrophobic, strong anion exchange, and weak cation exchange magnetic beads [24]. In our method, only 2 μL of serum is required, and sinapinic acid is used as the chemical matrix. By using an automatic platform for the binding, washing, and elution steps and using a MALDI-TOF MS instrument optimized for quantitative proteomic profiling, both intra-assay and inter-assay CVs were found to be 4–30%. Because the peptides/proteins corresponding to the potential diagnostic peaks are purified in parallel with the profiling experiments, the subsequent work for deciphering protein identities of the potential biomarker peaks is greatly simplified. Using this method, we have recently identified proapolipoprotein CII (Pro-apoC2) and a des-arginine variant of serum amyloid A (SAA) as host response biomarkers for diagnosis of late-onset septicemia and necrotizing enterocolitis in preterm infants [25]. The ApoSAA score computed from plasma apoC2 and SAA concentrations was effective in identifying necrotizing enterocolitis/late-onset sepsis cases in both independent case–control and prospective cohort studies. On the basis of the ApoSAA score, infants suspected with the diseases could be stratified into different risk categories. This enabled neonatologists to withhold treatment in 45% and enact early stoppage of antibiotics in 16% of non-sepsis infants.

2.6 Sequence-Specific Exopeptidase Activity Test for “Functional” Biomarkers in Disease Diagnosis

The combined use of hydrophobic magnetic beads and MALDI-TOF/TOF MS allows both quantitative profiling of plasma/serum peptides and direct identification of the amino acid sequences of the peptides without the need for subsequent purification work. When Villanueva et al. attempted to identify the serum peptide pattern associated with metastatic thyroid cancer by undertaking this approach, they found that the majority of the disease-associated peptides were derived from fibrinopeptide A, complement C3f, and fibrinogen-α as a result of exopeptidase degradation [26]. It was speculated that proteases produced by the thyroid cancer cells led to the formation of these disease-associated peptides [27]. This led them to develop further the Sequence-Specific Exopeptidase Activity Test (SSEAT) test [27]. Instead of identification of the disease-associated peptides, the test monitors degradation of artificial substrates in the presence of individual patients’ sera by MALDI-TOF MS. Double labeled, non-degradable peptides are spiked into the samples as internal standards at the same time to adjust for the adsorptive and processing-related losses. The peak intensity ratios of degradation products to the corresponding non-degradable reference peptides are used as biomarkers. The CVs of these ratios were reported to be 6.3–14.3%. Using the SSEAT test, the group could classify 48 metastatic thyroid cancer patients and 48 healthy controls at 94% sensitivity and 90% specificity [27]. The major advantage of the SSEAT test is that reproducibility problems related to sample collection, storage, and handling in serum peptide profiling analysis can be greatly reduced. Furthermore, theoretically, by using specific peptide sequences as substrates for different diseases, the diagnostic sensitivity and specificity of a SSEAT test may be further improved.
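For readers less familiar with the diagnostic performance terms used above, the following Python sketch shows how sensitivity and specificity are calculated from classification counts. The counts in the example are hypothetical and are not the data of the cited study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of diseased subjects correctly classified as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of control subjects correctly classified as negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts: 10 patients and 10 controls
print(sensitivity(true_pos=9, false_neg=1))
print(specificity(true_neg=8, false_pos=2))
```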

2.7 Quantification of Protein Biomarkers in Disease Diagnosis

Immunosorbent assays and immunoturbidity assays are the most commonly used technologies for quantification of specific protein biomarkers in routine clinical chemistry laboratories. Both technologies require the use of specific antibodies, which allow sensitive and specific quantification of a target protein biomarker. However, the use of antibodies can also cause uncertainty in measurement. The affinity and specificity of antibody preparations against a specific antigen vary significantly from source to source, and it is not uncommon for immunoassay kits from different manufacturers to produce discordant readings. Furthermore, a specific protein can appear in different forms in biological specimens, including glycosylation variants, free subunits, and metabolized forms. A typical example is circulating human chorionic gonadotropin (hCG), which is a useful biomarker for diagnosis of pregnancy, hydatidiform mole, and certain poorly differentiated cancers. In blood, hCG is present in a number of forms, including intact hCG, nicked hCG, hyper- and hypoglycosylated hCG, hCG missing the C-terminal extension, free alpha-subunit, large free alpha-subunit, free beta-subunit, nicked free beta-subunit, and beta-core fragment [28]. For blood samples collected in normal pregnancy, only minor variations in assay performance appear among the commercial immunoassay kits. However, for irregular gestations, immunoassay results can differ significantly among the kits [28]. When different forms of a protein biomarker have different molecular weights, they can be readily differentiated by mass spectrometry, resulting in more reliable measurements [29].

MALDI-TOF MS can be used alone for quantification of a protein biomarker in less complex biological specimens, such as urine. For example, MALDI-TOF MS has been used to semi-quantify albumin in urine for the diagnosis of albuminuria [30, 31]. This approach does not require any pretreatment of the urine sample [30], and the results are not affected by the presence of interfering substances, such as drugs, detergents, and blood, which often cause false-positive and false-negative results in conventional urinary dipstick tests [31]. Glycated and glutathionylated hemoglobin can be measured by direct MALDI-TOF MS analysis of hemolysate with both intra-assay and inter-assay CVs <10% [32]. The MALDI-TOF MS results correlated well with results obtained by using a validated routine assay for HbA1c (correlation coefficient = 0.92) [32].

In complex biological specimens such as serum, direct MALDI-TOF MS analysis of low-abundance proteins is not possible because the high- and medium-abundance proteins in serum mask the signals of the targeted protein. In such cases, MALDI-TOF MS can be combined with immunoprecipitation or immunocapture techniques to enrich and unmask the signal of a protein biomarker. Because it is difficult to control the amount of the target protein recovered from the antibody beads, a stable-isotope labeled internal standard protein that has the same amino acid sequence must be added to the specimens to normalize the variations. For example, after immunoprecipitation of amyloid-beta peptides from cerebrospinal fluid, different amyloid-beta isoforms and their corresponding stable-isotope labeled internal standards appear as individual peaks at the expected m/z values in a MALDI-TOF mass spectrum, and their quantities can be measured with high accuracy, with intra-assay CVs <10% [33]. The results obtained by this method correlated well with the results obtained by ELISA, with correlation coefficients of 0.89–0.95. Using beads coated with specific antibodies, Mason et al. have recently developed a sensitive method for quantifying angiotensin I and angiotensin II in human plasma [34]. This assay has limits of detection of 13 and 11 pg/mL for angiotensin I and angiotensin II, respectively. The intra-assay CVs are <10%.

The limitations of MALDI-TOF MS-based quantitative analysis of large intact proteins are low specificity, low sensitivity, and low resolution. An amount of 100 fmol to 1 pmol is required to generate a reliable MS signal from an intact protein. For example, a concentration of 100 fmol/μL (i.e., ~6.5 μg/mL) is required for reliable measurement of intact BSA. In addition, MALDI-TOF MS does not have sufficient resolution to resolve large intact proteins. The accuracy of a measurement can easily be affected by the presence of protein contaminants with similar molecular weights. Furthermore, wild-type proteins and the corresponding mutant proteins cannot be efficiently resolved. To overcome these limitations, one can digest a protein mixture first, capture specific peptides that are commonly obtained by protease digestion (i.e., proteotypic peptides) with beads coated with specific anti-peptide antibodies, and finally quantify the peptides to reflect the protein concentrations. This approach is called iMALDI [35] or SISCAPA [36]. For example, epidermal growth factor receptor (EGFR) has a molecular weight of 180 kDa, and the detection sensitivity of this approach for EGFR was shown to be 5 fmol [35]. If a MALDI-TOF/TOF MS instrument is available, the identity of a detected target peptide can be further confirmed by tandem MS, which could help to avoid false-positive test results [35]. By using synthetic proteotypic peptides of six proteins and the corresponding stable isotope-labeled peptides as internal standards for proof-of-concept, this approach has been shown to have average intra-assay CVs of 2.5% at a loading amount of 11 fmol on the sample spots [36]. Although the sensitivities of these methods are still in the range of nanograms per milliliter, they are expected to improve with the advancement of MALDI-TOF MS in the near future. Furthermore, this approach has great potential for specific quantification of mutant proteins resulting from missense mutation of a gene sequence, e.g., EGFR with the T790M mutation, which is a therapy response predictor for non-small cell lung cancer patients treated with EGFR tyrosine kinase inhibitors [37]. The major shortcoming of the SISCAPA or iMALDI approach is that it cannot differentiate between different forms of a target protein if the proteotypic peptide selected for quantification does not cover the differences. For example, a proteotypic peptide lying in the N-terminal region of a target protein cannot differentiate the intact form from C-terminal truncated forms. More details about iMALDI can be found in Chap. 6 (“Mass Spectrometry in High-throughput Clinical Biomarker Assays: Multiple Reaction Monitoring” written by Parker et al.).
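The unit conversion behind the intact BSA figure quoted earlier in this section (100 fmol/μL, i.e., about 6.5 μg/mL) can be reproduced with the short Python sketch below. The molecular weight of about 66,400 Da used for BSA is an assumed approximate value that is not stated in the text, and the function name is illustrative.

```python
def fmol_per_ul_to_ug_per_ml(conc_fmol_per_ul: float, mol_weight_da: float) -> float:
    """Convert a molar concentration in fmol/uL to a mass concentration in ug/mL.

    1 fmol/uL = 1e-9 mol/L; multiplying by the molecular weight (g/mol)
    gives g/L, and 1 g/L = 1000 ug/mL, hence the overall factor of 1e-6.
    """
    return conc_fmol_per_ul * mol_weight_da * 1e-6


BSA_MOL_WEIGHT_DA = 66_400.0  # assumed approximate value, not from the text

print(fmol_per_ul_to_ug_per_ml(100.0, BSA_MOL_WEIGHT_DA))
```

With this assumed molecular weight the result is in line with the value of about 6.5 μg/mL given in the text; the small difference reflects rounding of the molecular weight.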

2.8 Identification of Disease Associated Aberrant Glycosylation

Glycoprotein biomarkers have long been applied to disease diagnosis and prognosis. Alterations in glycosylation have been observed in various diseases, such as congenital disorders of glycosylation (CDGs) [38], liver diseases [39], kidney diseases [40], and cancers [41]. A typical example of a glycoprotein biomarker for monitoring disease-associated glycosylation is circulating transferrin, which is still used in most hospitals for assessing liver damage caused by chronic alcohol abuse [39] and for identifying various types of CDGs [42]. As early as 1978, abnormal microheterogeneity of serum transferrin was observed in male alcoholics after alcohol intoxication [43]. In 1993, serum transferrin was first used to examine abnormal glycosylation in CDG patients [38]. Alteration in the glycosylation of glycoproteins and glycolipids is a common feature of various cancers, and is involved in numerous ways in carcinogenesis, such as progression, cell–cell interaction, and metastasis. Tumor cells have different glycosylation machineries. Changes in the glycosylation machinery of cancer cells can be reflected in the blood circulation by tracing the changes in the glycosylation of the proteins released by the tumor [44]. The poor specificity of a tumor biomarker is often due to the fact that it is also produced by normal cells under other pathological conditions. However, this problem can be reduced by measuring the circulating levels of its variants carrying cancer-associated glycosylations. For cancer diagnosis, a typical example is alpha-fetoprotein (AFP). Compared to the total serum AFP level, both fucosylated AFP and monosialylated AFP are more specific in the diagnosis of hepatocellular carcinoma (HCC) [45, 46]. Elevated mRNA expression of alpha1-6 fucosyltransferase in human HCC tissues was associated with the production of the tumor-specific fucosylated AFP glycoform [47]. Serum levels of monosialylated AFP were negatively correlated with the tissue levels of beta-galactoside alpha-2,6-sialyltransferase [41].

MALDI-TOF MS can be used to identify and quantify disease-associated glycosylations carried by either a single protein or a mixture of proteins. For both cases, N-linked glycans or O-linked glycans can be cleaved from a protein preparation, cleaned up to remove interfering substances, and subsequently subjected to MALDI-TOF MS to obtain a mass spectrum of glycans (Fig. 2a). After normalization, the peak intensities of individual glycans can be used to estimate their relative levels in the preparation [5]. The intra- and interassay CVs of normalized peak intensities of N-glycans were reported to be <10% and <17%, respectively [5, 48]. The first application of MALDI-TOF-MS to analysis of N-linked glycans on transferrin preparations (Fig. 2b) that were affinity isolated from serum samples for diagnosis of Type-I CDGs was reported in 1994 [49]. Besides analyzing glycans cleaved from glycoproteins, one could use MALDI-TOF MS to examine disease-associated glycopeptides which are obtained by proteolytic digestion of affinity isolated proteins. MALDI-TOF MS analysis of glycopeptides from serum transferrin has been applied in CDG screening system to the diagnosis of Type-II CDGs in Japan [50]. Because MALDI-TOF is a sensitive technique, only 20 μL of serum is required for screening of Type-I and Type-II CDGs [50].

Fig. 2
(a) A typical workflow of quantitative profiling of N-linked glycans carried by proteins in whole serum (steps 1 and 3–5) or N-linked glycans carried by a single serum protein (steps 1–5) by MALDI-TOF MS. (b) Representative quantitative N-glycan profile from transferrin purified from serum by micro-scale antibody affinity chromatography. (c) Representative quantitative N-glycan profile from proteins in whole serum

When analyzing glycans released from all proteins in a tissue instead of a single protein, the concept of glycome appears. MALDI-FTICR MS was first used to obtain a semi-quantitative profile of O-linked glycome in serum, and identified potential glycan biomarkers for ovarian cancer [51]. One year later the same approach was used to discover potential glycan biomarkers for breast cancer [52]. Despite these encouraging results, a MALDI-FTICR MS instrument is far too expensive to be acquired by most of the clinical laboratories for providing routine services. Almost at the same time, another team and our team reported the use of MALDI-TOF MS for obtaining semi-quantitative profiles of serum N-linked glycome (Fig. 2c), and showed the potential use of serum N-linked glycome fingerprints in the diagnoses of metastatic prostatic cancer and liver fibrosis [5, 53]. Similar MALDI-TOF MS approaches have been used to identify serum N-glycan biomarkers for diagnoses of various cancers, including HCC [54], breast cancer [55], esophageal adenocarcinoma [56], and ovarian cancer [57]. In the case of breast cancer, a serum N-glycan at m/z 2,534 was found to be a potential predictor of patients’ response to trastuzumab [58].

2.9 Qualitative and Quantitative Analysis of Genetic Markers

In 1992, Nordhoff et al. first demonstrated the use of MALDI-TOF MS to detect and measure the masses of nucleic acids [59]. Three years later, the research team led by Charles Cantor showed that MALDI-TOF MS was a useful tool for DNA sequencing [60]. Since then, various MALDI-TOF MS-based methods have been developed for applications of molecular genetics in clinical diagnosis, and MALDI-TOF MS has been widely applied to genotyping in the past 10 years. The most commonly used method is the detection and quantification of single base primer extension products for qualitative and quantitative analysis of DNA copies containing a single nucleotide polymorphism (SNP) [61]. When the primers are well designed to achieve good separation of the primers and the extension products in a mass spectrum, the genotyping assays can be combined to perform up to 15-fold multiplex SNP analysis [62, 63]. The single base primer extension assay can be applied to the diagnosis and screening of hereditary diseases such as cystic fibrosis and beta-thalassemia [64, 65]. In fact, the mutations in a specific gene or a specific set of genes that are associated with any disease or pathological condition can be readily identified and quantified by this method. It has recently been applied to the detection and quantification of the frequency of EGFR activating mutations in non-small-cell lung cancer tissues for prediction of patients’ response to EGFR tyrosine kinase inhibitors [37], with detection limits of 0.4–2.2% [37]. In addition, when a known amount of an oligonucleotide having a well-designed sequence is spiked into a biological sample for competition in the primer extension reaction, the primer extension method can be used to measure the exact number of copies of DNA containing a mutation of interest. A typical application example is the detection of 60 hepatitis B virus (HBV) variants in four multiplex reactions [63]. The limit of quantification was 1,000 HBV copies/mL. Besides DNA, the primer extension method can be applied to the qualitative and quantitative analysis of RNA [66]. In 2007 it was first shown that quantification of the plasma placental RNA allelic ratio permitted noninvasive prenatal detection of chromosomal aneuploidy [67]. This work has opened a new avenue for prenatal diagnosis.

2.10 Quantification of Metabolites by MALDI-TOF MS and SALDI-TOF MS

The matrix chemical noises in the low mass region (<m/z 800) make MALDI-TOF MS inferior for small molecule analysis. Despite that, attempts have been made to use MALDI-TOF MS for direct quantification of biomolecules, such as amino acids [15] and lipids [68], without the need for chemical derivatization. MALDI-TOF MS peak intensities usually increase with the amount of biomolecules. By spiking an internal standard and using an external calibration curve, one can use MALDI-TOF MS to estimate the concentration of a metabolite in a biological specimen through calculating the peak intensity ratio of the target metabolite to the internal standard. In Gogichaeva et al.’s study, methyltyrosine was used as a universal internal standard for quantification of various amino acids [15]. The calibration curves exhibited linearity in a range between 20 and 300 μM with correlation coefficients >0.983. The between-day CVs for the majority of amino acids were <10%, with proline and arginine being exceptions with CVs of about 12% [15]. By using 4-cholesten-3-one as a universal internal standard, it was practically feasible to use MALDI-TOF MS to identify and measure the lipid composition (m/z 369.6–833.0) of VLDL, LDL and HDL [68].
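The internal standard approach described above can be sketched in Python as follows: a straight line is fitted to the peak intensity ratios of calibrators by ordinary least squares, and the line is then inverted to estimate the concentration of an unknown. The function names and all numerical values are hypothetical, and the linear model is an assumption that holds only within the validated calibration range.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_concentration(analyte_intensity, standard_intensity, slope, intercept):
    """Invert the calibration line for an unknown sample.

    The peak intensity ratio (analyte / internal standard) corrects for
    spot-to-spot variation before the calibration line is applied.
    """
    ratio = analyte_intensity / standard_intensity
    return (ratio - intercept) / slope


# Hypothetical calibrators: concentration (uM) vs. analyte/standard ratio
calibrator_conc = [20.0, 100.0, 300.0]
calibrator_ratio = [0.2, 1.0, 3.0]
slope, intercept = fit_line(calibrator_conc, calibrator_ratio)

# Hypothetical unknown: analyte peak 1500, internal standard peak 1000
print(estimate_concentration(1500.0, 1000.0, slope, intercept))
```

Estimates that fall outside the range spanned by the calibrators should be treated with caution, because linearity has only been demonstrated within that range.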

It has recently been shown that by acquiring mass spectra in the negative ion mode and using 9-aminoacridine instead of the typical chemical matrices, matrix chemical noise can be greatly reduced [69]. This method allowed the detection and quantification of metabolites having acidic protons, such as amines, alcohols, carboxylic acids, phenols, and sulfonates, with high sensitivity [70]. High linearity of the MS peak intensities of the deprotonated metabolites was observed at low concentrations [69, 71]. The detection limits were in the femtomole range [69, 71]. By using 9-aminoacridine as the matrix and N-1-naphthylphthalamic acid as the universal internal standard, various SPE-enriched bile acid species from plasma specimens could be directly measured with limits of detection in the range of 0.25–4.60 μg/mL [72].

Another method for reducing matrix chemical noises is the replacement of the chemical matrix by a solid matrix. This approach is called surface assisted laser desorption/ionization (SALDI) (Fig. 1) [73]. The concept of SALDI was introduced by Sunner et al. in 1995. By using graphite to replace the chemical matrix, it was shown that peptides and proteins could be detected at high sensitivities [74]. Moreover, the background signal in the low mass region was low [73]. Since then, many other solid materials, such as silicon [75], carbon nanotubes [76, 77], graphene flakes [78], reduced graphene oxide [79], polymer matrix [80], and gold nanoparticles [81], have been shown to be useful matrices for SALDI-TOF MS analysis of small biomolecules, including carbohydrates [76, 81], amino acids [77], and lipids [82]. The use of a solid-phase matrix not only alleviates the matrix chemical noises and interference problem in the low mass range, but also solves the problem of uneven distribution of the analytes on a sample spot. By using graphene-based materials, the MALDI-TOF mass spectra of small molecules were found to be highly reproducible [78]. The within-spot spectrum-to-spectrum CV of peak intensities for spermine was 14% for the graphene matrix, compared to 40% for the CHCA matrix [78]. Recently, Lu et al. examined the shot-to-shot and spot-to-spot reproducibility of SALDI-TOF mass spectra for polypropylene glycol polymers. The shot-to-shot and spot-to-spot CVs of the signal intensities were 1.9–7.1% and <10%, respectively [83]. SALDI-TOF MS has great potential in quantitative profiling of small biomolecules, especially metabolites.

2.11 Discovery of Metabolite Biomarkers by Quantitative Profiling

In the Post Genome Era, besides proteomics, metabolomics has been a hot topic in the past 10 years. Many research groups have been attempting to use MS technologies to obtain quantitative profiles of metabolites in patients’ specimens, and to identify potential metabolite biomarkers by comparing the profiles between subjects with and without the diseases. LC-ESI MS has been the most commonly used technology in this research area [84]. ESI is a kind of atmospheric pressure ionization-based method, resulting in occurrence of ionization suppression [84]. Another disadvantage is that the use of LC limits the throughput. It can be very time consuming when one wants to obtain comprehensive metabolite profiles from over 100 biological specimens in a biomarker discovery study. Because of the high-throughput nature of MALDI-TOF MS, the use of MALDI-TOF MS in metabolite profiling is a very attractive alternative. It has been shown that 9-aminoacridine can be used to obtain quantitative cellular metabolite profiles by direct mixing of cells and the matrix without any preprocessing [69, 85]. For example, by a single direct on-spot analysis of 2,500 human acute lymphoblastic leukemia Jurkat cells, this method detected up to 150 metabolite peaks in the range of m/z 250–850 within 90 s [69]. It is important to note that signal suppression of a metabolite was observed when another metabolite with a similar chemical structure was present [86]. Hence, when using MALDI-TOF MS for quantitative analysis of metabolites, the data should be interpreted carefully. In the near future it will be interesting to see whether metabolites in plasma/serum can be directly profiled with the use of 9-aminoacridine as the matrix.

2.12 Quantification of Metabolites by MALDI-TOF/TOF MS

Nowadays selected reaction monitoring/multiple reaction monitoring (SRM/MRM) is the most widely accepted MS method for reliable quantification of small molecules, and typically implemented in an ESI triple quadrupole (ESI-QqQ) mass spectrometer (see Chap. 6 for details of the basic principle and instrumentation). The QqQ tandem mass analyzer is used dedicatedly in the SRM/MRM method because a quadrupole mass analyzer can be used as a mass filter, which only allows charged molecules of a specific m/z value to pass through the mass analyzer for either subsequent fragmentation or detection. By undertaking the filtering approach, the background noise can be greatly reduced, leading to high detection sensitivity. SRM cannot be implemented in MALDI-TOF/TOF MS instruments. Few reports on using MALDI-TOF/TOF MS in tandem MS mode for quantification have been available. Gogichaeva et al. showed that amino acids could be fragmented by MALDI-TOF/TOF MS [87]. By calculating the peak intensity ratios of the indicator fragment ions of the target amino acids to the indicator fragment ion of an internal standard, good correlation between the mixture component molar ratios and indicator fragment ions intensity ratios was observed [87]. Although correlation coefficients and coefficients of variation of their MALDI-TOF/TOF MS method were not reported, the study highlighted the potentials of applying MALDI-TOF/TOF MS to biomolecule quantification [87].

Recently, using citrulline for proof-of-concept, our team has developed a novel MALDI-TOF/TOF MS-based quantification method called parallel fragmentation monitoring (PFM) [88]. This method is comparable to SRM. Like the SRM method, the PFM method requires at least two pairs of precursor and selected fragment ions of specific m/z values, one pair for the target molecule and one pair for the internal standard. A stable isotope analog of the target molecule that is only 1 mass unit heavier is used as the internal standard, so that precursor ions of both the target molecule and the internal standard can be specifically isolated with the first TOF analyzer at the same time, and undergo fragmentation simultaneously to yield a full range composite MS/MS spectrum. In both the SRM and PFM methods, the peak area ratio of the selected fragment of the target analyte to that of the internal standard is used for quantification. The use of a stable isotope analog should also minimize the error due to systematic bias of the instrumentation, and normalize the recovery yield after enriching the analytes from the biological samples for quantification. To reduce the matrix noises in the low mass range, a carbon-based nanomaterial was used as the matrix. The performance of the PFM method appears to be comparable to that of the SRM/MRM methods. Both PFM and SRM/MRM methods generated linear calibration curves with correlation coefficients >0.99 (Fig. 3) [88–90]. Moreover, both types of assays gave within- and between-day CVs ≤10% [88–90]. Our results also showed that the calibration curves were highly reproducible. Daily calibration and use of a stored calibration generated highly similar measurement values [88]. This suggests that PFM can potentially be a cost-effective, time-effective, and robust technology for quantification of biomolecules in routine clinical chemistry laboratories.

Fig. 3

Representative MALDI-TOF/TOF tandem MS spectra of citrulline (a) and [ureido-13C] citrulline (b) acquired independently with graphene flake as the solid matrix. (c) Representative linear calibration curve of the PFM assay for quantitative analysis of citrulline in the range of 10–250 μM. The peak intensity ratios of the indicator fragment ion of citrulline (m/z 153.1) to that of [ureido-13C] citrulline (m/z 154.1) in the calibration standards were plotted against the citrulline concentrations

The major advantage of using MALDI-TOF/TOF MS instead of MALDI-TOF MS for quantification is that MALDI-TOF/TOF MS has higher detection specificity and sensitivity for direct quantification of a target biomolecule in a complex biological specimen or in a partially enriched preparation. MALDI-TOF MS does not have enough resolution to resolve two ion species with very close molecular weights, but they can be easily differentiated by looking at the fragmentation pattern. Even an ultra-high resolution MALDI MS instrument, such as MALDI-FTICR MS, is not able to differentiate naturally occurring isomers, such as leucine and isoleucine, by only focusing on the intact ions, because their molecular weights are identical. Isomers can only be differentiated on the basis of fragment ions. Furthermore, the background noises and interference from the other biomolecules in the preparation, like signal suppression by biomolecules sharing similar chemical structures, can be minimized by measuring ratios of the indicator fragment ions, resulting in higher detection sensitivity and measurement accuracy.
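
The point about isomers can be made concrete with a short calculation. The sketch below computes the monoisotopic mass and the m/z of the singly protonated ion from an elemental composition. Leucine and isoleucine share the composition C6H13NO2, so the two values coincide regardless of instrument resolution. The function names are illustrative, and the atomic masses are standard monoisotopic values rounded to eight decimal places.

```python
# Monoisotopic atomic masses (Da), rounded; sufficient for illustration.
MONOISOTOPIC_MASS = {
    "C": 12.00000000,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
}
PROTON_MASS = 1.00727647  # mass of a proton (Da)


def monoisotopic_mass(composition):
    """Sum of monoisotopic atomic masses for an elemental composition."""
    return sum(MONOISOTOPIC_MASS[element] * count
               for element, count in composition.items())


def protonated_mz(composition):
    """m/z of the singly charged protonated molecule [M+H]+."""
    return monoisotopic_mass(composition) + PROTON_MASS


# Leucine and isoleucine are structural isomers with the same composition.
leucine = {"C": 6, "H": 13, "N": 1, "O": 2}
isoleucine = {"C": 6, "H": 13, "N": 1, "O": 2}

print(f"leucine    [M+H]+ m/z: {protonated_mz(leucine):.4f}")
print(f"isoleucine [M+H]+ m/z: {protonated_mz(isoleucine):.4f}")
```

Since the intact ions are indistinguishable by mass, any discrimination has to come from differences in the fragment ions, as stated above.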

3 MALDI-QqQ MS

3.1 Quantification of Biomarkers

Although a QqQ tandem mass analyzer is usually coupled with an ESI source, it can also be coupled with a MALDI source [91]. In MALDI-QqQ MS the identity of a biomolecule can be defined by a mass transition ion pair as in the case of SRM [91]. By operating the quadrupoles of the QqQ analyzer as mass filters for the targeted precursor ions and fragment ions, the matrix chemical noises in the low mass region can be greatly reduced [91]. Comparing the quantitative results obtained by MALDI-QqQ MS and ESI-QqQ MS for 53 small-molecule pharmaceutical compounds, Gobey et al. demonstrated the potential of MALDI-QqQ MS for high-throughput quantification of small biomolecules [91]. When operating MALDI-QqQ MS in SRM/MRM mode, the CVs for quantification of small biomolecules or drugs are typically around 10% [91–94]. It has recently been shown that MALDI-QqQ MS can also be used to measure protein biomarkers in plasma by quantifying their proteotypic peptides in the presence of the corresponding isotopically labeled peptide standards, as in the case of typical MRM methods [95]. The measurement results are highly comparable to those obtained by using ESI-QqQ MS. Both technologies are precise (within-day CVs <20%) and accurate (relative errors <20%) for protein quantification [95]. Because MALDI-QqQ MS is a relatively new technology, the currently available data are limited. However, all the recent reports have demonstrated that MALDI-QqQ MS in SRM/MRM mode, which combines the merits of MALDI ionization technology and those of the conventional SRM/MRM approach, is a reliable high-throughput technology for biomolecule quantification. More successful applications of MALDI-QqQ MS to quantification of biomarkers in human specimens should be forthcoming.

4 SELDI-TOF MS

4.1 Basic Principle

Surface-enhanced laser desorption/ionization TOF mass spectrometry (SELDI-TOF MS) is a variant of MALDI-TOF MS, and is mainly designed for quantitative analysis of proteins in biological samples. This concept was first introduced by Hutchens and Yip in 1993 [96]. Instead of being spotted directly on a MALDI sample plate, a mixture of proteins is subjected to ProteinChip array-based retentate chromatography before MALDI-TOF analysis. ProteinChip arrays coated with different types of chromatographic materials (hydrophilic, hydrophobic, cationic exchange, anionic exchange, immobilized metal affinity, antibody affinity, ligand affinity, etc.) can selectively bind and concentrate proteins with the matched physicochemical or biochemical properties (Fig. 4a, b). Nonspecifically bound proteins and impurities are then washed away with suitable washing buffers [97]. Retained proteins are finally co-crystallized with a chemical matrix (Fig. 4c), and subjected to MALDI-TOF MS for unbiased detection of protonated proteins (Fig. 4d, e). In the case of SELDI-TOF MS, sinapinic acid is the most commonly used matrix to assist desorption/ionization of proteins. Most of the proteins are detected as singly charged protonated molecules, and presented as individual peaks in a mass spectrum, resulting in a proteomic profile (Fig. 4f). The combinations of specific m/z values and the physicochemical properties that are reflected by the type of the ProteinChip arrays used provide unique identities for individual proteins [97]. This is why SELDI-TOF MS is commonly regarded as a proteomic fingerprinting technology. A series of follow-up experiments is needed to purify the corresponding proteins and decipher their true identities [98–100].

Fig. 4

(a) Typical workflow of quantitative proteomic profiling by SELDI-TOF MS. After denaturation and dilution, a patient sample is added to a binding surface of a ProteinChip array. (b) After incubation and washing, proteins with matched physicochemical/biochemical properties are retained on the surface. (c) Then a chemical matrix is added. (d) After drying, the sample spot is subjected to MALDI-TOF MS analysis to obtain signals from various positions with a regular spacing (e) (dark circles) throughout the entire spot. (f) After summing up the MS signals, a quantitative proteomic profile is obtained

It is worth noting that the concepts of SALDI and SELDI can be combined. A solid-phase matrix can both capture biomolecules with a particular physicochemical property and assist desorption/ionization of the captured biomolecules in MALDI-TOF MS analysis. Immobilization of CHCA onto the hydrophobic ProteinChip arrays allowed direct quantitative profiling of urine proteins by SELDI-TOF MS without the need for adding any chemical matrix after sample binding and washing [101]. Recently, a graphene-based SELDI probe has been developed for capture and direct detection of DNA oligomer without addition of any chemical matrix [102].

4.2 Quantitative Issues in the SELDI-TOF Mass Spectra

As in the case of MALDI-TOF MS, with appropriate MS analysis conditions, SELDI-TOF MS is quantitative. On the ProteinChip arrays, chromatographic resins are coated on a film of hydrogel, which provides a network for the formation of fine analyte-matrix co-crystals. Furthermore, in a typical SELDI-TOF MS experiment, an unbiased automated MS acquisition strategy is used. MS signals from 60 to 120 laser shots are obtained from a sample spot in a linear sweep or from various positions with a regular spacing on the entire sample spot, and are summed up to form a representative mass spectrum. After normalizing the MS signals by total ion current and/or total peak intensity, the intensity values of the protein/peptide peaks are highly reproducible. The intra-assay and inter-assay CVs for the normalized intensities of the majority of the SELDI peaks are between 5% and 25% [103, 104]. With the use of a standardized experimental protocol and quality control strategy, the inter-laboratory CVs of the normalized peak intensities are between 15% and 36% [105]. By using a combination of ProteinChip arrays with different chromatographic coatings, SELDI-TOF MS can be used to obtain comprehensive semi-quantitative profiles of proteins with molecular weights between 2 and 250 kDa [106].
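
The effect of total ion current normalization on reproducibility can be illustrated with a minimal sketch. The replicate spectra below are hypothetical, each reduced to three intensity values on a shared m/z axis, and the total ion current is approximated as the sum of all intensities in a spectrum. Commercial software may define the normalization differently.

```python
from statistics import mean, stdev


def normalize_to_tic(spectrum):
    """Divide every intensity by the total ion current (sum of intensities)."""
    total = sum(spectrum)
    if total <= 0:
        raise ValueError("total ion current must be positive")
    return [intensity / total for intensity in spectrum]


def percent_cv(values):
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical replicate spectra of the same specimen; the second replicate
# has roughly twice the overall signal, e.g. due to better crystallization.
replicates = [
    [100.0, 400.0, 500.0],
    [220.0, 780.0, 1000.0],
    [90.0, 410.0, 500.0],
]
normalized = [normalize_to_tic(spectrum) for spectrum in replicates]

peak_index = 1  # the peak of interest
raw_cv = percent_cv([s[peak_index] for s in replicates])
normalized_cv = percent_cv([s[peak_index] for s in normalized])
print(f"CV before normalization: {raw_cv:.1f}%")
print(f"CV after normalization:  {normalized_cv:.1f}%")
```

In this example the variation in overall signal between replicates dominates the raw CV, and normalization removes most of it.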

4.3 General Biomedical Applications of SELDI-TOF MS

Similar to other affinity technologies, SELDI-TOF MS can be applied to various types of research projects where appropriate. It all depends on what chromatographic functional groups, affinity materials, or proteins are being conjugated covalently on the ProteinChip array surface. It can be used to capture and profile transcription factors by coating with DNA materials of a specific sequence [107]. It can also be used to study the effect of DNA methylation on binding the transcriptional factors to a DNA sequence [108]. Such an approach could help to characterize transcription factors and to screen for differences in cellular regulatory networks. When a specific protein is coated, it could be used to study protein–protein interaction [109, 110]. For example, it has been used to search for protein–protein interaction partners for S100A8 [109] and GlialCAM [110]. When the ProteinChip arrays are coated with a specific antibody, specific protein or protein complex can be purified for subsequent analysis. After a specific protein has been captured, SELDI-TOF MS can identify and provide quantitative information of individual variants. They could be structural variants with different amino acid compositions, e.g., amyloid beta peptide variants [111] and SAA variants [112], as well as variants with different post-translational modifications, e.g., S-glutathionylated and S-cysteinylated variants of transthyretin [113] and glycosylated variants of eosinophil cationic protein [114]. Above all, SELDI-TOF MS has been more commonly used for quantitative profiling of biological samples to search for protein biomarkers. Theoretically, SELDI-TOF MS can also be applicable to high-throughput quantification of other types of biomolecules, like metabolites and glycans. In the following sections the applications of SELDI-TOF MS to protein biomarker discovery will be reviewed in more detail.

4.4 High-Throughput Technology for Biomarker Discovery

In a typical SELDI-TOF MS experiment, biological specimens can be directly added on the ProteinChip arrays without any preprocessing, or only after several simple steps for denaturation and dilution. The ProteinChip arrays can be assembled in a 96-well plate format, and the binding and washing procedures can be performed as if carrying out an enzyme-linked immunosorbent assay. The use of the ProteinChip arrays has greatly simplified the protein profiling assay workflow. Another advantage of SELDI-TOF MS is its capacity for high throughput during mass spectrum acquisition, as in the case of MALDI-TOF MS. In addition, the combinations of specific m/z values and the type of ProteinChip arrays used provide unique identities for individual proteins. These are the major reasons why SELDI-TOF MS has been widely used for analysis of biological specimens, especially for biomarker discovery, since it first appeared as a commercially available platform in 1997. As of today, there are at least 850 publications on applications of SELDI-TOF MS to protein biomarker discovery. It has been used to analyze a wide range of biological samples, for example, serum [100, 103–106], plasma [110, 113, 115], urine [101, 113, 116], tears [117–119], cerebrospinal fluid [120–122], amniotic fluid [123–125], tissue/cell lysate [107, 126–128], etc. It has been applied to the discovery of potential biomarkers for various types of diseases, for example, cancers [103–106, 112, 129], infectious diseases [99, 100, 115, 130], autoimmune diseases [118, 131], eye diseases [117, 119], neurological diseases [120–122], perinatal and neonatal diseases [123–126], etc. In a typical proteomic profiling experiment the individual biological samples are first treated with urea and non-ionic solvent to disrupt the non-covalent protein–protein interactions, and then diluted in an appropriate binding buffer for subsequent SELDI-TOF MS analysis [99, 100, 103–106, 112, 115, 129]. In such an approach only a very small amount of biological sample is needed. For serum/plasma specimens, as little as 2 μL is enough [99, 129, 130].

One reason for the popularity of SELDI-TOF MS in biomarker discovery is that it is complementary to two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) for the quantitative analysis of intact proteins or their fragments formed in vivo. The most important point is that 2D-PAGE is the best for resolving proteins with molecular weight in the range of 10–250 kDa, while SELDI-TOF MS has the best resolving range of 1–20 kDa. Furthermore, highly hydrophobic proteins, such as membrane proteins, and proteins with isoelectric points (pI) less than 3 and higher than 11, are usually poorly resolved by 2D-PAGE, but can be satisfactorily analyzed by SELDI-TOF MS.

Another reason for the popularity of SELDI-TOF MS in biomarker discovery is that SELDI-TOF MS itself has a great potential to be used as a clinical diagnostic tool. The simplicity in the assay procedure and its short turn-around time allow this to be implemented in routine clinical chemistry laboratories. We could easily translate the laboratory research findings into clinical use. Diagnosis/prognosis could be based on the intensity of a SELDI peak at a specific m/z value and the usage of a specific type of ProteinChip arrays, or based on a combination of specific protein peaks which have been identified with the use of several specific types of ProteinChip arrays.

4.5 Host Response Proteins Forming the SELDI Proteomic Fingerprints

Proteins corresponding to the diagnostic/prognostic peaks could be purified by micro-scale chromatography with the same binding condition, separated by gel electrophoresis, and finally identified by using typical approaches, e.g., peptide mass fingerprint, tandem MS, etc. With clear protein identities, specific immunoassays could be developed. Now it is clear that the majority of disease-associated proteomic fingerprints are composed of intact forms, fragments, and/or post-translationally modified forms of host response proteins, such as apolipoprotein A1, apolipoprotein A2, apolipoprotein C1, apolipoprotein C2, apolipoprotein C3, alpha-1 antichymotrypsin, complement component 3a, complement component 3c, fibrinogen, immunoglobulin kappa light chain, inter-alpha trypsin inhibitor heavy chain 4, haptoglobin, beta-2 microglobulin, platelet factor 4, SAA, transthyretin, beta-thromboglobulin, etc. [99, 100, 132–139]. In fact, host response proteins were also identified as potential biomarkers when the serum/plasma proteomic profiles were compared by using other techniques, such as 2D-PAGE [140, 141], magnetic beads-based MALDI-TOF MS [24, 25], and even shotgun proteomic profiling by LC-ESI-MS [142–144]. The use of signatures of host-response proteins as disease biomarkers has both pros and cons. The major advantage is that the host response of a patient helps to amplify the signal for the presence of a particular disease, which may help to identify a disease at an early stage [132]. The major disadvantage is that the specificity of those host-response signatures should be carefully validated before they can be claimed as disease-specific biomarkers. Similar symptoms, which generate specific host-response protein signatures, can easily be found in other diseases.

4.6 Presence of Systemic Bias in Biomarker Discovery Studies

Although SELDI-TOF MS has been a popular technology for biomarker discovery, there have been doubts about the reliability of this technology. Before discussing this issue, I would like to emphasize that this problem should not be restricted to SELDI proteomic profiling studies. It has been observed in many SELDI-TOF MS studies, probably because SELDI-TOF MS was the first high-throughput technology that allowed quantitative profiling and comparison of the serum/plasma proteins in a large number of patient samples within a very short period of time. When much more serum proteomic/metabolomic profiling data obtained by using other technologies become available, I believe that similar problems will be observed. In this section, selected biomarker discovery studies employing SELDI-TOF MS technology will be used as examples for reviewing this issue.

The typical example is the application of SELDI-TOF MS to the diagnosis of ovarian cancer. In 2002, Petricoin et al. identified a pattern of SELDI peaks that could completely differentiate ovarian cancer cases from non-cancer cases in the training set. For the masked set, the diagnostic pattern achieved a sensitivity of 100% and specificity of 95% [145]. Unfortunately, when the data set was reanalyzed by other teams, it was found that there was significant non-biological experimental bias between the cancer and control subjects [146], and the features in the noise regions of the SELDI mass spectra allowed discrimination of control subjects from cancer patients [147, 148]. These analysis results suggested that (1) the cancer and control samples had been analyzed separately and (2) there was a change in the experimental protocol in the middle of the study.

Another important example is the identification of SELDI peaks for diagnosis of prostate cancer. In 2002, by using the copper(II) ion loaded metal affinity (Cu2+-IMAC) ProteinChip Array, a pilot single-center study showed that serum protein fingerprinting was useful for diagnosis of prostate cancer [149]. When classifying the blind test samples, the sensitivity and specificity were found to be 83% and 97%, respectively. Subsequently, a series of follow-up studies were performed to validate the value of serum proteomic profiling with Cu2+-IMAC ProteinChip arrays in the diagnosis of prostate cancer. To allow validation carried out by six research centers, a standard protocol and quality control system was developed [105]. Then serum samples (181 prostate cancer patients, 143 benign prostatic hyperplasia cases, and 220 normal controls, who were age and race-matched) from Eastern Virginia Medical School (EVMS) were used to construct a decision algorithm for classifying 42 prostate cancer patients and 42 normal controls provided by four institutions [150]. All test samples were distributed to six laboratories for analysis. The final conclusion was that the decision algorithm was unsuccessful in separating cancer from controls. Analysis of the experimental data for biomarker discovery indicated that the sample source is the major factor affecting the results.

4.7 Overcoming Systemic Bias in Biomarker Discovery Studies

Inappropriate selection of control subjects (i.e., selection bias) is one of the major causes of systemic bias. Selection bias and information bias will appear when the diseased and control subjects are recruited from two different populations, such as two different clinics. For example, three laboratories had attempted to use SELDI-TOF MS to identify the biomarkers for detection of severe acute respiratory syndrome (SARS) in adults [99, 151, 152]. In two of the three studies, control cases were recruited from other clinics [151, 152]. Patients with other types of respiratory infections had been included as the controls. SAA concentration was found to be significantly higher in the SARS patients than in the controls. One study included SAA in the diagnostic model for detection of SARS [152]. In the third study, which was performed by our team, both SARS patients and control subjects were recruited from the same clinics. The control subjects were suspected SARS cases, but were later proven to be negative for SARS [99]. In this study, both the SELDI-TOF MS assay and immunoassay showed that SAA was elevated in the SARS patients [153]. However, SAA levels were found to be much higher in the control group, indicating that SAA was not a useful biomarker for diagnosis of SARS [153]. These three studies clearly illustrate the importance of recruiting the diseased and control cases from the same clinic.

For case–control biomarker studies, confounding bias should also be controlled. If it is not controlled, the biomarkers found could be related to the characteristics of the disease group, but not related to the disease itself. Patients with gastroenterological cancer may lose appetite, leading to undernutrition [154]. Malnutrition will become one of the confounding factors, and some differential SELDI peaks could be related to the nutritional status. For hepatitis virus-related liver cancer, it is well known that gender is one of the confounding factors [155]. Smoking is a well-known confounding factor for lung cancer [156]. Levels of a considerable number of blood proteins can be changed in response to smoking [157]. For biomarker discovery, even though the diseased and control cases are recruited from the same clinics, unknown confounding factors still exist. In our recent gastric cancer study we attempted to use post-operative serum samples to verify the validity of the potential proteomic markers found by comparing the diseased and control cases from the same clinics [129]. Surprisingly, over 80% of the potential biomarkers did not show a reversal in the serum levels after the removal of the tumors from the patients. This strongly suggested that most of the differential SELDI peaks between the gastric cancer and control groups were not specifically associated with gastric cancer, but only associated with certain characteristics of the gastric cancer patients. In our SARS study we identified the clinical and biochemical variables which were significantly altered in the SARS patients, and attempted to verify the potential diagnostic SELDI peaks by only considering those that were significantly correlated with at least two disease-associated biochemical/clinical parameters as SARS-specific (Fig. 5a) [99]. Similar to the gastric cancer study, about 80% of the differential SELDI peaks were rejected in the SARS study. Both the gastric cancer study and the SARS study have highlighted the high risk of false discovery when we simply consider the differential SELDI peaks as potential biomarkers. This high proportion of differential peaks that are likely caused by confounding bias, about 80%, is not restricted to studies employing SELDI-TOF MS. Our group recently attempted to identify circulating host response biomarkers for diagnosis of late-onset sepsis or necrotizing enterocolitis in preterm infants suspected of having the diseases by using hydrophobic magnetic beads and MALDI-TOF MS [25]. By using the longitudinal samples to verify the clinical relevance of the differential proteomic features, again about 80% of them were rejected (Fig. 5b). Encouragingly, the diagnostic values of the verified proteomic features were subsequently confirmed in the prospective study [25].

Fig. 5

The study designs which were used to identify biomarkers for diagnosis of SARS in adults (a) and diagnosis of necrotizing enterocolitis/late-onset sepsis in preterm infants (b) by undertaking MS-based proteomic profiling approaches. In the SARS study, potential diagnostic SELDI peaks were filtered by only considering those that were significantly correlated with at least two disease-associated biochemical/clinical parameters as SARS-specific (a) [99]. In the preterm infant study, the longitudinal samples were used to verify the clinical relevance of the differential proteomic features [25]. Only the differential MS peaks showing a statistically significant reversal of peak intensities upon recovery were retained (b). In both studies, about 80% of the differential MS peaks, which were obtained by case–control comparison, were rejected

While one can reduce the systemic bias in a single center study by verification with longitudinal samples or by correlation with known disease-associated changes, one can also reduce the systemic bias by using samples from multiple centers [158]. Multi-center design provides an unbiased clinical validation of the proteomic diagnostic models. Zhang and Chan have proposed a multicenter design that helps to eliminate the systemic biases in samples and site-associated confounding variables in biomarker discovery [159]. In the biomarker discovery phase, cases from independent sites are used separately and independently to identify the potential biomarkers. The potential biomarkers from the different sites are cross-compared to produce a common set. In the validation phase, the clinical value of the common set is further validated using independent samples from additional sites. By using this multicenter design, a panel of ovarian cancer-associated protein biomarkers that were identified in blood samples by SELDI-TOF MS finally became the first in vitro diagnostic multivariate index assay (IVDMIA) of proteomic biomarkers, which was recently cleared by the US FDA (Food and Drug Administration) [158, 160, 161].
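
The cross-comparison step of the multicenter design can be sketched as follows. This is a simplified illustration, not the procedure used in the cited work: each site is represented only by a list of candidate peak m/z values, two peaks are treated as the same feature when they agree within a relative mass tolerance, and the 0.3% tolerance and all m/z values are hypothetical.

```python
def common_peaks(site_peak_lists, tolerance=0.003):
    """Return peaks of the first site that are matched at every other site.

    Two peaks match when their m/z values differ by no more than
    `tolerance` (a relative fraction) of the reference m/z.
    """
    reference, *others = site_peak_lists
    common = []
    for mz in reference:
        matched_everywhere = all(
            any(abs(mz - other_mz) <= tolerance * mz for other_mz in site)
            for site in others
        )
        if matched_everywhere:
            common.append(mz)
    return common


# Hypothetical candidate biomarker peaks (m/z) found independently at three sites.
site_a = [4153.0, 6632.0, 8936.0, 11720.0]
site_b = [4160.0, 6640.0, 9300.0, 11735.0]
site_c = [6628.0, 11700.0, 13880.0]

shared = common_peaks([site_a, site_b, site_c])
print("candidates common to all sites:", shared)
```

Peaks that are differential at only one site, and are therefore more likely to reflect site-specific bias, drop out of the common set before the validation phase.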

5 Future Perspectives

The global research efforts on the development and biomedical applications of MALDI-based technologies in the past 27 years have shown great promise in facilitating biomarker discovery and in clinical diagnostic applications. The concepts of MALDI, SALDI, SELDI, and PFM are complementary to each other. Theoretically, all these technologies can be combined, leading to the next generation of MALDI MS technologies. Although SELDI-TOF MS is commonly regarded as a proteomic fingerprinting technology, it should also be applicable to quantitative profiling of other types of biomolecules, such as glycans and metabolites, for biomarker discovery or identification of diagnostic fingerprints. Furthermore, SELDI can be coupled with TOF/TOF MS. Targeted quantification of specific metabolites, small proteins, and proteotypic peptides from large proteins can then be achieved by undertaking the PFM approach, while enrichment/purification procedures are much simplified. When the binding surface of a SELDI chip is made of materials that can also assist the laser desorption and ionization process (i.e., a combination of SELDI and SALDI), a SELDI-TOF/TOF MS setup will become an instrument for cost-effective measurement of biomarkers with ultrahigh throughput and high detection sensitivity and specificity. Ultimately, when a TOF or TOF/TOF analyzer can be miniaturized to a portable size without sacrificing resolution, medical diagnostic applications of MALDI-based technologies at the bedside or even at home will become possible.