Introduction

Glycosylation is one of the most ubiquitous post-translational modifications, occurring in approximately half of all eukaryotic proteins [1]. It is also the most diverse, as signified by the presence of a wide range of glycan structures as potential modifiers. At the cell surface, glycans exert their functions by modulating the extent and specificity of the interactions between glycoconjugates and their binding partners [2]. It has become increasingly evident that compositional glycan profiling is insufficient and detailed glycan structural determination is required to correlate their functional features with defined structural parameters. A prime example is the recent discovery that the sialyl Lewis x (sLex) tetrasaccharide sequence, rather than its linkage isomer, sialyl Lewis a (sLea), represents the major carbohydrate ligand for human sperm-egg binding [3]. Differential expression of sLea and sLex glycans was also observed during tumor progression and metastasis in cancers [4].

Glycomics is the comprehensive and systematic study of all types of glycans, whether free or in conjugated forms, that are present in a biological system [5, 6]. Although glycosylation often starts with common core structures, these undergo extensive, unscripted enzymatic modifications during passage through the endoplasmic reticulum and the Golgi apparatus, resulting in tremendous diversity in mature glycan structures, with variations in their composition, branching patterns, linkage positions, stereochemical configurations, and chemical substituents [7,8,9]. Such structural complexity and heterogeneity of glycans are the hallmarks of their template-free biosynthesis, and necessitate the development of analytical methods that combine detailed structural determination with on-line or off-line glycan separation for comprehensive glycome characterization. Preferably, the separation method should be able to resolve the isomeric structures that are often present in biological samples.

Capillary electrophoresis (CE) with laser-induced fluorescence (LIF) is routinely used for glycan analysis due to its high sensitivity, analysis speed, unmatched separation efficiency, and ability to quantify [10,11,12,13,14,15]. Because fluorescence-based detection does not provide definitive glycan identification, structural elucidation by CE-LIF often entails investigating the relative migration shifts following specific exoglycosidase digestions. Such an approach can suffer from poor reproducibility of electrophoretic migration times and from the limited coverage of CE databases of glycan standards. On the other hand, mass spectrometry (MS), and in particular, tandem mass spectrometry (MS/MS) and sequential tandem MS (MSn, n > 2), can be used to deduce the glycan structure from first principles without the need for glycan standards. Although the coupling of CE to MS has shown tremendous promise for characterization of isomeric glycans, compromises must be made in the CE-MS interface and in the choice of buffer additives, and this can lead to reduced CE separation and/or MS performance. Furthermore, derivatization with charged tag(s) is usually required for neutral glycans before electromigrative separation, and this prevents certain types of derivatization schemes, such as permethylation, from being applied for CE-MS analysis.

Ion mobility spectrometry (IMS) is another powerful tool that is well suited for separation of isomeric glycans, as isomers often have different gas-phase conformations, and hence different collisional cross sections and mobilities. Separation and characterization of isomeric glycan mixtures by IM-MS/MS has been demonstrated on several MS platforms [16,17,18,19,20,21,22,23,24,25,26]. However, IMS is not completely orthogonal to MS, and generally does not have a high peak capacity. Moreover, as a post-ionization method, IMS does not address the challenge of ionization suppression, and this limits the dynamic range of glycans that can be simultaneously characterized by IM-MS. IM-MS analysis can also be complicated by the presence of multiple gas-phase conformers for a single isomer.

Liquid chromatography (LC)-MS has long been the method of choice for analysis of complex mixtures, owing to the complementary nature of the LC and MS separations. There exist a wide variety of chromatographic modes that target different chemo-physical properties of analytes, allowing their separation based on their size, charge, polarity, and/or hydrophobicity. Resolution of isomeric glycans has been demonstrated with strong anion exchange (SAX) chromatography [27], high-pH anion exchange chromatography (HPAEC) [28,29,30], hydrophilic interaction chromatography (HILIC) [31,32,33], reversed phase LC (RPLC) [34,35,36], and porous graphitic carbon (PGC)-LC [37,38,39,40,41,42,43,44,45,46]. Among them, RPLC is the most widely used due to its robustness and reproducibility. RPLC also does not require the use of nonvolatile salts or other buffer additives in its mobile phase, and can be easily interfaced with MS analysis. In RPLC, an analyte is retained through noncovalent interactions between its nonpolar moiety and the hydrophobic stationary phase. Thus, hydrophilic glycans must be derivatized before they are suitable for RPLC analysis. This can be accomplished by introducing a hydrophobic tag via reductive amination or hydrazide labeling targeting the reducing end, but this derivatization strategy is not applicable towards reduced sugars, such as O-linked glycans released via reductive β-elimination. Alternatively, the hydrophobicity of glycans may be increased by permethylation. Permethylation also improves the ionization efficiency and gas-phase stability of labile glycans, facilitates differentiation of terminal and internal glycosidic fragments, and minimizes gas-phase structural rearrangements. Reduction is often performed in conjunction with permethylation to eliminate anomerism that leads to undesirable chromatographic peak splitting, which divides ion signals and complicates isomer separation [34].

RPLC-MS/MS and MSn analyses of permethylated glycans were first reported by Vouros and coworkers [47]. With low-energy, on-resonance collision-induced dissociation (CID) as the fragmentation mode, it was not always possible to obtain complete sequence coverage, and the authors acknowledged the need to perform additional MSn for linkage determination. The chromatographic resolution was limited, and no isomer separation was shown. In 2013, Ritamo and coworkers showed that, by raising the column temperature, increased resolution can be attained, allowing separation of reduced and permethylated glycan isomers by RPLC [34]. They further demonstrated that a pair of separated N-glycan isomers released from immunoglobulin gamma could be differentiated by CID- or higher-energy collisional dissociation (HCD)-MS/MS. The authors cautioned that the fragmentation data did not provide sufficient information to fully define the putative structures proposed. A 2016 study by Mechref and coworkers also reported improved isomer resolution by RPLC at elevated temperatures [35]. They attributed such improvement to the unfolding of the glycan structures at higher temperatures, allowing more effective interactions between the analyte and the stationary phase. Additionally, heating reduces diversity in the three-dimensional structures of glycans and leads to a more predictable molecular weight-retention time correlation, and this can facilitate identification of the glycans.

Permethylated glycans can also be analyzed by PGC-LC. In 2007, Costello and coworkers developed a PGC-based LC-MS platform for the analysis of permethylated glycan alditols and demonstrated, for the first time, separation and identification of glycan isomers [39]. The beam-type CID employed in that study produced more informative tandem mass spectra than on-resonance CID, permitting determination of the glycan sequence, branching patterns, and even linkages in some cases. A similar approach was recently adopted by Mechref and coworkers, who showed improved resolution of isomeric glycans by higher-temperature PGC-LC [46]. Reduction and permethylation made it possible to distinguish core fucosylation from terminal fucosylation based on the 30-Da mass shift introduced to the reducing-end N-acetylglucosamine (GlcNAc) residue. Differentiation between 1 → 3- and 1 → 4-linked galactoses (Gal) was also possible with CID and HCD due to the large difference in their bond energies. However, the fragmentation produced by CID/HCD was generally too sparse to allow comprehensive structural elucidation by MS/MS alone. Complementary retention time information may be required to make or validate details of the structural assignment, such as the location of the galactose residue on the 3- or the 6-arm of a biantennary glycan.

Because of the tendency of CID to preferentially break the most labile glycosidic bond(s), the amount of structural information produced by CID in a single stage of MS/MS is often limited. Several groups showed that electron activated dissociation (ExD) methods, such as electron capture dissociation (ECD) and electron transfer dissociation (ETD), can generate MS/MS spectral data complementary to CID [48,49,50,51,52,53]. ECD and ETD are only applicable towards multiply charged precursor ions, and they do not always produce complete sequence information, as the fragmentation is highly dependent on the nature and the binding pattern of the metal charge carriers [54]. These limitations can be overcome by activating the precursor ions with higher-energy electron irradiation, as employed in hotECD and in electronic excitation dissociation (EED) [53, 55]. EED, in particular, works with singly charged precursors and ions with a broad range of charge carriers, including a fixed charge [56]. The term EED was first introduced by Zubarev to describe a fragmentation process that is initiated by ionization and electron recapture [57]. EED is different from electron ionization dissociation (EID) which involves tandem ionization [58], and electron impact excitation of ions from organics (EIEIO) which proceeds via direct electronic and vibrational excitation [59]. EED is capable of producing extensive glycosidic and cross-ring fragments throughout the glycan sequence, for both native and derivatized glycans, including permethylated glycans and glycans derivatized with a reducing-end chromophore [55]. The rich structural information provided by EED was the major driving force behind the recent development of a de novo glycan sequencing algorithm that can accurately deduce glycan topologies based solely on their EED tandem mass spectra [60]. 
Coupling of EED-MS/MS to on-line mobility-based separation has been achieved via selective accumulation-trapped IMS (SA-TIMS), and successfully applied to isomeric glycan analysis [26]. However, the peak capacity, throughput, and dynamic range of the SA-TIMS-EED-MS/MS approach still need to be improved before this approach can be efficiently applied to the analysis of complex biological samples.

We have previously demonstrated the feasibility of on-line RPLC-EED-MS/MS analysis on a mixture of reduced and permethylated maltodextrin oligomers [55]. Here, we apply RPLC-EED-MS/MS to the characterization of isomeric glycan mixtures. Analysis of such mixtures poses a far greater challenge than a maltodextrin ladder, as it requires both higher chromatographic resolution for isomer separation and efficient EED fragmentation that can produce diagnostic ions for each isomer. We present the current benchmark and discuss ways to further improve the method towards the ultimate goal of comprehensive, high-throughput glycome characterization.

Experimental Section

Materials

Lacto-N-tetraose (LNT), lacto-N-neotetraose (LNnT), and lacto-N-fucopentaose I, II, and III (LNFP I, II, and III) were acquired from V-Labs, Inc. (Covington, LA). LNFP V and VI were purchased from Carbosynth Limited (Berkshire, UK). HPLC grade water, acetonitrile, chloroform, and formic acid were obtained from Fisher Scientific (Pittsburgh, PA). Methyl iodide, dimethyl sulfoxide (DMSO), sodium acetate, sodium borohydride (NaBH4), sodium borodeuteride (NaBD4), and acetic acid were purchased from Sigma-Aldrich (St. Louis, MO).

Sample Preparation

For reduction (or deutero-reduction), 2 μg of native glycans were dissolved in 200 μL of 0.1 M NH4OH/0.25 M NaBH4 (or NaBD4) solution and incubated for 2 h at room temperature. Acetic acid (10%) was added drop by drop until bubbling ceased. Reduced glycans were permethylated using the method developed by Ciucanu and Kerek with slight modifications [61, 62]. Dried reduced glycan powders were resuspended in 100 μL of NaOH/DMSO mixture and vortexed for 1 h at room temperature, followed by addition of 50 μL of methyl iodide and 1 h of gentle vortexing in the dark. Another 100 μL of NaOH/DMSO and 50 μL of methyl iodide were added to the reaction mixture, and the reaction was allowed to proceed for 1 h with vortexing. This process was repeated four times to ensure complete methylation, followed by addition of 200 μL of chloroform to quench the reaction. Excess salt was removed by washing with 400 μL of water, repeated several times until neutral pH was reached. The aqueous layer was discarded, and the permethylated glycans were recovered from the organic layer and dried with a SpeedVac system (ThermoFisher Scientific).

Liquid Chromatography and Mass Spectrometry Analysis

On-line RPLC separation was carried out on a nanoACQUITY UPLC system (Waters, Milford, MA). A nanoACQUITY UPLC 2G-VMTrap column (5 μm Symmetry C18, 180 μm ID × 20 mm) was used as the trapping column, and a nanoACQUITY UPLC Peptide BEH C18 column (1.7 μm, 150 μm ID × 100 mm) was used as the analytical column with the temperature kept at 60 °C for improved chromatographic performance. Mobile phase A consisted of 99% water, 1% acetonitrile, and 0.1% formic acid, and mobile phase B consisted of 1% water, 99% acetonitrile, and 0.1% formic acid. In the first 2 min, on-line desalting was carried out using the trapping column with 2% B at a flow rate of 4 μL/min. The analytical gradient started at 38% B for 2 min and was ramped linearly to 42% B over the next 46 min. It was then ramped up to 80% B within 2 min and kept at 80% B for 4 min before being dropped to 2% B over 2 min for column re-equilibration.
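The analytical gradient described above can be written as a piecewise-linear program. The following Python sketch is illustrative only: the breakpoints are transcribed from the text, time zero is assumed to be the start of the analytical gradient (that is, after the 2-min trapping step), and the names GRADIENT and percent_b are hypothetical.

```python
from bisect import bisect_right

# (time in min from the assumed start of the analytical gradient, %B)
GRADIENT = [
    (0.0, 38.0),   # start at 38% B
    (2.0, 38.0),   # hold for 2 min
    (48.0, 42.0),  # linear ramp to 42% B over 46 min
    (50.0, 80.0),  # ramp to 80% B within 2 min
    (54.0, 80.0),  # hold at 80% B for 4 min
    (56.0, 2.0),   # drop to 2% B over 2 min for re-equilibration
]

def percent_b(t_min):
    """Return %B at time t_min by linear interpolation between breakpoints."""
    if t_min <= GRADIENT[0][0]:
        return GRADIENT[0][1]
    if t_min >= GRADIENT[-1][0]:
        return GRADIENT[-1][1]
    times = [t for t, _ in GRADIENT]
    i = bisect_right(times, t_min) - 1
    (t0, b0), (t1, b1) = GRADIENT[i], GRADIENT[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)

# Slope of the shallow separation segment, in %B per minute
separation_slope = (42.0 - 38.0) / 46.0
```

The separation segment changes the mobile phase by only about 0.087% B per minute, which is the part of the program in which the isomers elute.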

MS/MS analyses were performed on a 12-T solariX™ hybrid Qh-Fourier-transform ion cyclotron resonance (FTICR) mass spectrometer (Bruker Daltonics, Bremen, Germany) equipped with an Advion Triversa nanoMate™ system (Ithaca, NY). Glycan standards were introduced into the mass spectrometer either via direct infusion at ~ 5 pmol/μL concentration or through LC (2 pmol per injection). Off-line CID experiments were conducted with the collision energy set at 47 eV. On-line LC-EED-MS/MS analysis was performed with the cathode bias set between 16 and 18 V, and an electron irradiation time of 0.5 s. Auto MS/MS was performed by alternating one MS scan with two MS/MS scans. An inclusion list was used without dynamic exclusion to allow the sodiated precursor ions to be repeatedly selected for fragmentation. A 0.5-s transient was acquired for each mass spectrum. For each LC-MS/MS run, a single MS/MS spectrum was chosen for internal calibration, using several fragments assigned with high confidence, typically Y-, Z-, and 1,5X ions. The same calibration constants were then automatically applied to all mass spectra acquired on the same LC run, resulting in a typical mass assignment accuracy of < 0.5 ppm for internally calibrated spectra, and < 1 ppm for all others. Peak picking was performed using the SNAP™ algorithm (Bruker Daltonics, Bremen, Germany) with the quality factor threshold set at 0.1, S/N cutoff at 5, maximum charge at 4, and a relative ion abundance threshold at 0.01%. The precursor ion elemental composition was chosen to calculate the averagine formula, with one Na as an additional constant unit. Peak assignment was achieved with the assistance of a custom VBA (Visual Basic for Applications) program, and ChemDraw 16.0.
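To make the mass accuracy figures quoted above concrete, the short Python sketch below converts between a ppm error and an absolute m/z window. It is not the authors' VBA program; the function names and the example measured value are hypothetical and chosen only for illustration.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6

def ppm_window(theoretical_mz, tol_ppm):
    """Half-width in m/z units of a +/- tol_ppm window around theoretical_mz."""
    return theoretical_mz * tol_ppm * 1e-6

def within_tolerance(measured_mz, theoretical_mz, tol_ppm=1.0):
    """True if the measured value lies within +/- tol_ppm of the theoretical value."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm

# Hypothetical example: a peak measured 0.0004 m/z units above a theoretical
# value of 942.4880 is about 0.42 ppm off and would pass a 1-ppm filter.
example_error = ppm_error(942.4884, 942.4880)
example_ok = within_tolerance(942.4884, 942.4880, tol_ppm=1.0)
```

At m/z 942.488, a 1-ppm tolerance corresponds to a window of roughly ±0.0009 m/z units, and a 0.5-ppm tolerance to about half of that.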

Results and Discussion

Two sets of isomeric glycan standards were studied here, with their structures shown in Figs. 1 and 3, insets. The first set contains two linear tetrasaccharide isomers, LNT (Galβ1 → 3GlcNAcβ1 → 3Galβ1 → 4Glc) and LNnT (Galβ1 → 4GlcNAcβ1 → 3Galβ1 → 4Glc), that differ in only one linkage position. The second set includes five pentasaccharide isomers with either linear or branched structures. Among them, LNFP I, II, and V are topological isomers resulting from the addition of a fucose (Fuc) residue to the LNT structure, at the non-reducing-end Gal, GlcNAc, and the reducing-end Glc residue, respectively. LNFP III and VI are derived from the LNnT structure and are linkage isomers to LNFP II and V, respectively. All glycans were reduced and permethylated before LC-MS analysis.
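As a consistency check on the precursor ion of the tetrasaccharide set, the [M + Na]+ m/z of a reduced and permethylated Hex3HexNAc1 glycan can be computed from its elemental composition. The Python sketch below is a minimal illustration under stated assumptions: the residue formulas are those of fully permethylated in-chain residues, the terminal group C3H10O accounts for the two methylated termini plus the 2 H and one extra methylation site introduced by reduction, and all names (RESIDUES, sodiated_mz, and so on) are hypothetical.

```python
from collections import Counter

# Monoisotopic atomic masses (Da)
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
ELECTRON = 0.00054858
NA_CATION = 22.9897693 - ELECTRON  # Na atom minus one electron

# Elemental compositions of fully permethylated in-chain residues
RESIDUES = {
    "Hex": Counter(C=9, H=16, O=5),
    "HexNAc": Counter(C=11, H=19, N=1, O=5),
}

# Termini of a reduced, permethylated glycan:
# CH3 (non-reducing end) + OCH3 (reducing end) + 2 H and one extra CH2 (alditol)
REDUCED_PERMETHYLATED_TERMINI = Counter(C=3, H=10, O=1)

def sodiated_mz(composition):
    """Monoisotopic m/z of [M + Na]+ for a reduced, permethylated glycan."""
    formula = Counter(REDUCED_PERMETHYLATED_TERMINI)
    for residue, count in composition.items():
        for element, n in RESIDUES[residue].items():
            formula[element] += n * count
    neutral = sum(MONO[element] * n for element, n in formula.items())
    return neutral + NA_CATION

lnt_mz = sodiated_mz({"Hex": 3, "HexNAc": 1})
```

For LNT and LNnT (Hex3HexNAc1) this evaluates to approximately 942.488, in agreement with the precursor m/z of 942.48 given in the caption of Fig. 1.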

Figure 1

On-line nanoRPLC-EED-MS/MS (16 eV) analysis of a mixture of reduced and permethylated LNT and LNnT. (a) EIC of the precursor ion, [M + Na]+, m/z 942.48. (b, c) Single-scan EED spectra acquired near the peak of elution for each separated isomer, as marked by arrows in (a). Diagnostic fragments are labeled in color. The double dagger sign indicates two hydrogen losses from the canonical structure. Multiple assignments are listed one above another

NanoLC-MS/MS Analysis of Reduced and Permethylated LNT and LNnT

It is often advantageous to analyze glycans in their metal-adducted forms, as they tend to produce more extensive fragmentation than protonated species. Furthermore, protonated glycans can undergo gas-phase structural rearrangements, even after permethylation, and this can lead to erroneous structural determination [63,64,65,66,67,68,69]. In RPLC, however, eluted glycans are most commonly ionized via protonation or ammonium adduction, with ammonium adducts readily converted to protonated species upon collisional activation. It was shown that formation of sodium adducts could be promoted by adding sodium hydroxide or sodium acetate to the mobile phase [34, 70], or by spiking a high concentration of sodium salt into the glycan sample before injection [46, 71]. Here, we adopted the latter approach to avoid continuous flow of sodium salt into the mass spectrometer and the need for frequent source cleaning, as the majority of the salt added to the sample went to waste during trapping. Supporting Figure S1 shows a mass spectrum from an LC-MS run of a mixture of LNT and LNnT, with pre-injection incubation in 25 mM of sodium acetate for 3 min with vortexing. Sodium adducts are the most abundant ions detected, with some protonated species and ammonium adducts also present. For analysis of biological samples, to mitigate signal dilution and to improve sensitivities, further concentrating the ion signals into sodium adducts via addition of sodium salt to the mobile phase may be necessary.

Figure 1a shows the extracted ion chromatograms (EICs) of reduced and permethylated LNT and LNnT ([M + Na]+). These two linkage isomers were baseline resolved by nano C18 LC, with each exhibiting a single peak, as expected for reduced glycans. Figure 1b, c shows the single-scan EED spectra acquired at two retention times near the peak of elution for each isomer. Lists of all assigned peaks for these two spectra can be found in Supporting Tables S1 and S2. Complete series of Z-, Y-, and 1,5X-ions are observed in both EED spectra, and they form triplets with characteristic spacing: 18.011 Da (H2O) between the neighboring Z- and Y-ions and 27.995 Da (CO) between the adjacent Y- and 1,5X-ions. This is similar to what was observed on EED of glycans with a reducing-end fixed charge label [56]. Here, without the fixed charge label on the reducing-end, many more non-reducing-end fragments are observed. For example, near complete series of 1,5A-, B-, and C-ion triplets are present in the spectra, but with the spacing in reversed order, and this allows differentiation of non-reducing- and reducing-end fragments. They form “golden pairs” with the Z-, Y-, and 1,5X-ion triplets for more confident glycan sequencing.
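The triplet spacing described above lends itself to a simple search over a centroided peak list. The Python sketch below is only an illustration of the idea and is not the sequencing software cited in this work: it assumes singly charged ions, uses a hypothetical absolute tolerance of 0.002 Da, and takes the non-reducing-end spacing to be the reducing-end spacing in reversed order, as stated in the text. The function name and the synthetic peak list are invented for the example.

```python
H2O = 18.010565  # Da, spacing between neighboring Z- and Y-ions
CO = 27.994915   # Da, spacing between adjacent Y- and 1,5X-ions

def find_triplets(mzs, first_gap, second_gap, tol=0.002):
    """Return (low, mid, high) m/z triplets with the requested spacings."""
    peaks = sorted(mzs)
    triplets = []
    for low in peaks:
        for mid in peaks:
            if abs((mid - low) - first_gap) > tol:
                continue
            for high in peaks:
                if abs((high - mid) - second_gap) <= tol:
                    triplets.append((low, mid, high))
    return triplets

# Synthetic peak list (not experimental data)
peaks = [300.0, 327.9949, 346.0055, 500.0, 518.0106, 546.0055, 600.0]

# Reducing-end pattern: Z -> Y (+H2O) -> 1,5X (+CO)
reducing_end = find_triplets(peaks, H2O, CO)
# Non-reducing-end pattern: spacing in reversed order (+CO, then +H2O)
non_reducing_end = find_triplets(peaks, CO, H2O)
```

Because the two patterns differ only in the order of the gaps, the same routine separates reducing-end from non-reducing-end candidates simply by swapping its arguments.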

Because glycans are built from a smaller number of building blocks than peptides, yet they can generate a greater number of fragments per residue, isobaric fragments are commonly present in glycan tandem mass spectra. This creates a major challenge for de novo glycan sequencing, as some cross-ring, internal or secondary fragments can be misinterpreted as (glycosidic) sequence ions. Our earlier work showed that a given peak assignment may be confirmed (or cast out) by considering its contextual feature, defined as a collection of its neighboring peaks with characteristic mass shifts and relative abundances. An IonClassifier that calculates the probability of an ion’s being a certain type, e.g., a Y-ion, can be constructed via machine learning by investigating the contextual features of the same type of fragments in the tandem mass spectra of glycan standards [60]. The Z-/Y-/1,5X- and 1,5A-/B-/C-triplets are consistently observed in the glycan EED tandem mass spectra, with well-defined spacing independent of the derivatization scheme (permethylated or native), linkage position, and residue type (with the exception of sialic acids); thus, they are weighted heavily towards the IonClassifier score. Other contextual features, such as Z-CH2OCH3 and Z-NCH3Ac (discussed below), are also included in the IonClassifier through training, albeit with lower weights because they are residue- and/or linkage-dependent and not universally observed.

It was suggested that EED proceeds via an ionization/electron recapture mechanism through an open-ring di-radical intermediate, with the C1-C2 di-radical being energetically favored due to the stabilization of the C1 radical by the glycosidic and ring oxygen atoms [72] (Supporting Scheme S1). Proposed reaction pathways from various di-radicals to B•, C•, Y•, Z•, 1,5A••, and 1,5X ions are shown in Scheme 1. The B• and Z• ions can subsequently lose a hydrogen atom in the alpha position to form even-electron B and Z ions, whereas hydrogen loss from C• and Y• ions generates C-2H (or C‡) and Y-2H (or Y‡) ions, respectively, where the double dagger symbol indicates that the ion contains two fewer hydrogens than the C or Y ion as defined in the Domon-Costello nomenclature [73]. Intramolecular hydrogen transfer before product separation may also occur, leading to the formation of C, Y, and Z + 2H (or Z″) ions, where the double prime symbol indicates two hydrogen gains from the canonical Z structure. The 1,5A•• di-radical can either lose two alpha hydrogens to form a stable 1,5A-2H (or 1,5A‡) ion, or form a closed-ring 1,5A ion via radical recombination. The 1,5X ions are formed as stable closed-ring species, and are often among the most abundant ions observed in the EED spectra. Because EED is primarily a radical-driven process, it is capable of generating extensive fragmentation throughout the glycan sequence. In contrast, glycan CID tandem mass spectra are often dominated by a few fragments resulting from the cleavage of the most labile glycosidic bond(s). Figure 2 shows the 47 eV CID spectra of reduced and permethylated LNnT and LNT, acquired via direct infusion with 24 transients averaged. For LNnT, the two most abundant fragment ions, B2 and Y2, both resulting from the GlcNAcβ1 → 3Gal glycosidic bond cleavage, contributed to > 96% of the total fragment ion signals. For LNT, the two most abundant fragments, Y2 and Z3, made up around 91% of the total fragment ion signals. 
Other sequence ions were produced with much lower efficiency, e.g., Y3 from LNT amounted to less than 0.2% of the total ion counts. For complex biological samples that must be analyzed by on-line LC-MS/MS, where signal averaging is not a viable option, these low-abundance sequence ions may not be detected, especially for minor glycan species. Moreover, there are few contextual features for these sequence ions to validate their assignments.

Scheme 1

Proposed EED mechanisms for the formation of B-, C-, Y-, Z-, 1,5A-, and 1,5X ions

Figure 2

Off-line CID (47 eV) tandem mass spectra of reduced and permethylated LNnT (a) and LNT (b)

Whereas the presence of glycosidic and 1,5-cross-ring fragments allows accurate elucidation of the glycan sequence (topology), they provide no information on the linkage positions. For tandem MS-based glycan structural characterization, linkages have usually been inferred from the presence and/or absence of diagnostic cross-ring fragments. In an earlier study where these two linkage isomers were analyzed by SA-TIMS-EED-MS/MS in the non-reduced, permethylated form, linkage-informative 3,5A2 and 1,3A2 ions were observed for the 1 → 4-linked LNnT and the 1 → 3-linked LNT, respectively, allowing confident differentiation of the two isomers [26]. Supporting Figure S2 shows the zoomed-in regions of the EED spectra of permethylated, 18O-labeled LNnT and LNT, where the 3,5A2 (m/z 329.157) and 1,3A2 (m/z 356.168) ions are observed. In the present study, the 3,5A2 ion (S/N 44), along with another 1 → 4-linkage-diagnostic cross-ring fragment, 0,3X2 (m/z 620.325, S/N 18), is observed in the EED spectrum of the isomer that eluted first, suggesting that it is LNnT. Although the monoisotopic peak of the 1,3A2 ion (m/z 356.167, S/N 8) is also present in the EED spectrum of the latter eluting species, it was not identified by the peak picking algorithm, due to the absence of its A + 1 isotopic peak.

Because the LNT-specific 1,3A2 ion is too low in abundance for reliable linkage assignment, other regions of the EED spectra were examined in search of linkage-diagnostic fragments. One such region is populated with several secondary fragments from Z3. For the latter eluting species, presumably LNT, a Z3-NCH3Ac (Ac = CH3CO) ion (m/z 635.324, S/N 42, Supporting Fig. S3a) was observed. This ion was likely generated via direct alpha cleavage from the C3 radical, leading to the loss of the entire C2 substituent (Scheme 2a), and could be diagnostic for the 1 → 3-linked GlcNAc residue. Another diagnostic ion for LNT is 0,4X2/Z3″ (m/z 634.340, S/N 164), which was likely produced from the C3 radical via consecutive alpha cleavages, followed by intramolecular H transfer (Scheme 2b). Note that the 0,4X2/Z3″ ion contains two more hydrogens than the 0,4X2/Z3 ion (m/z 632.325) produced by CID (Fig. 2b), suggesting that it was not formed via an ergodic process. Although the m/z values of Z3-NCH3Ac and the A + 1 isotope peak of 0,4X2/Z3″ differ by only 0.019 Da, these two peaks can be easily differentiated by the FTICR MS measurement. For LNnT, consecutive alpha cleavages from the C4 radical would result in the formation of a Z3-Ac ion (Scheme 2c). The Z3-Ac ion (m/z 664.352, S/N 150, Supporting Fig. S3b) is indeed observed in the EED spectrum of the first eluting species, presumably LNnT. However, unlike the Z3-NCH3Ac ion, which is only observed in the LNT spectrum, the 0,4X2/Z3″ and Z3-Ac ions are present in both EED spectra. Formation of 0,4X2/Z3″ in LNnT and Z3-Ac in LNT likely follows radical migration (via 1,2-H migration) prior to alpha cleavages, and this process is associated with a considerable activation barrier [54]. Therefore, the abundance ratio of 0,4X2/Z3″ to Z3-Ac shows substantial difference, ranging from 0.6 in the LNnT spectrum to 16.9 in the LNT spectrum, and can also be used to distinguish 1 → 3- and 1 → 4-linked GlcNAc residues.
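The 0.019 Da separation quoted above, and the resolving power needed to see it, can be reproduced with a few lines of arithmetic. The Python sketch below is a rough estimate only: it assumes singly charged ions, takes the A + 1 peak to be shifted by the 13C/12C mass difference, and treats the required resolving power simply as m/Δm for the two peak centers. The function names are hypothetical.

```python
C13_MINUS_C12 = 1.0033548  # Da, mass difference between 13C and 12C

def a_plus_1(monoisotopic_mz, charge=1):
    """m/z of the first isotopic (A + 1) peak, assuming one 13C substitution."""
    return monoisotopic_mz + C13_MINUS_C12 / charge

def required_resolving_power(mz_1, mz_2):
    """Estimate m/(delta m) needed to separate two peak centers."""
    return ((mz_1 + mz_2) / 2.0) / abs(mz_1 - mz_2)

z3_nch3ac = 635.324   # Z3-NCH3Ac, from the text
x2_z3_mono = 634.340  # monoisotopic 0,4X2/Z3'' ion, from the text

x2_z3_a1 = a_plus_1(x2_z3_mono)
delta = x2_z3_a1 - z3_nch3ac
resolving_power = required_resolving_power(x2_z3_a1, z3_nch3ac)
```

Under these assumptions the two peaks are about 0.019 m/z units apart, which corresponds to a resolving power on the order of 33,000 at m/z 635.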

Scheme 2

Proposed EED mechanisms for the formation of various linkage-diagnostic fragments at a GlcNAc site

Other types of secondary fragments from Z ions were also generated, including Z-CH3, Z-OCH3, and Z-CH2OCH3, with the proposed mechanisms shown in Supporting Scheme S2. It was previously reported that for native glycans, Z-CH2OH is preferentially formed at 1 → 4-linked residues (loss of the C6 substituent), whereas Z-OH is the favored product at 1 → 3-linked residues [56]. For permethylated glycans, preferential loss of the C6 substituent, or formation of Z-CH2OCH3, at 1 → 4-linked sites is not observed. This is perhaps not surprising, given that a methoxy radical (OCH3) is far more stable than a hydroxyl radical (OH) due to the hyperconjugative interactions between the methyl group and the oxygen radical [74]. Thus, loss of the C3 substituent (Z-OCH3) from a C4 radical competes more favorably with loss of the C6 substituent (Z-CH2OCH3) in permethylated glycans than in native glycans (Z-OH versus Z-CH2OH). Nevertheless, it appears that the abundance ratio of Z3-CH2OCH3 (m/z 662.336) to Z3-OCH3 (m/z 676.351) is still far greater in the LNnT spectrum than in the LNT spectrum (Supporting Fig. S3b). It is important to note that because Z3-CH2OCH3 is isomeric to 1,5A3, reducing-end isotope labeling is required for peak assignment validation. The bottom panels of Supporting Fig. S3b show the EED spectra of deutero-reduced, permethylated LNnT and LNT, respectively. It is evident that the vast majority of ions that contributed to the peak at m/z 662.336 were reducing-end fragments, as this peak was shifted almost entirely to m/z 663.343 in the deuterium-labeled spectra. With deuterium labeling, the abundance ratio of Z3-CH2OCH3 to Z3-OCH3 can be accurately determined, and is more than six times higher in the LNnT spectrum (0.51) than in the LNT spectrum (0.08). 
Therefore, the general trend that it is easier to lose the C6 substituent from a C4 radical than from a C3 radical still holds true for permethylated glycans, and such knowledge can be useful for relative abundance-based linkage determination. On the other hand, the Z-CH3 fragments can only be formed after radical migration for all linkage configurations, and thus have little value in linkage analysis. Nonetheless, Z-CH3 product ions, like other secondary fragments from the Z ion, provide additional peak contexts that can be used to boost the IonClassifier scores of true Z and Y ions during topology reconstruction.
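The reducing-end isotope-labeling argument above rests on a simple mass shift, which the following Python sketch reproduces. It is an illustration under stated assumptions: ions are singly charged, deutero-reduction introduces exactly one deuterium at the reducing end, and the function name is hypothetical. The ratio values are those reported in the text.

```python
D_MINUS_H = 2.01410178 - 1.00782503  # Da, mass difference between 2H and 1H

def deutero_shifted(mz, n_deuterium=1, charge=1):
    """m/z of a fragment that retains n_deuterium labels from deutero-reduction."""
    return mz + n_deuterium * D_MINUS_H / charge

unlabeled_peak = 662.336  # Z3-CH2OCH3, isomeric to 1,5A3 (from the text)

# A reducing-end fragment (Z3-CH2OCH3) carries the label and shifts;
# a non-reducing-end fragment (1,5A3) does not and stays at 662.336.
reducing_end_mz = deutero_shifted(unlabeled_peak)
non_reducing_end_mz = deutero_shifted(unlabeled_peak, n_deuterium=0)

# Abundance ratios of Z3-CH2OCH3 to Z3-OCH3 reported in the text
ratio_lnnt, ratio_lnt = 0.51, 0.08
fold_difference = ratio_lnnt / ratio_lnt
```

The predicted shifted position, about 663.342, agrees with the observed m/z 663.343 to within rounding, and the reported ratios differ by a factor of roughly 6.4.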

Another ion of interest is an internal fragment (m/z 268.115, Supporting Fig. S3c) that is tentatively assigned as C2/Z3. Scheme 2d shows the proposed mechanism for the formation of C2/Z3 via direct alpha cleavages from the C1–C2 di-radical at the GlcNAc site in LNT. For LNnT, formation of C2/Z3 requires radical migration. Consequently, the relative abundance of the C2/Z3 ion in the LNnT spectrum is 38 times lower than in the LNT spectrum. Note that the C2/Z3 ion is essentially an internal HexNAc (N-acetylhexosamine) ion; thus, its prominent presence only indicates the existence of a 1 → 3-(mono)-linked HexNAc residue, but does not specify its location within the glycan sequence. Nonetheless, because there is only one HexNAc residue in each linkage isomer studied here, the propensity for formation of the HexNAc internal ion can serve as an additional indicator for differentiation of LNT and LNnT.

NanoLC-MS/MS Analysis of Reduced and Permethylated LNFP I, II, III, V, and VI

Figure 3a shows the EIC of the precursor ion (m/z 1117.58, [M + Na]+) from an LC-MS analysis of a mixture of reduced and permethylated LNFP I, II, III, V, and VI (black trace), as well as the overlaid EICs of the same precursor from LC-MS analyses of each individual isomer. Although most isomers exhibited a single chromatographic peak, LNFP VI gave rise to two peaks: a major one centering around a retention time (R.T.) of 30.9 min and a minor one around an R.T. of 32.9 min. It is possible that these two peaks resulted from different conformations rather than impurities in the glycan standard, as their EED spectra are quite similar. With five isomers eluting within a 7-min window, some isomers were only partially resolved. In particular, the elution profiles of the linkage isomers LNFP II and III have significant overlap, resulting in a broad peak centering around an R.T. of 33.6 min. Figures 4a–e show the single-scan EED spectra acquired at five different retention times, including two acquired at the rising (R.T. 33.10 min) and falling edges (R.T. 34.17 min) of the unresolved peak. Nonetheless, these five spectra exhibited considerable differences in their fragmentation patterns and can be used to assign the structures at each elution time, from the earliest to the latest, to LNFP I, VI, III, II and V, respectively. These assignments are consistent with the elution profiles of individual glycan standards. For ease of discussion, hereafter we will refer to each EED spectrum by the structure from which it was produced rather than the retention time when it was acquired. Because of the constraint of space, only peaks with high abundance or those useful for topology and linkage determination are labeled in the spectra. Complete lists of assigned peaks with their relative abundances, S/N ratios and ppm errors can be found in Supporting Tables S3 to S7.

Figure 3

On-line nanoRPLC-EED-MS/MS analysis (18 eV) of a mixture of reduced and permethylated LNFP glycans. Vertical dashed lines mark the time points when the EED spectra shown in Fig. 4a–e were acquired. (a) Black trace: EIC of the precursor ion, [M + Na]+, m/z 1117.58. Overlaid colored traces: EICs of the precursor ion from LC runs of individual standards. (b) Y (FucGalGlc) ions were produced only when LNFP VI and V were eluting. (c) C/Z (GlcNAc) ions were produced only when LNFP I and V were eluting. (d) C/Z (FucGlcNAc) ions were produced only when LNFP II was eluting. (e) C/Z (GalGlcNAc) ions were most abundantly formed in LNFP III but could also be produced in LNFP I and V via radical migration. LNFP VI can produce a C-OCH3 ion that has the same m/z value as the C/Z (GalGlcNAc) ion. The oscillations in the EICs are caused by switching between the MS and MS/MS acquisition modes.

Figure 4

Single-scan EED spectra of LNFP glycans acquired at five different elution times as marked in Fig. 3, with the major eluting species shown in the top left corner of each panel. Diagnostic fragments are labeled in a color matching that of the EIC trace of the precursor in Fig. 3a. The Z-, Y-, and 1,5X-triplets are labeled in orange. The double dagger sign indicates two hydrogen losses from the canonical structure. Multiple assignments are listed one above another.

Rather than trying to interpret every single peak in the LNFP spectra, we focus our discussion on fragments that are important for isomer differentiation. The ions of interest are each labeled in a color that matches the EIC trace of the respective precursor in Fig. 3a. For topology analysis, complete series of Z-, Y-, and 1,5X-triplets (labeled in orange) are present in all LNFP spectra, as well as the majority of 1,5A-, B-, and C-ions. One complicating factor for topology analysis of branched structures is the presence of internal fragments resulting from the loss of more than one of the non-reducing-end branches. These are mainly Z/Z″-ions, produced via β-elimination of an entire non-reducing-end branch from Z radicals (Supporting Scheme S3) [56, 75]. Z/Z″-ions are most frequently observed at the branching sites, and the appearance of high-abundance Z/Z″-ions could facilitate determination of the branching pattern. Importantly, internal fragments do not share the same contextual features as reducing-end glycosidic fragments. They can therefore be easily differentiated from terminal fragments based on their peak contexts, in addition to the number of naked glycosidic oxygens (hydroxyl groups) exposed by each glycosidic bond cleavage.

A potential issue in MS/MS-based glycan sequencing is the occurrence of gas-phase structural rearrangements during ion activation, most commonly observed in the form of fucose migration. Although permethylation and metal adduction are reported to eliminate structural rearrangements in CID, it is important to examine whether this is also true in EED. A quick glance at Fig. 4 seems to suggest that fucose transfer from a GlcNAc residue to the reducing-end hexose might have occurred, as the LNFP V- and VI-specific Z (m/z 636.331) and Y (m/z 654.341) ions are also seen in the EED spectra of LNFP II and III. However, this is not the case: the LNFP III spectrum was acquired at R.T. 33.1 min, near the time when the minor LNFP VI species eluted, while the LNFP II spectrum was acquired at R.T. 34.17 min, when elution of LNFP V had already begun. Because the elution profiles of these five isomers overlap in several positions, a better way to gauge whether and to what extent structural rearrangement took place is to compare the EICs of diagnostic fragments to the precursor EICs of each individual isomer. Figure 3b shows the EIC at m/z 654.341, with vertical dashed lines marking the positions where the spectra shown in Fig. 4a–e were acquired. It is clear that the ion at m/z 654.341 observed in the EED spectra of LNFP III and II was produced from the co-eluting LNFP VI and V, respectively, as its EIC closely follows the elution profiles of LNFP VI and V, with a nearly flat baseline within the retention time window when only LNFP III and II were eluting. Thus, structural rearrangement does not appear to occur during EED of metal-adducted, permethylated glycans.
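The fragment-EIC comparison described above can be sketched in a few lines of Python. This is a minimal illustration rather than the processing used in this work: the scan representation (a retention time paired with a list of (m/z, intensity) peaks), the function names, and the 10 ppm tolerance are all assumptions made for the example, and the intensities are invented.

```python
from typing import List, Tuple

# Hypothetical scan layout: (retention time in min, [(m/z, intensity), ...])
Scan = Tuple[float, List[Tuple[float, float]]]


def ppm_window(target_mz: float, tol_ppm: float) -> Tuple[float, float]:
    """Return the (low, high) m/z bounds for a given ppm tolerance."""
    delta = target_mz * tol_ppm * 1e-6
    return target_mz - delta, target_mz + delta


def extract_eic(scans: List[Scan], target_mz: float,
                tol_ppm: float = 10.0) -> List[Tuple[float, float]]:
    """Sum the intensity inside the m/z window for every scan."""
    low, high = ppm_window(target_mz, tol_ppm)
    eic = []
    for rt, peaks in scans:
        intensity = sum(i for mz, i in peaks if low <= mz <= high)
        eic.append((rt, intensity))
    return eic


# Invented MS/MS scans; only the m/z values are taken from the text.
scans = [
    (30.9, [(654.341, 100.0), (636.331, 50.0)]),
    (33.6, [(472.215, 80.0)]),
    (34.2, [(654.343, 40.0)]),
]
eic_654 = extract_eic(scans, 654.341)
```

A flat trace for a fragment across the window in which only one isomer elutes, as in the middle scan of this toy data set, is the kind of evidence used above to attribute the m/z 654.341 ion to co-eluting species rather than to rearrangement.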

For linkage analysis, the 3,5A and 3,5A2 ions (m/z 329.157) diagnostic for the Gal1 → 4GlcNAc linkage are observed in the EED spectra of LNnT-derived LNFP VI and III, respectively. For the LNT-derived LNFP isomers, the Gal1 → 3GlcNAc linkage-specific 1,3A3 ion (m/z 530.257) is observed only in the LNFP I spectrum, whereas the 1,3A2 ion (m/z 356.168) is not identified in the EED spectra of LNFP V and II by the SNAP algorithm because its A + 1 isotope is absent. Although diagnostic cross-ring fragments are not always present in EED spectra, there are many internal and secondary fragments that can assist with linkage assignment. As in LNT, a high-abundance C/Z (GlcNAc) internal fragment (m/z 268.115) is observed in the EED spectra of LNFP I (C3/Z3) and V (C/Z), both containing a 1 → 3-monolinked GlcNAc residue, with S/N values of 149 and 154, respectively. For LNFP VI with a 1 → 4-monolinked GlcNAc, the C/Z ion was generated at a much lower abundance, with an S/N of 15. For LNFP II and III, where GlcNAc is present at a branched site, loss of the C3 substituent generated a C2/Z (FucGlcNAc) ion (m/z 442.205, S/N 550) and a C2/Z (GalGlcNAc) ion (m/z 472.215, S/N 479), respectively, both with the C4 substituent still attached. Loss of the C4 substituent via radical migration is possible, but to a much smaller degree. The abundance ratio of the C2/Z ion at m/z 472.215 to the C2/Z ion at m/z 442.205 ranges from 0.095 in LNFP II to 12.4 in LNFP III and can be used to differentiate these two linkage isomers. The C/Z internal fragments, though, do not provide site-specific linkage information. This is not a problem for glycans with a single GlcNAc residue, as is the case here. For glycans with multiple GlcNAc residues, so long as they have different C4 and C6 substituents (combined), the C3 branch can also be determined at each site based on the C/Z internal ions. EICs of various GlcNAc-containing C/Z ions are shown in Fig. 3c–e.
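As an illustration of how such an abundance ratio might be turned into a simple decision rule, the following Python sketch compares the two branched-GlcNAc internal ions. The function names, the cut-off of 1.0, and the input abundances are assumptions made for this example; only the m/z values and the reported ratios (0.095 and 12.4) come from the text, and no threshold was validated in this work.

```python
import math
from typing import Optional


def abundance_ratio(numerator: float, denominator: float) -> float:
    """Ratio of two fragment abundances; infinite if the denominator is absent."""
    if denominator <= 0:
        return math.inf
    return numerator / denominator


def classify_branched_glcnac(gal_glcnac_472: float,
                             fuc_glcnac_442: float,
                             cutoff: float = 1.0) -> Optional[str]:
    """Guess which substituent remains on C4 after loss of the C3 branch.

    gal_glcnac_472: abundance of the internal ion at m/z 472.215
    fuc_glcnac_442: abundance of the internal ion at m/z 442.205
    cutoff: hypothetical decision threshold (an assumption, not a fitted value)
    """
    if gal_glcnac_472 <= 0 and fuc_glcnac_442 <= 0:
        return None  # neither ion detected, no call can be made
    ratio = abundance_ratio(gal_glcnac_472, fuc_glcnac_442)
    if ratio > cutoff:
        return "LNFP III-like (Gal on C4, Fuc on C3)"
    return "LNFP II-like (Fuc on C4, Gal on C3)"


# Invented abundances chosen to reproduce the reported ratios.
call_ii = classify_branched_glcnac(9.5, 100.0)    # ratio of about 0.095
call_iii = classify_branched_glcnac(124.0, 10.0)  # ratio of about 12.4
```

Because the two reported ratios differ by roughly two orders of magnitude, any cut-off near unity would separate these two standards; a rule for unknown samples would need to be calibrated on a larger set of spectra.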

For site-specific linkage determination, one needs to examine secondary losses from the corresponding Z ions. In the LNT/LNnT study, we showed that Z-NCH3Ac is a diagnostic fragment for a 1 → 3-linked GlcNAc residue. This appears to hold true for the LNFP glycans as well. For LNFP II and V, where the C3 substituent at the GlcNAc site is a Gal residue, EED generated a Z-NCH3Ac ion (m/z 810.420); for LNFP III, with a Fuc residue as the C3 substituent, EED produced a Z-NCH3Ac ion (m/z 840.432); for LNFP I, with Fuc-Gal as the C3 substituent, EED produced a Z3-NCH3Ac ion (m/z 636.332); and for LNFP VI, with no C3 substituent, no Z-NCH3Ac ion was detected. Another pair of ions of interest is the Gal1 → 3GlcNAc-favored 0,4X2/Z″ (m/z 809.436) and the Gal1 → 4GlcNAc-favored Z-Ac, whose abundance ratio is 0.94 in LNFP VI and 0.38 in LNFP III, but 14 in LNFP V and 9 in LNFP II. Zoom-in views of spectral regions containing these fragment ions are shown in Supporting Figure S4. On the other hand, although the abundance ratio of Z-CH2OCH3 (m/z 837.431) to Z-OCH3 (m/z 851.447) is still higher in LNFP VI (0.31) and III (0.31) than in LNFP V (0.10) and II (0.24), such a small difference has little diagnostic value for linkage analysis.

The discussion thus far has been focused on the differentiation of 1 → 3- and 1 → 4-linkages. As one last example, we will turn our attention to the only 1 → 2-linkage site present in these glycan standards, the Fuc1 → 2Gal moiety in LNFP I. Positive identification of the 1 → 2-linkage at a mono-substituted hexose is very difficult by MS/MS, since it does not produce any unique cross-ring fragment that cannot also be generated by a 1 → 3-linked hexose. In principle, one could infer the 1 → 2-linkage based on the absence of a 0,2X ion, because the 0,2-cross-ring cleavage does not break the glycan sequence at a 1 → 2-linked residue, but instead leads to the formation of a neutral loss product, e.g., the low-abundance “0,2X3” ion at m/z 955.494 in the LNFP I spectrum. However, linkage determination via negative inference is risky, especially since 0,2X ions are not usually produced in high abundance at non-1 → 2-linked sites. We note that there is a high-abundance C2-OCH3 ion (m/z 401.178, S/N 290) present in the LNFP I spectrum, which could be produced via loss of the C3 substituent from the C1–C2 di-radical as shown in Scheme 3a. For a 1 → 3-linked hexose, the same fragmentation pathway would have generated a C/Z (Hex) internal ion. The C1–C2 di-radical can also produce a C/Y (Hex) internal ion at a 1 → 2-linked site (Scheme 3b). C/Y internal ions are generally not observed with any other linkage configuration. Together, these two fragments provide strong, positive evidence for the 1 → 2-linkage assignment. We should caution that C-OCH3 may also be produced at 1 → 3-linkage sites via radical migration. For example, in LNFP V, C-OCH3 (m/z 472.215, S/N 34) is observed at the 1 → 3-linked GlcNAc site, but it is far less abundant than the C/Z ion (m/z 268.115, S/N 154) generated via direct loss of the C3 branch. Thus, by taking into consideration the propensity of all these diagnostic, but not necessarily unique, ions, linkage assignment can be far more reliably achieved than by inference from cross-ring fragments alone.

Scheme 3

Proposed EED mechanisms for the formation of 1 → 2-linkage-diagnostic fragments

Conclusions and Future Perspectives

In this study, we investigated the potential of nanoRPLC-EED-MS/MS for characterization of isomeric glycan mixtures. We showed that up to five isomeric species, including both linear and branched structures, could be at least partially resolved and characterized in a single LC-MS/MS run. For the two linkage isomers with significant overlap in their elution profiles, it was still possible to determine their presence and elucidate their structures by examining the EED spectra acquired at the rising and falling edges of the overlapped chromatographic peak. Compared with CID, EED generated far more structurally informative fragments. Translating these complex glycan EED tandem mass spectra into structures requires a sound understanding of the fragmentation process. We showed that the formation of most fragment ions, as well as their relative abundances, can be explained within the framework of the ionization/electron recapture mechanism and fundamental radical chemistry. For all glycan standards studied, EED was able to indiscriminately cleave every single glycosidic bond, generating complete C-, Y-, and Z-ion series with distinctive peak contexts for each fragment type, allowing accurate glycan topology elucidation from first principles. We further demonstrated a new strategy for glycan sequencing that deduces linkages based on the propensity of a combination of diagnostic cross-ring, secondary, and internal fragments.

The current study represents an important step towards the development of an LC-EED-MS/MS approach for automated and comprehensive glycome characterization, although several challenges remain before it can be routinely applied to the analysis of biological samples. First, the chromatographic resolution needs to be improved to minimize spectral contamination by fragments from co-eluting isomers. We are currently implementing EED with on-line PGC-LC separation at elevated temperatures, as PGC-LC is one of the best chromatographic methods for resolving isomeric structures. Second, for complex biological samples, it is likely that some isomeric glycans will remain incompletely resolved, even on PGC-LC. Partially resolved chromatographic peaks can be deconvoluted by principal component analysis (PCA) at the MS2 level, similar to deconvolution of overlapped ion mobility profiles [76, 77]. For isomeric glycans, however, common fragments are sometimes produced with substantially different abundances for each isomer, and this may cause PCA to fail. We are currently developing a new component deconvolution algorithm that uses a parametric kernel to constrain the shape of the chromatographic peak of each isomer, with promising preliminary results. Third, there will be a need to acquire and analyze a large number of EED spectra of glycans that include structures with a variety of residues, branching patterns, and linkage configurations, from which a comprehensive training dataset may be generated for the construction of a universal LinkageClassifier that would ultimately allow fully automated linkage analysis.
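To make the idea of a shape-constrained deconvolution concrete, the sketch below fits the EIC of one fragment as a sum of two Gaussian elution profiles with fixed centers and widths, solving for the two amplitudes by linear least squares. It is not the algorithm under development, whose details are not given here: the Gaussian kernel, the fixed peak parameters, the two-component limit, and all numerical values are assumptions chosen only for illustration.

```python
import math
from typing import Sequence, Tuple


def gaussian(t: float, center: float, width: float) -> float:
    """Unit-height Gaussian elution profile (assumed kernel shape)."""
    return math.exp(-0.5 * ((t - center) / width) ** 2)


def fit_two_components(times: Sequence[float],
                       intensities: Sequence[float],
                       centers: Tuple[float, float],
                       widths: Tuple[float, float]) -> Tuple[float, float]:
    """Least-squares amplitudes (a1, a2) for I(t) ~ a1*g1(t) + a2*g2(t)."""
    g1 = [gaussian(t, centers[0], widths[0]) for t in times]
    g2 = [gaussian(t, centers[1], widths[1]) for t in times]
    # Normal equations of the 2-parameter linear least-squares problem.
    s11 = sum(x * x for x in g1)
    s22 = sum(y * y for y in g2)
    s12 = sum(x * y for x, y in zip(g1, g2))
    b1 = sum(x * i for x, i in zip(g1, intensities))
    b2 = sum(y * i for y, i in zip(g2, intensities))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("elution profiles are not distinguishable")
    a1 = (b1 * s22 - b2 * s12) / det
    a2 = (b2 * s11 - b1 * s12) / det
    # Crude non-negativity: clip negative amplitudes to zero.
    return max(a1, 0.0), max(a2, 0.0)


# Synthetic fragment EIC built from two overlapping isomer profiles.
times = [32.0 + 0.1 * k for k in range(31)]
centers, widths = (33.3, 34.0), (0.3, 0.3)
observed = [5.0 * gaussian(t, centers[0], widths[0])
            + 2.0 * gaussian(t, centers[1], widths[1]) for t in times]
amp_first, amp_second = fit_two_components(times, observed, centers, widths)
```

Repeating the fit for every fragment m/z would yield one deconvoluted spectrum per isomer. In practice the peak centers and widths would themselves have to be estimated, for example from the precursor EIC, and a proper non-negative solver would be preferable to clipping.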