Introduction

Substituted benzoic acid/ester derivatives are important building blocks used in many pharmaceutical compounds and in organic synthesis. They have notably been used as preservatives for food, cosmetics, and pharmaceutical products because of their antimicrobial and antifungal properties [1,2,3]. Compounds that contain the benzoic acid/benzamide moiety have been fundamental to the pharmaceutical industry in products that cover a wide range of therapeutic areas, including inflammation (aspirin, salicylic acid), schizophrenia (amisulpride), insomnia (suvorexant), and cancer therapy (pemetrexed, imatinib).

Different positional isomers substituted at ortho, meta, and para positions as well as conformational isomers such as E/Z isomers can be generated during the synthesis of the aromatic substituted benzoic acid/ester derivatives. Traditionally, analytical methods such as nuclear magnetic resonance (NMR) and high-performance liquid chromatography-high resolution tandem mass spectrometry (HPLC-HRMS/MS) are able to differentiate these sets of isomers. However, each method has limitations. NMR studies can often be time-consuming and depend on the purity of samples, making complex mixture analysis difficult [4,5,6,7,8,9]. HPLC-HRMS/MS has the advantage of chromatographic separation of mixtures, but identification of each chromatographically resolved peak often requires a retention time match with an authentic sample because most positional isomers of aromatic substituted benzoic acid/ester derivatives have indistinguishable fragmentation patterns under tandem mass spectrometric analysis [10,11,12,13,14,15,16]. The neighboring group participation (NGP) effect in organic chemistry has been defined by IUPAC as the intramolecular interaction of a reaction center with a lone pair of electrons in an atom or the electrons present in a σ-bond or π-bond [17]. While NGP influences many reactions in solution, the gas-phase NGP effect has also been extensively studied to explain many fragmentation patterns during collision-induced dissociation (CID) [18,19,20,21,22,23,24]. Cooks et al. showed δ-cleavage from gas-phase aryl participation in several azulene compounds after electron ionization [23]. Bigler and Hesse identified cinnamic acid derivatives of diamines based on electrospray ionization (ESI)-MS fragmentation and noted that the difference in reactivity correlated with the ring strain of the cyclized products [19].

While these previous studies evaluated the NGP effect under CID conditions, this study focuses mainly on the NGP effect that can take place during MS1 analysis of protonated ortho-substituted benzoic acid/ester derivatives bearing nucleophilic functional groups with lone-pair or π-bond electrons. The high energy barrier for intramolecular cyclization as seen in CID could be overcome if the nucleophile and electrophile are in close spatial proximity, as is the case for ortho-substituted benzoic acid/ester derivatives. These nucleophilic groups can attack the protonated carboxylic acid/ester moiety to facilitate the loss of a neutral water/alcohol molecule. On the other hand, no significant water/alcohol neutral loss would be expected for the meta- and para-substituted benzoic acid/ester derivatives due to the larger spatial distance and steric penalty for nucleophilic attack. This observation can be used to differentiate some ortho-substituted benzoic acid/ester derivatives from their meta- and para-substituted isomers.

In the present study, 22 ortho-, meta-, and para-substituted benzoic acid scaffolds (1-22) (Figure 1) were chosen to study this effect. These compounds were incubated in 0.05% H2SO4 in methanol (MeOH), ethanol (EtOH), and isopropanol (i-PrOH) solutions in order to form the corresponding ester products. The resulting 65 acid and ester products were separated by LC followed by (+)-ESI/MS analysis. Three factors were determined to impact the differentiation of isomers based on MS1 analysis: (1) the competition between protonation of the carboxyl group and protonation of the other nucleophilic group during ESI; (2) the steric effects of cyclization; and (3) the proximity of the nucleophilic group to the electrophilic center. Each compound was tested to see if neutral loss of either H2O for the acids or MeOH, EtOH, or i-PrOH for the esters occurred during MS1 analysis. These losses will be referred to as neutral group losses. Furthermore, efforts were made toward understanding the gas-phase NGP effect in two triazole benzoic acid isomers (23-24), shown in Figure 1, that are generated during the synthesis of suvorexant (trade name Belsomra) and were analyzed here by HRMS, MS/MS, ion mobility spectrometry-mass spectrometry (IMS-MS), NMR spectroscopy, and density functional theory (DFT) calculations.
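As a rough illustration of the neutral group losses monitored in MS1, the sketch below computes the m/z expected for the product ion left after a singly protonated precursor loses one neutral molecule. The monoisotopic masses are standard values; the function name and the example precursor (nominal m/z 204, the value quoted later for the triazole acids) are chosen here for illustration only and are not taken from the authors' data processing.

```python
# Monoisotopic masses (Da) of the neutral molecules considered in the text.
NEUTRAL_LOSS_MASS = {
    "H2O": 18.010565,     # lost from the acids (a compounds)
    "MeOH": 32.026215,    # lost from the methyl esters (b)
    "EtOH": 46.041865,    # lost from the ethyl esters (c)
    "i-PrOH": 60.057515,  # lost from the isopropyl esters (d)
}


def product_ion_mz(precursor_mz, neutral):
    """m/z of the singly charged ion remaining after loss of one neutral molecule."""
    return precursor_mz - NEUTRAL_LOSS_MASS[neutral]


# Illustrative precursor only: nominal m/z 204 for a protonated acid.
print(round(product_ion_mz(204.0, "H2O"), 1))
```

Because an acid and its ester differ by exactly the mass difference between the alcohol and water, both would be expected to converge on the same product ion m/z under the cyclization mechanism proposed in this work.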

Figure 1

Model compounds grouped by the extent of water/alcohol loss. Nucleophilic groups are marked in red and the R group is marked in blue. Two triazole compounds were also tested with 23 having large loss (>30%) and 24 having no loss (0%)

Experimental

Chemicals and Reagents

Methanol, ethanol, isopropanol, acetonitrile, water with 0.1% formic acid solvents, and sulfuric acid were obtained from Fisher Scientific (Optima LC-MS grade, Waltham, MA, USA). Compounds were sourced as follows: 1a, 4, 7a, 12a, 13a, 15a, 17a, and 21a were obtained from Sigma-Aldrich (St. Louis, MO, USA); 2a, 6a, 8a, 19a, and 20a were obtained from Enamine LLC (Monmouth, NJ, USA); 3a, 11a, and 14a were obtained from Maybridge (Loughborough, UK); 5a was obtained from Rieke Metals (Lincoln, NE, USA); 9 was obtained from ChemBridge (San Diego, CA, USA); 10a was obtained from Matrix Scientific (Elgin, SC, USA); 16a was obtained from Chem-Impex International (Wood Dale, IL, USA); 18a was obtained from Apollo Scientific (Manchester, UK); 22a was obtained from ArkPharm (Libertyville, IL, USA); 23a and 24a were synthesized according to literature procedures [25,26,27].

Sample Preparation

Samples were prepared in 2 mL HPLC vials using an Andrew Alliance pipetting robot. The 24 analyte compounds were prepared in 0.05% H2SO4 in three different solvents (methanol, ethanol, and isopropanol) for esterification to the b, c, and d compounds by adding a few crystals of each analyte compound to the vials and mixing. Dissolved analytes were stored at room temperature for 2 d to allow for esterification of the carboxylic acid to occur.

Instrumentation

HPLC/UV

LC separation was performed on a Waters Acquity UPLC system, consisting of a binary pump system, a sample manager, and a PDA detector (Waters Corporation, Milford, MA, USA). The output signal was monitored and processed with MassLynx software designed by Waters Corporation (Milford, MA, USA).

Separation was carried out on a Waters BEH C18 column (2.1 × 50 mm, 1.7 μm particle size). The mobile phase consisted of water with 0.1% formic acid (mobile phase A) and acetonitrile (mobile phase B). The injection volume was 3 μL. Analytes were eluted using a gradient method consisting of an initial hold at 1% mobile phase B for 0.5 min, followed by a linear gradient to 90% B over 3.5 min, then a linear gradient to 0% B over 0.1 min, a hold at 0% B for 0.4 min, and finally a hold at 1% B for 0.5 min, affording a total run time of 5 min. The flow rate was 0.5 mL/min. The column temperature was maintained at 40 °C. The PDA detector was scanned from 200 to 400 nm.
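The gradient program above can be written out as a simple timetable to confirm that the segments add up to the stated 5 min run. The following is only a bookkeeping sketch: the segment values are transcribed from the text, while the data layout and function name are assumptions made here and do not correspond to any instrument method file.

```python
# (description, duration in min, %B at the end of the segment), transcribed from the text.
GRADIENT = [
    ("hold at 1% B", 0.5, 1),
    ("linear gradient to 90% B", 3.5, 90),
    ("linear gradient to 0% B", 0.1, 0),
    ("hold at 0% B", 0.4, 0),
    ("hold at 1% B", 0.5, 1),
]


def timetable(segments):
    """Return (end_time_min, percent_B) pairs with cumulative end times."""
    elapsed = 0.0
    rows = []
    for _description, duration, percent_b in segments:
        # Round to avoid floating-point drift in the running total.
        elapsed = round(elapsed + duration, 2)
        rows.append((elapsed, percent_b))
    return rows


for end_time, percent_b in timetable(GRADIENT):
    print(f"{end_time:4.1f} min  {percent_b:3d}% B")
```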

Mass Spectrometry

The eluent was introduced directly into the mass spectrometer via an electrospray ionization source. MS analysis was performed on a Waters Premier quadrupole time-of-flight (Q-ToF) mass spectrometer operating in positive ion mode. Source temperature and desolvation temperature were set to 120 °C and 400 °C, respectively. Nitrogen was used as both the cone gas (50 L/h) and desolvation gas (800 L/h). The capillary voltage was set to 3 kV. The cone voltage applied was 10 V. Leucine enkephalin was used as the lock mass (m/z of 556.2772) for accurate mass calibration and was introduced using the lock spray interface at 20 μL/min at a concentration of 0.5 mg/mL in 50% aqueous acetonitrile containing 0.1% formic acid. During MS scanning, data were acquired in centroid mode from m/z 50 to 1000.

Ion Mobility Spectrometry (IMS) Instrumentation

IMS-MS experiments were conducted on an Agilent 6560 uniform-field ion mobility-Q-ToF instrument (Agilent Technologies, Santa Clara, CA, USA). Samples were infused by syringe pump (5 μL/min) and ionized by ESI (3.5 kV capillary voltage). The instrument was tuned for fragile ions, an automated procedure that lowers ion optics voltages to preserve fragile ion structures as they are transported through the instrument from the source to the detector. Sheath and desolvation gas temperatures were set to 150 °C and 100 °C, respectively. Ion mobility measurements were performed in both helium and nitrogen drift gases (4.0 Torr) with a drift tube entrance voltage of 1200 V and a drift tube exit voltage of 250 V. Experimental nitrogen and helium collision cross-sections (CCS) were measured by single-field calibration with the Agilent Tune Mix ions m/z 118 and 322. Theoretical CCS values were calculated by the trajectory method in MOBCAL [28] from energy-minimized structures obtained from the DFT calculations described below.

NMR

NMR data were acquired in DMSO-d6 at room temperature using a 3 mm tube at 500 MHz from a Bruker instrument equipped with a triple-resonance (TCI) Prodigy CryoProbe. For the 1D selROE experiment acquired for isomer 24a, a 20 ms Gaussian selective pulse was used to excite proton Hb. A mixing time of 300 ms was used for the ROESY step.

Density Functional Theory (DFT) Calculations

An ensemble of molecular mechanics (MM) conformations for each intermediate was generated with OpenEye's Omega [29, 30], followed by DFT optimization using Jaguar 9.1 [31, 32]. All calculations were performed at the B3LYP-D3/6-31++G** level in the gas phase [33,34,35,36,37].

Results and Discussion

MS1 Spectra of Benzoic Acids and Esters

Table 1 summarizes the MS1 data obtained for all 65 model compounds. The stability of the ester compounds was not uniform, and some compounds were unable to form all three ester products, as shown in Figure 1. The compounds were divided into three groups, also shown in Figure 1: Group 1 includes compounds that exhibited a neutral group loss greater than 30%, Group 2 includes compounds with a neutral group loss between 0% and 30%, and Group 3 includes compounds with no neutral group loss observed. Examples of the MS1 spectra observed for 6a-b and 20a-b are shown in Figure 2. All other MS1 spectra can be found in the Supplementary material.
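The grouping thresholds can be expressed as a small helper. This is a minimal sketch rather than the authors' procedure: the function name is hypothetical, and it assumes that a scaffold is assigned by the largest neutral group loss seen across its acid and ester forms, which the text does not state explicitly.

```python
def neutral_loss_group(relative_abundances_percent):
    """Assign a scaffold to Group 1, 2, or 3.

    relative_abundances_percent: relative abundance (%) of the neutral-loss
    ion for each acid/ester form of the scaffold.
    Assumption (not stated in the text): the largest value decides the group.
    """
    largest = max(relative_abundances_percent)
    if largest > 30:
        return 1  # neutral group loss greater than 30%
    if largest > 0:
        return 2  # neutral group loss between 0% and 30%
    return 3      # no neutral group loss observed


# Values quoted in the text: 22% and 100% for 1a and 1b, 4.0% for 9.
print(neutral_loss_group([22, 100]), neutral_loss_group([4.0]), neutral_loss_group([0]))
```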

Table 1 MS1 data for benzoic acid and ester model compounds. Nominal m/z is provided with relative abundance in parentheses. Compounds in blue are >30% abundance. Compounds in green are between 0% and 30% abundance relative to the base peak. Compounds in red are 0% abundance (not detected)

Figure 2

The MS1 spectra of 6a and 20b under (+)-ESI. The pyrazole substitution group is highlighted in red. The acid/ester functional group is highlighted in blue

The compounds in Group 1 (1-6) showed significant neutral group loss greater than 30%. These compounds all included nucleophilic groups ortho to the acid/ester functional group. These nucleophiles ranged from amine groups in 1, to oxygen nucleophiles in 3 and 5, to π-bonds in 2. Group 2 compounds (7-14) showed moderate neutral group loss and comprised compounds with ortho nucleophilic groups as well as two compounds with para nucleophilic groups (8 and 10). Interestingly, these two para compounds include longer chain nucleophiles that might allow nucleophilic attack to bridge the aromatic ring. Another explanation for the neutral group loss observed in MS1 analysis of 8 and 10 is that the carboxylic acid/ester is the main protonation site during ionization due to the relatively low proton affinity of the para substituent. Group 3 compounds (15-22) included only compounds that had nucleophilic groups meta and para to the acid/ester, which did not show neutral loss during ionization, as expected.

A proposed mechanism to explain the neutral group loss from the model compounds can be found in Scheme 1. The reaction proceeds via one of two pathways, forming a 5- or 6-membered ring depending on the placement of the nucleophilic group. In the case of a carboxylic acid, the reaction is initiated by protonation of the carbonyl group of the carboxylic acid during ionization. The NGP effect then occurs as the nucleophilic substituent attacks the electrophilic carbonyl with subsequent elimination of the neutral leaving group to form a cyclic product ion. The three controlling factors governing this cyclization are proposed as follows: (1) the proton affinity of the carboxylic acid/ester competing with other basic sites on the compound; (2) the steric penalty of forming the 5- or 6-membered ring; and (3) the proximity of the nucleophilic group to the carbonyl.

Scheme 1

Proposed mechanism for neutral group loss based on the NGP effect. The protonation is on the carbonyl group to facilitate the subsequent intramolecular nucleophilic addition

Proton Affinity (PA) of Nucleophilic Group and Carboxylic Acid/Ester

The gas-phase PA of each functional group within a molecule is closely related to the probability of protonation during (+)-ESI [38]. As shown in Table 1, the PAs of the neighboring nucleophilic group and of the carboxylic acid/ester correlate closely with the extent of neutral loss observed experimentally. On one hand, a higher PA of the leaving group (carboxylic acid/ester) led to more neutral loss. For example, the methyl ester group (203.3 kcal/mol) in 1b has a higher PA than the carboxylic acid group (196.2 kcal/mol) in 1a [38]. Hence, more protonation was expected on the methyl ester, resulting in a larger extent of neutral loss (100% for 1b vs 22% for 1a). On the other hand, a higher PA of the neighboring substituent led to less neutral loss. For example, a larger extent of neutral loss was observed for 3a (100%) than for 1a (22%) because the benzyl amino group (218.3 kcal/mol) in 1a has a higher PA than the furan (192.0 kcal/mol) in 3a [38]. Interestingly, compounds 1a, 16a-d, and 17a-c with ortho-, meta-, and para-amino substitutions showed more NH3 loss than water/alcohol loss, which is another indication that a high PA of the neighboring substituent disfavors the NGP effect.

Steric Effect of Ring Formation

The formation of the 5- or 6-membered rings following nucleophilic attack on the carbonyl was also seen to be affected by the bulkiness of the groups attached to the nucleophile. The water loss was much greater in 4 (59.0%) than in 9 (4.0%). This difference in reactivity can be attributed to the steric penalty of the bulky isopropyl group attached to 9 compared with the smaller methoxyl group attached to 4. It was also observed that 12a-d had reduced neutral group losses (3.2%–21.2%) compared with 6a-d (51.2%–100%). These structures differ in the dimethyl substitution of the pyrazole rings in 12a-d, whereas 6a-d do not have these substituents. These methyl groups also create steric interference that disfavors ring formation.

Proximity of the Nucleophilic Group to the Carbonyl

The ortho, meta, and para relationships of the nucleophilic groups were also investigated in this study. As shown for 1a-d, 16a-d, and 17a-c, the ortho-substituted primary amine exhibited significant neutral group loss that was not seen in the meta- and para-amines. Cyclization resulting from nucleophilic attack appears to be constrained to ortho-positioned nucleophilic groups. This effect was visible in numerous other positional isomeric compound pairs in this study, including 2a-b and 8a-c, 3a-b and 18a-b, 5a-d and 10a-c, 7a-d and 15a-d, and 12a-d and 21a-d. This differentiation based on nucleophilic group placement proves to be an effective means for differentiating ortho-placed nucleophilic groups from meta and para groups by neutral group loss.

Studies of Two Triazole Acids 23a and 24a (Synthetic Intermediates for Suvorexant)

This research was also extended to an analysis of a benzoic acid building block of suvorexant (Belsomra), a commercially available drug produced by MSD and approved by the FDA in 2014 for the treatment of insomnia. The synthesis of suvorexant includes two benzoic acid isomer intermediates, as shown in Figure 3, after the triazole acid addition step [25,26,27]. These intermediates (shown in Figure 1) were also converted to esters and tested by UPLC-HRMS in order to investigate whether differences in the MS spectra can be found based on NGP effects. The MS1 spectra and IMS drift time distributions for 23a and 24a can be found in Figure 4. The products were also studied with IMS, NMR, and density functional theory (DFT) calculations.

Figure 3

Intermediate step in synthesis of suvorexant [25]

Figure 4

(Left) MS1 spectra of 23a and 24a under (+)-ESI. (Right) IMS difference of 23a and 24a in drift time

MS1 Data

Table 1 summarizes the MS1 data obtained for 23a-c and 24a-c. Among all acids and esters, it can be seen that significant neutral group loss of 100% occurs for 23a-c while a minimal loss of 0%–4.4% occurs for 24a-c. Model compound analysis suggests that significant neutral group loss would occur in both compounds because the triazole ring is ortho to the carboxylic acid in both cases. Two factors were proposed to explain the low reactivity of 24a-c: (1) the main protonation site is the nitrogen at position 3 rather than the nitrogen at position 2 due to the PA difference, which would inhibit the subsequent proton transfer from the triazole to the acid/ester group; and (2) the triazole and phenyl rings are not favored to be synperiplanar due to the steric hindrance of the two hydrogens (Figure 5). A rotational barrier must be overcome to form the cyclized neutral loss product for 24a but not for 23a. This hypothesis was further investigated using NMR spectroscopy, IMS-MS, and DFT analysis.

Figure 5

DFT optimization (B3LYP-D3/6-31++G**, gas phase) of relevant species for 23a

NMR Data

To investigate the orientation of the triazole in 24a relative to the acid group, we acquired a 1D selective ROE spectrum, applying selective inversion at the aromatic proton Hb. As shown in Figure 5 and Figure S49 (in Supporting Information), a strong ROE correlation was observed between protons Hb and Ha, indicating that they are close in space (~2–3 Å distance), which would be consistent with the nitrogen at the 2-position being oriented toward the carboxylic acid moiety. However, it must be noted that NMR could not determine the exact rotamer of 24a in solution because the C-N bond was expected to rotate in solution. Evaluation of potential hydrogen bonding was carried out by acquiring 1D proton NMR spectra for both 23a and 24a. Assuming that there is no hydrogen bonding in 23a, and given that the proton spectra revealed that the OH proton had the same chemical shift in both isomers (Figures S49-S51), it was concluded that there was no hydrogen bonding in isomer 24a either. Note that in the case of hydrogen bonding, the OH proton would have resonated at a different, more downfield chemical shift. In summary, the 1D selROE data were consistent with the nitrogen at the 2-position in 24a pointing toward the carboxylic acid group. However, this could not be further confirmed by hydrogen bond formation, as none was observed in the proton NMR spectra.

DFT Calculations

The protonation and cyclization processes of 23a and 24a were also investigated using DFT calculations, with results shown in Figures 5 and 6. The most stable species of 24a in the gas phase corresponds to an anti-planar conformation, where the triazole group is orthogonal to the phenyl ring. Therefore, the lack of neutral group loss in 24a can be partially attributed to the position of the nucleophilic triazole group, which is not aligned with the carbonyl reactive site. Thus, no nucleophilic attack can occur without a conformational rearrangement. Additionally, the 3-position is the most basic site in 24a. As shown in Figure 6, in order for the proton transfer to occur, the nitrogen at the 2-position needs to be protonated, and this species is 6.5 kcal/mol higher in energy than the isomer protonated at position 3. This pathway was compared with the corresponding one for 23a, in which significant water loss is observed experimentally. DFT calculations show that the neutral species of 23a has the triazole pointing towards the carbonyl group. Initial protonation of the triazole likely leads to a rapid proton transfer to the hydroxyl portion of the carboxylic acid with subsequent addition and elimination of water.

Figure 6

DFT optimization (B3LYP-D3/6-31++G**, gas phase) of relevant species for 24a

Ion Mobility Spectrometry-Mass Spectrometry Analysis

Ion mobility spectrometry (IMS) provides an additional dimension of analysis through gas-phase shape and charge separation of ionized analytes prior to MS detection [39]. In this study, IMS-MS was utilized to complement NMR and computational efforts to distinguish isomer structures 23a and 24a. Ion mobility drift time distributions extracted from precursor m/z 204 are shown in Figure 4 for both triazole benzoic acid isomers analyzed in nitrogen drift gas. The faster drift time for isomer 23a indicates a more compact structure compared to 24a. The proposed mechanism for intramolecular cyclization and subsequent neutral loss prevalent in MS1 of compound 23a suggests that the amine-containing neighboring group is positioned in close proximity to the acidic functional group, which is supported by the higher mobility observed for isomer 23a. Despite the close structural similarity of these isomers, IMS allows for differentiation between the two protonated precursor ions. When combined with computational modeling, experimentally and theoretically derived collision cross-section (CCS) values can be compared to proposed all-atom structures [40]. Energy minimization of the DFT-generated protonated 23a and 24a structures, followed by calculation of CCS values by the trajectory method in MOBCAL, yields theoretical CCS values in agreement with the experimentally measured CCS values; 23a (77.7 Å²) has a smaller calculated trajectory method CCS than 24a (80.1 Å²). MOBCAL calculates CCS with helium as the neutral collision gas, and thus experimentally derived CCS values in helium are reported for direct comparison with the calculations, with values of 78.3 Å² for 23a and 79.7 Å² for 24a. Experimental and theoretical CCS (trajectory method) values are typically considered to be in agreement when they are within 2% relative difference [41]; this is the case for these two isomers, with 23a (0.8% difference) and 24a (0.5% difference).

Characterization of subtle structural differences between these two triazole benzoic acid isomers through a combination of NMR, IMS-MS, and computational modeling helps reinforce the conclusions drawn from each of these complementary structure elucidation approaches, and supports the hypothesis of water loss due to a gas-phase NGP effect for the scaffolds investigated in this study.
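The agreement between calculated and experimental CCS values can be reproduced with a short calculation. This sketch assumes that the relative difference is taken with respect to the experimental value, which the text does not specify; the function name is chosen here for illustration, and the CCS values are those quoted above.

```python
def relative_difference_percent(calculated, experimental):
    """Absolute difference between two CCS values as a percentage of the experimental value."""
    return abs(calculated - experimental) / experimental * 100


# (calculated trajectory-method CCS, experimental helium CCS) in square angstroms.
ccs_values = {"23a": (77.7, 78.3), "24a": (80.1, 79.7)}

for isomer, (calculated, experimental) in ccs_values.items():
    difference = relative_difference_percent(calculated, experimental)
    # 2% is the agreement threshold cited in the text.
    print(isomer, round(difference, 1), difference <= 2.0)
```

With these inputs the rounded differences match the 0.8% and 0.5% reported for 23a and 24a, and both fall within the 2% threshold.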

Conclusion

Ortho-substituted benzoic acid/ester derivatives were distinguished from other positional isomers using HRMS1 by means of the neighboring group participation (NGP) effect. A total of 65 model compounds were tested, and it was determined that the NGP effect depends on three factors: the proton affinity of the nucleophile, the bulkiness of the nucleophile, and the positional isomer (ortho- compared to meta- and para-nucleophile). The trends found for the model compounds were then extended to investigate two triazole isomers that form during the synthesis of the active pharmaceutical ingredient (API) suvorexant (Belsomra®). MS1 spectra differentiated the isomers based on the NGP effect, as 23 showed large neutral group loss whereas 24 did not. Further IMS-MS, NMR, and DFT studies were also conducted to show that the non-reactivity of 24 (in terms of neutral loss during MS1) can be attributed to two main factors: (1) the higher PA of the nitrogen at the 3-position of the triazole; and (2) the protonated triazole in 24 is likely oriented orthogonal (90 degrees) to the carboxylic acid in the gas phase, disfavoring nucleophilic attack by the nitrogen at the 2-position. The results from this work aid structural elucidation of substituted benzoic acid/ester isomers through facile interrogation of the gas-phase NGP effect in MS1 analysis.