1 Introduction

The usual claim that the Brout–Englert–Higgs mechanism of mass generation for elementary particles [1,2,3] has been experimentally confirmed at the Large Hadron Collider (LHC) thanks to the Higgs boson discovery [4, 5], and subsequent studies of its properties [6, 7], is only valid so far for the heaviest Standard Model (SM) particles: W and Z weak bosons, and quarks and leptons of the third family (t, b and \(\tau \)). Today, not only the generation of all neutrino masses remains a mystery [8], but at the end of the LHC lifetime only a fraction of the Higgs Yukawa couplings to the second-family fermions (the muon and, maybe, the charm quark) will have been probed. On the other hand, due to their low masses and thereby small Yukawa couplings to the Higgs field, the mass generation mechanism for the stable matter of the visible universe, composed of u and d quarks plus the electron and neutrinos (\(\nu \)), will remain experimentally untested. The smallest Yukawa coupling, aside from the Dirac \(\nu \)’s case, is that of the electron given by \(y_\mathrm {e} = \sqrt{2} m_\mathrm {e}/v = 2.9\cdot 10^{-6}\) for \(m_\mathrm {e}= 0.511\cdot 10^{-3}\) GeV and Higgs vacuum expectation value \(v = (\sqrt{2}\mathrm {G_F})^{-1/2} = 246.22\) GeV. Measuring the Higgs coupling to the electron is impossible at hadron colliders because the \(\mathrm {H} \rightarrow \mathrm {e^+e^-}\) decay has a tiny branching fraction of \(\mathcal {B}(\mathrm {H}\rightarrow \mathrm {e^+e^-}) = 5.22\cdot 10^{-9}\) (see Eq. (2) below) and is completely swamped by a Drell–Yan \(\mathrm {e^+e^-}\) continuum whose cross section is many orders of magnitude larger. 
Measurements in p-p collisions at the LHC, assuming the SM Higgs production cross section, lead to an upper bound on the branching fraction of \(\mathcal {B}(\mathrm {H}\rightarrow \mathrm {e^+e^-})< 3.6\cdot 10^{-4}\) at 95% confidence level (CL), corresponding to an upper limit on the Yukawa coupling \(y_\mathrm {e}\propto \mathcal {B}(\mathrm {H}\rightarrow \mathrm {e^+e^-})^{1/2}\) of 260 times the SM value [9, 10]. Such a constraint can be translated into a lower bound on the energy scale of any physics beyond the SM (BSM) affecting \(y_\mathrm {e}\), of \(\varLambda _{\textsc {bsm}} \approx v^{3/2}(\sqrt{2}m_\mathrm {e}\cdot (y_\mathrm {e}/y^\mathrm {\textsc {sm}}_\mathrm {e}))^{-1/2} \gtrsim 8.8\) TeV [11]. Assuming that the sensitivity to the \(\mathrm {H} \rightarrow \mathrm {e^+e^-}\) decay scales with the square root of the integrated luminosity, the high-luminosity LHC phase with a \(\mathcal {L}_\mathrm {\tiny {int}}= 3\,\mathrm {ab}^{-1}\) data sample [12] will result in \(y_\mathrm {e}\lesssim 120 y^\mathrm {\textsc {sm}}_\mathrm {e}\) (i.e. \(\varLambda _{\textsc {bsm}} \gtrsim 13\) TeV).
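The numbers quoted in the two paragraphs above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' code: the function names are illustrative, the inputs are the values quoted in the text, and the scale \(\varLambda _{\textsc {bsm}}\) is evaluated with the approximate relation given above, so agreement is expected only at the level of the rounding used in the text.

```python
import math

# Inputs quoted in the text (GeV units)
M_E = 0.511e-3      # electron mass
V = 246.22          # Higgs vacuum expectation value
BR_SM = 5.22e-9     # SM branching fraction B(H -> e+e-)

def yukawa_sm(mass, vev=V):
    """Tree-level SM Yukawa coupling y = sqrt(2) m / v."""
    return math.sqrt(2.0) * mass / vev

def coupling_modifier(br_limit, br_sm=BR_SM):
    """Limit on y_e / y_e^SM from a limit on B, using y_e ~ sqrt(B)."""
    return math.sqrt(br_limit / br_sm)

def lambda_bsm_tev(kappa, mass=M_E, vev=V):
    """Approximate BSM scale (TeV) for a coupling modifier kappa = y_e / y_e^SM."""
    return vev ** 1.5 / math.sqrt(math.sqrt(2.0) * mass * kappa) / 1.0e3

y_e = yukawa_sm(M_E)                    # close to the 2.9e-6 quoted in the text
kappa_lhc = coupling_modifier(3.6e-4)   # close to the quoted factor of 260
scale_lhc = lambda_bsm_tev(260.0)       # compare with the quoted ~8.8 TeV
scale_hllhc = lambda_bsm_tev(120.0)     # compare with the quoted ~13 TeV
```

The small difference between the computed scale for a modifier of 260 and the quoted 8.8 TeV is within the precision of the approximate relation.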

The possibility of studying resonant Higgs production at lepton colliders has been considered in the literature so far only for \(\mu ^+\mu ^-\) annihilation at \(\sqrt{s} = m_\mathrm {H}\), notably as a means to directly and precisely measure \(\varGamma _\mathrm {H}\), \(m_\mathrm {H}\), and the muon Yukawa coupling, by exploiting a large peak production cross section of \(\sigma _{\mu \mu \rightarrow \mathrm {H}} = 70\) pb [13]. The same measurement at an \(\mathrm {e^+e^-}\) machine had never been seriously considered given the sub-femtobarn cross section for the \(\mathrm {e^+e^-}\rightarrow \mathrm {H}\) process, suppressed by at least a factor \(m_\mathrm {e}^2/m_\mu ^2\) compared to the muon collider case. Notwithstanding this difficulty, when the FCC-ee was first proposed [14], it was noticed that the unparalleled integrated luminosities of about \(\mathcal {L}_\mathrm {\tiny {int}}= 10\) ab\(^{-1}\)/year available at \(\sqrt{s}= 125\) GeV would make it possible to attempt an observation of the direct production of the scalar boson [15, 16]. Such a consideration motivated a few subsequent works on various \(\mathrm {e^+e^-}\rightarrow \mathrm {H}\) theoretical [11, 17,18,19] and accelerator [20, 21] aspects.

The Feynman diagram for s-channel Higgs production (and dominant decays) is shown in Fig. 1 (left). Other \(\mathrm {e^+e^-}\rightarrow \mathrm {H}\) production processes, through W and Z loops, are suppressed by the electron mass for on-shell external fermions and have negligible cross sections [11]. The resonant Higgs cross section at any given c.m. energy \(\sqrt{s}\) is theoretically given by the relativistic Breit–Wigner (BW) expression:

$$\begin{aligned} \sigma _\mathrm {ee\rightarrow H} = \frac{4\pi \varGamma _\mathrm {H}\varGamma (\mathrm {H}\rightarrow \mathrm {e^+e^-})}{(s-m_\mathrm {H}^2)^2 + m_\mathrm {H}^2\varGamma _\mathrm {H}^2}, \end{aligned}$$
(1)

where \(\varGamma _\mathrm {H} = 4.1\) MeV is the total Higgs width [22], \(m_\mathrm {H}=125\) GeV its mass, and the partial decay width \(\varGamma (\mathrm {H}\rightarrow \mathrm {e^+e^-})\), given by the tree-level relation

$$\begin{aligned} \varGamma (\mathrm {H}\rightarrow \mathrm {e^+e^-}) = \frac{\mathrm {G_F}m_\mathrm {H}m_\mathrm {e}^2}{4\sqrt{2}\,\pi }\left( 1-\frac{4\,m^2_\mathrm {e}}{m^2_\mathrm {H}}\right) ^{3/2} = 2.14\cdot 10^{-11}\,\mathrm {GeV\,,} \end{aligned}$$
(2)

is tiny due to its dependence on the square of the \(\mathrm {e}^\pm \) mass. From the BW expression (1), it is clear that an accurate knowledge of the \(m_\mathrm {H}\) value is critical to maximize the resonant cross section. Combining three \(\mathrm {e^+e^-}\rightarrow \mathrm {H}\mathrm {Z}\) measurements at FCC-ee (recoil mass, peak cross section and threshold scan), an \(\mathcal {O}\)(2 MeV) mass precision is achievable [23] before a dedicated \(\mathrm {e^+e^-}\rightarrow \mathrm {H}\) run. In addition, the FCC-ee beam energies will be monitored with a relative precision of \(10^{-6}\) [24], guaranteeing a sub-MeV accuracy on the exact point of the Higgs lineshape being probed at any moment. Taking \(m_\mathrm {H} = 125\) GeV, Eq. (1) gives \(\sigma _\mathrm {ee\rightarrow H} = 4\pi \mathcal {B}(\mathrm {H}\rightarrow \mathrm {e^+e^-})/m_\mathrm {H}^2 = 1.64\) fb as peak cross section. Two effects, however, lead to a significant reduction and broadening of the Born-level result: (i) initial-state \(\gamma \) radiation (ISR) reduces the cross section and generates an asymmetry of the Higgs lineshape, and (ii) the actual beams are never perfectly monoenergetic, i.e. the collision \(\sqrt{s}\) has a spread \(\delta _{\sqrt{s}}\) around its central value, further smearing the BW peak. The reduction of the BW cross section due to initial-state photon emission(s) amounts to a factor of 0.35 and leads to \(\sigma _\mathrm {ee\rightarrow H} = 0.57\) fb [17]. The additional impact of a given c.m. energy spread on the Higgs BW shape can be quantified through the convolution of BW and Gaussian distributions, i.e. a relativistic Voigtian function [25]. Figure 1 (right) shows the Higgs lineshape for various \(\delta _{\sqrt{s}}\) values. The combination of ISR plus \(\delta _{\sqrt{s}} = \varGamma _\mathrm {H} = 4.1\) MeV reduces the peak Higgs cross section by a total factor of 0.17, down to \(\sigma _\mathrm {ee\rightarrow H} = 0.28\) fb. 
As a baseline study, we will use this latter value as our default expectation for the signal production cross section and compute the corresponding significance for a 1-year operation with an integrated luminosity of 10 ab\(^{-1}\) per FCC-ee interaction point (IP). The computed signal yields and associated significances can then be rescaled to any other choice of \((\delta _{\sqrt{s}}, \mathcal {L}_\mathrm {\tiny {int}})\) values given by the chosen beam monochromatization scheme [20, 21].
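As a numerical illustration of Eqs. (1) and (2), the sketch below evaluates the tree-level partial width and the Born-level peak cross section, and estimates the peak reduction from a Gaussian energy spread alone. It rests on assumptions that are not part of the original analysis: \(\mathrm {G_F}\) is derived from the quoted \(v\), the conversion \(1\,\mathrm {GeV}^{-2} = 0.3894\) mb is used, and the smearing factor uses the closed form of a non-relativistic Voigt profile at its peak (valid for \(\varGamma _\mathrm {H}\ll m_\mathrm {H}\)) and ignores ISR. It therefore only roughly cross-checks the full ISR-plus-spread result of [17].

```python
import math

GEV2_TO_FB = 0.3894e12   # 1 GeV^-2 = 0.3894 mb = 0.3894e12 fb
M_H = 125.0              # Higgs mass (GeV)
M_E = 0.511e-3           # electron mass (GeV)
V = 246.22               # Higgs vacuum expectation value (GeV)
GAMMA_H = 4.1e-3         # total Higgs width (GeV)
G_F = 1.0 / (math.sqrt(2.0) * V ** 2)  # from v = (sqrt(2) G_F)^(-1/2)

def partial_width_ee():
    """Tree-level Gamma(H -> e+e-) of Eq. (2), in GeV."""
    phase_space = (1.0 - 4.0 * M_E ** 2 / M_H ** 2) ** 1.5
    return G_F * M_H * M_E ** 2 / (4.0 * math.sqrt(2.0) * math.pi) * phase_space

def bw_cross_section_fb(sqrt_s):
    """Born-level relativistic Breit-Wigner cross section of Eq. (1), in fb."""
    s = sqrt_s ** 2
    denominator = (s - M_H ** 2) ** 2 + M_H ** 2 * GAMMA_H ** 2
    return 4.0 * math.pi * GAMMA_H * partial_width_ee() / denominator * GEV2_TO_FB

def gaussian_peak_reduction(delta, gamma=GAMMA_H):
    """Peak of (Lorentzian of FWHM gamma) convolved with (Gaussian of r.m.s. delta),
    relative to the unsmeared peak: sqrt(pi) * a * exp(a^2) * erfc(a)."""
    a = gamma / (2.0 * math.sqrt(2.0) * delta)
    if a > 25.0:  # asymptotic form, avoids overflow of exp(a^2) for tiny spreads
        return 1.0 - 1.0 / (2.0 * a * a)
    return math.sqrt(math.pi) * a * math.exp(a * a) * math.erfc(a)

width_ee = partial_width_ee()
sigma_peak = bw_cross_section_fb(M_H)
spread_factor = gaussian_peak_reduction(GAMMA_H)  # delta = Gamma_H = 4.1 MeV
```

In this approximation the spread alone lowers the peak by somewhat more than a factor of two. Multiplying this by the ISR factor of 0.35 gives a value slightly below the total factor of 0.17 quoted above, which comes from the combined treatment of both effects.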

Fig. 1 Typical diagrams for the direct Higgs channel production (left) decaying into electroweak bosons (top) and fermions or gluons (bottom) and associated backgrounds (centre), considered in this work. Right: Resonant Higgs production cross section, including ISR effects, for several values of the \(\mathrm {e^+e^-}\) c.m. energy spread \(\delta _{\sqrt{s}}\) = 0, 4.1, 7, 15, 30 and 100 MeV [17]

2 Analysis strategy. Simulation of signal and background processes

The strategy to observe the resonant production of the Higgs boson is based on identifying final states in \(\mathrm {e^+e^-}\) collisions at \(\sqrt{s}= m_\mathrm {H}\), consistent with any of the H decay modes, that lead to a small (but, hopefully, statistically significant when combined) increase of the measured cross sections with respect to the theoretical expectation for their occurrence via background processes alone, involving \(\mathrm {Z}^*\), \(\gamma ^*\), or t-channel exchanges (Fig. 1, centre diagrams). The assumption is that, after several years of FCC-ee operation at the Z pole and HZ c.m. energies [26, 27], the theoretical knowledge of the overwhelming background cross sections will be at the \(10^{-5}\) level or better [28], and that experimental systematic uncertainties (detector acceptance, reconstruction efficiencies, luminosity, etc.) will be controlled at the same level of precision [27, 29] and/or will partially cancel out in ratios of signal-over-background yields. Under such circumstances, the proposed measurement can be considered as a very rare “counting experiment” that aims at adding up the individual statistical significances for various final states consistent with known Higgs decay channels, in the hope of observing an excess above the expected background counts.

In order to carry out our simulation studies, we generate individual samples of 10\(^5\)–10\(^7\) \(\mathrm {e^+e^-}\) annihilation events at \(\sqrt{s}= m_\mathrm {H} = 125.00\) GeV with the pythia 8 Monte Carlo (MC) code [30], for each of the 11 final states for signal and associated backgrounds listed in Table 1. The Higgs decay branching fractions used are those from the hdecay code at NLO accuracy [31]. The pythia 8 signal cross sections are absolutely normalized to match our benchmark \(\sigma _\mathrm {ee\rightarrow H} = 0.28\) fb value for ISR plus \(\delta _{\sqrt{s}} = 4.1\)-MeV energy spread discussed above (second curve of Fig. 1 right). Higgs decay modes not listed in Table 1 are either completely swamped by background (e.g. \(\mathrm {H}\rightarrow \mathrm {Z}\mathrm {Z}^*\rightarrow 4j\)) or have too low \(\mathcal {B}\)’s (e.g. \(\mathrm {H}\rightarrow \mathrm {Z}\mathrm {Z}^*\rightarrow 4\ell \)) and thereby have zero expected counts for any realistic integrated luminosity. The generator-level background cross sections in Table 1 are indicatively quoted without ISR to avoid artificial enhancements of their values due to radiative returns to the Z pole, which can be easily removed experimentally (e.g. tagging the ISR photon and/or imposing requirements on the total energy of the event). The last column lists the indicative signal-over-background (S/B) expected for the dominant (irreducible) background of each channel, at the generator level without any analysis cuts. Three broad categories can be identified:

(i) Final states with pairs of jets or tau leptons, with very large backgrounds leading to \(S/B\approx 10^{-7}\)–\(10^{-5}\), except for the \(\mathrm {H}\rightarrow gg\) case for which no actual physical background exists (\(Z^*,\gamma ^*\) do not couple to gluons), but for an experimental misidentification probability of light-quarks for gluons that we take as 1% (Table 2);

(ii) Final states from intermediate \(\mathrm {W}\mathrm {W}^*\) decays, with \(S/B\approx 10^{-3}\);

(iii) Final states from intermediate \(\mathrm {Z}\mathrm {Z}^*\) decays with \(S/B\approx 10^{-2}\), but very small signal cross sections.

In addition, the last row of the table lists the Higgs diphoton decay mode (discovery channel at the LHC), which suffers from both a tiny signal cross section and backgrounds that are 8 orders of magnitude larger. A quick inspection of this table allows one to identify two channels with some potential in terms of statistical significance, \(\mathrm {H}\rightarrow gg\) and \(\mathrm {H}\rightarrow \mathrm {W}\mathrm {W}^*\rightarrow \ell \nu \;2j\), which both feature \(\sim \)25-ab cross sections and \(S/B\approx 10^{-3}\).

Table 1 Cross sections (including ISR and \(\delta _{\sqrt{s}} = 4.1\) MeV) times branching fractions (\(\mathcal {B}\)) for 11 final states in \(\mathrm {e^+e^-}\rightarrow \mathrm {H}(XX)\) signal processes and associated dominant \(\mathrm {e^+e^-}\rightarrow XX\) backgrounds (without ISR), and ratio of signal-over-background for each channel before any analysis cuts (the digluon S/B quoted assumes a light-\(q\rightarrow g\) mistagging rate of 1%)

It is worth noting that the background cross sections computed with pythia 8 for two-particle final states (\(\mathrm {e^+e^-}\rightarrow q\overline{q},c\overline{c}, b\overline{b},\tau \tau ,\gamma \,\gamma \)) are found consistent with those obtained running alternative calculators, such as MadGraph 5 [32, 33], but that those for 4-fermion processes with intermediate \(\mathrm {W}\mathrm {W}^*\) and \(\mathrm {Z}\mathrm {Z}^*\) are prone to ambiguities in the internal definition of the contributing diagrams, and the ISR treatment, and are not always numerically compatible among them. We trust that such differences will not significantly alter our final results, given that the applied multivariate analysis will remove most non-signal-like topologies, but a dedicated study of 4-fermion backgrounds with an alternative MC generator (such as whizard [34] or kkmc [35]) is left for a forthcoming work. In this context, a few of the quoted background diboson cross sections in Table 1 should be just taken as indicative of the order-of-magnitude irreducible contributions expected for the corresponding Higgs decay.

3 Event reconstruction and preselection

Signal and background events are generated, showered and decayed with pythia 8 (v2.26). Initial state radiation is activated for all backgrounds, and the signal cross section samples are scaled to the ISR-plus-energy-spread benchmark point discussed in Sect. 1. A detector polar angle acceptance of \(5^\circ \lesssim \theta \lesssim 175^\circ \) is assumed for all final-state particles (defined as those with lifetime \(c\tau _0>10\) mm). The FastJet package [36] is used to reconstruct all jets using the \(k_\mathrm {T}\) algorithm [37, 38] in its exclusive variant that clusters all hadrons in the event into a prefixed number \(N_j=2,4\) of jets (the \(N_j\) choice depends on the particular final state aimed at, e.g. \(\mathrm {H}\rightarrow q\overline{q}\rightarrow 2j\), \(\mathrm {H}\rightarrow \mathrm {W}\mathrm {W}^*,\mathrm {Z}\mathrm {Z}^*\rightarrow \,2j+\ell /\nu \), or \(\mathrm {H}\rightarrow \mathrm {W}\mathrm {W}^*\rightarrow \,4j\)). Whenever photons or charged leptons are required to be isolated, standard criteria are applied: the sum of all particle energies must be below 1 GeV within a radius \(\varDelta R=0.25\) around the \(\gamma \) or \(\ell ^\pm \) direction. Neutrinos and particles beyond the angular acceptance are added to the missing energy (\(E_\mathrm {miss}\)) 4-vector. The impact of detector (in)efficiencies on the reconstruction of relevant final states is implemented in a simplified manner, according to the performances listed in Table 2.
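The acceptance and isolation requirements described above can be sketched as follows. This is an illustrative implementation only: the dictionary-based particle representation and the function names are hypothetical, \(\varDelta R\) is assumed to be the usual distance in the pseudorapidity-azimuth plane (the text does not define it), and particles outside the acceptance are assumed not to contribute to the isolation sum.

```python
import math

def polar_angle_deg(p):
    """Polar angle with respect to the beam (z) axis, in degrees."""
    return math.degrees(math.atan2(math.hypot(p["px"], p["py"]), p["pz"]))

def in_acceptance(p, theta_min=5.0, theta_max=175.0):
    """Detector polar-angle acceptance quoted in the text."""
    return theta_min <= polar_angle_deg(p) <= theta_max

def eta_phi(p):
    """Pseudorapidity and azimuth (requires nonzero transverse momentum)."""
    pt = math.hypot(p["px"], p["py"])
    return math.asinh(p["pz"] / pt), math.atan2(p["py"], p["px"])

def delta_r(a, b):
    """Assumed definition: distance in the (eta, phi) plane."""
    eta_a, phi_a = eta_phi(a)
    eta_b, phi_b = eta_phi(b)
    dphi = (phi_a - phi_b + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta_a - eta_b, dphi)

def is_isolated(candidate, particles, cone=0.25, max_energy=1.0):
    """Energy sum of other accepted particles within the cone must be < 1 GeV."""
    cone_energy = sum(
        p["E"] for p in particles
        if p is not candidate and in_acceptance(p) and delta_r(candidate, p) < cone
    )
    return cone_energy < max_energy

# Toy event (energies and momenta in GeV, massless particles)
lepton = {"px": 30.0, "py": 0.0, "pz": 0.0, "E": 30.0}
near = {"px": 2.0 * math.cos(0.1), "py": 2.0 * math.sin(0.1), "pz": 0.0, "E": 2.0}
far = {"px": 0.0, "py": 5.0, "pz": 0.0, "E": 5.0}
forward = {"px": 0.01, "py": 0.0, "pz": 10.0, "E": 10.0}  # below 5 degrees
```

In the toy event, the lepton fails the isolation requirement when the nearby 2 GeV particle is present and passes it otherwise, while the forward particle would contribute to the missing-energy 4-vector.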

Table 2 Bottom (b), charm (c) and light (uds) quarks, gluon (g), tau lepton (\(\tau _\text {had}\), hadronically decaying), and photon/electron (mis)reconstruction performances assumed in this study

The (mis)tagging jet-flavour performances are beyond the current state-of-the-art reached at the LHC today, but reasonably achievable in the “clean” environment of \(\mathrm {e^+e^-}\) collisions with dedicated high-precision FCC-ee detectors after various years of operation at the Z pole and HZ energies. More details on the various jet working points assumed are provided in the next section. We note that since the analysis boils down to basically just counting the number of events sharing a given predefined final state, any detector resolution/smearing effects on kinematic properties of the reconstructed objects (jets, \(\ell ^\pm \), \(\gamma \), etc.) impact identically signals and backgrounds, will be very well controlled comparing real data and simulations and can be accounted for here just through a (small) assigned systematic uncertainty on the final yields when computing the final statistical significance of each channel.

In Table 3, we list the criteria applied to all signal and backgrounds events aiming at a first preselection of final-state topologies consistent with each considered Higgs decay channel. The goal of this first set of cuts is to remove reducible backgrounds as much as possible, while keeping the largest possible signal cross section. For the \(\mathrm {H}\rightarrow \tau \tau \) channel, we consider only the fully hadronic (\(\tau _\mathrm {had}\tau _\mathrm {had}\)) decay, which is \(0.65\!\cdot \!0.65/(0.35\!\cdot \!0.35)\approx 3.5\) times more probable than the fully leptonic one \(\mathrm {H}\rightarrow \tau _\mathrm {lep}\tau _\mathrm {lep}\) (that has thereby a negligible number of signal counts expected after cuts). The last column quotes the approximate percentage of cross section signal retained by the chosen criteria.

Table 3 Minimal event final-state definition for each considered Higgs decay channel and associated preselection efficiency (after acceptance, and reconstruction (in)efficiencies of Table 2). The \(\ell ^\pm \) symbol indicates \(\mathrm {e}^\pm , \mu ^\pm , \tau _\mathrm {lep}^\pm \) charged leptons

4 Multivariate analysis (MVA) per channel

For each reconstructed event of all generated MC samples passing the aforementioned preselection criteria per target Higgs channel, we define \(\mathcal {O}(50)\) variables for single and combined (n-wise) physics objects (jets, charged leptons, photons, neutrinos), as well as for global event properties, in order to provide as much information as possible to a subsequent MVA used to discriminate signal and the remaining backgrounds. The defined variables include kinematic components \((p_{_{\mathrm {T}}},\eta ,\phi ,E)\), charge, mass (invariant and transverse) for each single object (as well as the same quantities for sums and differences of 4-momenta of selected n-wise object combinations), the maximum and minimum values of \(p_{_{\mathrm {T}}}^{i(ij)},\,\eta ^{i,(ij)},\phi ^{i,(ij)}\), \(m_{ij}\), etc., in the event for all (pairs of) objects i (ij), as well as quantities associated with global event topologies (sphericity, linearity, aplanarity, thrust max/min, etc.). Angular information is particularly useful in diboson channels with decay leptons in order to separate final states coming through the spin-0 Higgs resonance or proceeding through t-channel processes or via spin-1 s-channel continuum and/or \(Z^{*}\), \(\gamma ^{*}\), \(\text {W}^{\pm }\) decays. For such cases, angular discrimination variables based on the Matrix Element Likelihood Analysis (MELA) [39] are also defined and incorporated into the MVA. We used the TMVA framework [40] to train and test boosted decision-tree (BDT) classifiers in order to provide statistical discrimination between each Higgs decay channel and all relevant background final states, and maximize the signal significance. Examples of the BDT variables used for a particular channel (\(\mathrm {H}\rightarrow \mathrm {W}\mathrm {W}^*\rightarrow \ell \nu 2j\)) are shown in Fig. 2 (right), and are listed with their individual relative weights in the final signal significance in Table 5.

Fig. 2 Left: Example of normalized BDT response distributions for signal and backgrounds in the \(\mathrm {H}\rightarrow gg\) channel. Right: Examples of a few of the most discriminating (normalized) BDT variables of the \(\mathrm {H}\rightarrow \mathrm {W}\mathrm {W}^* \rightarrow \ell \nu \;2j\) analysis

Table 4 lists the number of signal and background(s) events expected after preselection and BDT output cuts, for 9 different final states. We omit the \(\mathrm {H}\rightarrow \gamma \,\gamma ,c\overline{c}\) channels from the table given that they are fully swamped by backgrounds and have a negligible statistical significance. The first observation is that except for the \(\mathrm {H}\rightarrow b\overline{b}\) decay, which is anyway overwhelmed by the continuum background, the final number of signal events is (well) below 100 counts for each individual channel, and that the remaining backgrounds counts are orders-of-magnitude larger. Therefore, the leading uncertainty of the signal will be of statistical nature, and evidence of any excess will rely on an accurate control of the background systematic uncertainties (which must be well below the statistical ones). Among the listed channels, we observe that \(\mathrm {H}\rightarrow gg\) and \(\mathrm {H}(\mathrm {W}\mathrm {W}^*) \rightarrow \ell \nu \;2j\) feature the largest \(S/\sqrt{B}\) significances,Footnote 1 and are discussed in more detail in dedicated subsections below. The \(\mathrm {H}\rightarrow b\overline{b}\) channel suffers from a very large irreducible background, the MVA is unable to improve the rejection of the \(\mathrm {e^+e^-}\rightarrow b\overline{b}\) continuum much beyond the preselection result, and the final statistical significance remains very low (\(S/\sqrt{B}\approx 0.12\)). Although orders-of-magnitude smaller, we also quote the number of misidentified \(\mathrm {e^+e^-}\rightarrow c\overline{c},q\overline{q}\) background events expected for this channel, so as to assess the potential contamination from such processes if the mistagging points assumed in Table 2 are changed. 
The \(\mathrm {H}\rightarrow \tau _\mathrm {had}\tau _\mathrm {had}\) decay mode (as well as, similarly, the \(\mathrm {H}\rightarrow c\overline{c}\) one not listed) suffers from very low signal counts and a daunting continuum background that yields a negligible statistical significance (\(S/\sqrt{B}\approx 0.02\)). Among \(\mathrm {H}\rightarrow \mathrm {W}\mathrm {W}^*\) final states, the fully leptonic one (\(2\ell 2\nu \)) features the smallest branching fraction and thereby very low final signal counts. For the two others, lepton+jets (\(\ell \nu \;2j\)) and fully hadronic (4j) decays, although they have the same branching fraction, only the former can take advantage of background removal by exploiting the different \(\mathrm {W}^\pm \rightarrow \ell ^\pm \) decay lepton polarizations for signal and background processes, as explained below. Finally, Table 4 shows that the \(\mathrm {H}\rightarrow \mathrm {Z}\mathrm {Z}^*\) final states will have less than \(\sim \)10 signal events expected after cuts, over much larger backgrounds, and appear statistically marginal in terms of signal significance.

Table 4 Number of reconstructed events expected after preselection N(presel.) and BDT output N(MVA) cuts, for s-channel Higgs decay modes and associated dominant backgrounds in \(\mathrm {e^+e^-}\) collisions at \(\sqrt{s}= m_\mathrm {H}\) (\(\delta _{\sqrt{s}} = 4.1\) MeV and \(\mathcal {L}_\mathrm {\tiny {int}}= 10\) ab\(^{-1}\))

4.1 Analysis of \(\mathrm {e^+e^-}\rightarrow \mathrm {H}(gg) \rightarrow jj\)

At face value, the digluon decay is a very promising signal channel as it has the third most abundant Higgs branching fraction (\(\mathcal {B}(\mathrm {H}\rightarrow gg) = 8.2\%\)) and has no irreducible physical background because Z and \(\gamma \) bosons do not couple to gluons. However, the production of light quark (uds) pairs in the much more abundant \(\mathrm {e^+e^-}\rightarrow \mathrm {Z}^*,\gamma ^*\rightarrow q\overline{q}\) process (with cross sections a million times larger than that of the signal, Table 1), jeopardizes the observation of \(\mathrm {H}\rightarrow gg\) because the experimental separation of jets issuing from the showering and hadronization of light quarks from those issuing from gluons is not perfect.Footnote 2 An illustrative case would be the emission of a very hard gluon from each one of the \(\mathrm {Z}^*\rightarrow q\overline{q}\) quarks that could mimic the Higgs digluon final state. Fortunately, in recent years there has been tremendous progress on quark-gluon tagging studies exploiting jet substructure properties with machine learning techniques [42]. The latest LHC results reach \(\varepsilon _g\approx 60\%\) gluon efficiencies with \(\varepsilon ^\mathrm {mistag}_{q\rightarrow g}\approx 10\%\) false positive rates using advanced multivariate analyses [43, 44], or \(\varepsilon ^\mathrm {mistag}_{q\rightarrow g}\approx 7\%\) [45] further exploiting Lund jet plane information [46]. 
Reaching mistagging rates down to \(\varepsilon ^\mathrm {mistag}_{q\rightarrow g}\approx 1\%\), while keeping large gluon reconstruction efficiencies, appears feasible in the clean and kinematically constrained QCD environment of future \(\mathrm {e^+e^-}\) machines, in particular taking advantage of the very large samples of \(\mathrm {Z}\rightarrow q\overline{q}(g)\) events at the Z pole, and the \(\mathcal {O}(10^5)\) \(\mathrm {H}\rightarrow gg\) events collected during the \(\mathrm {e^+e^-}\rightarrow \mathrm {Z}\mathrm {H}\) runs, available for dedicated studies of the different colour, radiation, spin, charge, hadronization properties of quark and gluon jets [47,48,49]. The addition of advanced hadron identification capabilities to the FCC-ee detectors for dedicated flavour [50] (and QCD) studies, will further reduce the parton-to-hadron fragmentation uncertainties [51]. Our assigned (mis)reconstruction jet working point for this channel is \((\varepsilon _g,\varepsilon ^\mathrm {mistag}_{q\rightarrow g})=(70\%,1\%)\), which leads to a \(10^{-4}\) background rejection factor when requiring two gluon-tagged exclusive jets in the event. The corresponding number of events expected in 10 ab\(^{-1}\) for signal and background, after acceptance and efficiency preselections, is 110 and \(\sim \)61 000, respectively (Table 4). The subsequent MVA is performed removing beforehand any jet variable that may have been potentially used to define the light-q/gluon separation, and which is therefore de facto already accounted for by the chosen preselection (mis)tagging efficiency. An analysis of the BDT response (Fig. 2, left) indicates a maximum significance reached for a BDT output cut that further reduces the background by a factor \(\times 0.04\) while only losing 50% of the signal. The final statistical significance reached for this channel is approximately given by \(S/\sqrt{B} = 55/\sqrt{2400} = 1.1\sigma \) per FCC-ee IP per year.
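The event counts of this channel follow from simple products of the quantities quoted above, as the following sketch illustrates. It is a back-of-the-envelope check with illustrative variable names, not the authors' analysis code: the detector acceptance is not modelled, so the preselected signal yield is only expected to agree approximately with the 110 events of Table 4.

```python
import math

SIGMA_H_FB = 0.28        # benchmark s-channel cross section (fb)
LUMI_FB = 10_000.0       # 10 ab^-1 expressed in fb^-1
BR_GG = 0.082            # B(H -> gg)
EFF_GLUON = 0.70         # gluon-tagging efficiency per jet
MISTAG_Q_TO_G = 0.01     # light-quark -> gluon mistag rate per jet

# Signal: both jets must be gluon-tagged (acceptance losses not included here)
signal_produced = SIGMA_H_FB * LUMI_FB * BR_GG
signal_preselected = signal_produced * EFF_GLUON ** 2

# Background: both light-quark jets must be mistagged as gluons
double_mistag_rejection = MISTAG_Q_TO_G ** 2

# Final counts quoted in the text after the BDT cut
signal_final = 55.0
background_final = 2400.0
background_preselected = 61_000.0

bdt_background_factor = background_final / background_preselected
bdt_signal_efficiency = signal_final / 110.0
significance = signal_final / math.sqrt(background_final)
```

The ratio of final to preselected background counts corresponds to the quoted BDT background reduction, and the resulting \(S/\sqrt{B}\) reproduces the \(1.1\sigma \) figure.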

Table 5 Indicative list of BDT variables used in the \(\mathrm {H} \rightarrow \mathrm {W}\mathrm {W}^* \rightarrow \ell \nu \;2j\) analysis, with their relative weight in the statistical significance for this channel

4.2 Analysis of \(\mathrm {e^+e^-}\rightarrow \mathrm {H}(\mathrm {W}\mathrm {W}^*) \rightarrow \ell \nu \;2j\)

The event signature of the \(\mathrm {H}(\mathrm {W}\mathrm {W}^*) \rightarrow \ell \nu \;2j\) signal is one isolated charged lepton, missing energy from the neutrino, and two exclusive jets. In principle, such a final state can be present in multiple reducible backgrounds (Table 4), but the MVA study allows one to remove basically all of them, leaving just a fraction of the \(\mathrm {e^+e^-}\rightarrow \mathrm {W}\mathrm {W}^* \rightarrow \ell \nu \;2j\) continuum. Table 5 lists the BDT variables used in the analysis, together with their relative weight in the final signal significance for this channel. Rather than blindly running the MVA, it is instructive to show the impact of different kinematic cuts used to get rid of reducible backgrounds. For example, a significant fraction of \(\mathrm {e^+e^-}\rightarrow q\overline{q},c\overline{c},b\overline{b}\) events can be eliminated by requiring e.g.: \(E_{j1,j2} < 52,45\) GeV, \(m_{\mathrm {W}(\ell \nu )} > 12\) GeV, \(E_{\ell } > 10\) GeV, \(E_\mathrm {miss}>\) 20 GeV. The additional requirement on the mass of the missing 4-momentum vector \(m_\mathrm {miss} < 3\) GeV further discards many \(\mathrm {e^+e^-}\rightarrow \tau \tau \) events.

The remaining background is dominated by the \(\mathrm {W}\mathrm {W}^*\) continuum that can then be reduced by exploiting, among others, the different \(\mathrm {W}^\pm \) polarizations for signal and background processes. The signal decay \(\mathrm {H} \rightarrow \mathrm {W}^+\mathrm {W}^-\) is that of a scalar to a pair of distinguishable spin-1 bosons. The subsequent W boson decays maximally violate chirality: a \(\mathrm {W}^+\) (\(\mathrm {W}^-\)) boson preferentially emits an \(\ell ^+\) (\(\ell ^-\)) along (against) its spin direction. The anticorrelation between the \(\mathrm {W}^\pm \) polarizations expected in spin-zero Higgs decays is transferred into a correlation between the momenta of the charged leptons in their decays that manifests itself in the distributions of relative \(\ell ^\pm \) polar angles and a preference for a small azimuthal angle (\(\phi \)) between the \(\ell ^+\ell ^-\) pair. Such angular correlations of the emitted charged leptons are encoded into the MELA variables exploited by the ATLAS and CMS collaborations to separate Higgs decays from \(\mathrm {W}^+\mathrm {W}^-\) backgrounds in their original searches [4, 5]. Examples of discriminating BDT variable distributions for signal and backgrounds are shown in Fig. 2 (right). Applying an appropriate cut on the BDT response output keeps a 58% signal efficiency while removing 80% of the continuum background. The final statistical significance of this final state is of the order of \(S/\sqrt{B} = 55/\sqrt{11\,000}\approx 0.5\sigma \) per FCC-ee IP per year.

5 Beam monochromatization, expected signal significance and \(y_\mathrm {e}\) constraints

Table 6 lists the statistical significances, in units of std. deviations \(\sigma \), for each individual s-channel Higgs decay channel studied here, for our baseline \((\delta _{\sqrt{s}},\mathcal {L}_\mathrm {\tiny {int}}) = (4.1\,\mathrm {MeV},10\,\mathrm {ab}^{-1})\) monochromatization assumption. The combined final significance and associated 95% CL upper limit are calculated considering a multibin counting experiment with a profile likelihood for hypothesis test and confidence interval, using the RooStats statistical package [52]. We have considered \(10^{-4}\) fractional systematic uncertaintiesFootnote 3 for the backgrounds, consistent with the expected experimental precision aimed at FCC-ee [24]. The final combined significance is \(1.3\sigma \), which is also very close to the naive quadratic sum of individual \(S/\sqrt{B}\) values per channel. Such a result is equivalent to setting a 95% CL upper limit of 2.6 times the SM Higgs s-channel cross section, per FCC-ee IP and per year. Since the cross section depends on the square of the electron Yukawa, \(\sigma _{\mathrm {e^+e^-}\rightarrow \mathrm {H}}\propto y_\mathrm {e}^2\), this corresponds to placing an upper bound on the coupling at \(\sqrt{2.6}=1.6\) times the SM value, i.e. \(|y_\mathrm {e}|<1.6|y^\mathrm {\textsc {sm}}_\mathrm {e}|\) (95% CL).
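The relations between per-channel significances, their naive combination, and the Yukawa limit can be illustrated with a short sketch. The combination in the text is obtained with a profile likelihood in RooStats; the code below does not reproduce that procedure. It only evaluates \(S/\sqrt{B}\), the standard asymptotic (Asimov) approximation for a single-bin counting experiment as a swapped-in cross-check, the quadratic sum of the two leading channels discussed in Sect. 4, and the conversion of a cross section limit into a coupling limit via \(\sigma \propto y_\mathrm {e}^2\). Function names are illustrative.

```python
import math

def simple_significance(s, b):
    """Gaussian approximation S / sqrt(B)."""
    return s / math.sqrt(b)

def asimov_significance(s, b):
    """Asymptotic single-bin Poisson significance, sqrt(2[(s+b) ln(1+s/b) - s])."""
    return math.sqrt(2.0 * ((s + b) * math.log1p(s / b) - s))

def quadrature_sum(significances):
    """Naive combination of independent channels."""
    return math.sqrt(sum(z * z for z in significances))

def yukawa_limit(cross_section_limit_over_sm):
    """Limit on |y_e / y_e^SM| from a limit on sigma / sigma_SM (sigma ~ y_e^2)."""
    return math.sqrt(cross_section_limit_over_sm)

z_gg = simple_significance(55.0, 2400.0)        # H -> gg
z_gg_asimov = asimov_significance(55.0, 2400.0)
z_ww = simple_significance(55.0, 11_000.0)      # H -> WW* -> l nu 2j
z_two_channels = quadrature_sum([1.1, 0.5])     # rounded values from the text
kappa_limit = yukawa_limit(2.6)
```

With \(S\ll B\), the Asimov and \(S/\sqrt{B}\) estimates agree closely. The two leading channels alone account for most of the \(1.3\sigma \) combined result, with the remaining channels of Table 6 contributing the rest.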

The expected final significance of the \(\sigma _{\mathrm {e^+e^-}\rightarrow \mathrm {H}}\) measurement and associated 95% CL limits on \(|y_\mathrm {e}|\), obtained for a benchmark \(\delta _{\sqrt{s}} = 4.1\) MeV collision-energy spread and \(\mathcal {L}_\mathrm {\tiny {int}}= 10\,\mathrm {ab}^{-1}\) integrated luminosity, can be easily rescaled to any other combination of \((\delta _{\sqrt{s}},\mathcal {L}_\mathrm {\tiny {int}})\) values achievable through beam monochromatization. Figure 3 shows the bidimensional maps for the significance of s-channel Higgs production (left) and the corresponding 95% CL upper limits on the electron Yukawa (right), as a function of both parameters. The signal significance and associated upper limits improve with the square root of the integrated luminosity (along the x-axes of both plots) and diminish for larger values of \(\delta _{\sqrt{s}}\) (along the y-axes of the maps) following the relativistic Voigtian dependence of the signal yield on the energy spread shown in Fig. 1 (right).
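The luminosity rescaling at fixed energy spread can be sketched as below. This is a simplified illustration under stated assumptions, not the procedure used to build Fig. 3: the measurement is taken to be statistics-limited and background-dominated, so that the significance grows as \(\sqrt{\mathcal {L}_\mathrm {\tiny {int}}}\), the cross section limit falls as \(1/\sqrt{\mathcal {L}_\mathrm {\tiny {int}}}\), and the coupling limit therefore falls as \(\mathcal {L}_\mathrm {\tiny {int}}^{-1/4}\). The dependence on \(\delta _{\sqrt{s}}\) and any systematic floor are ignored, and the function names are illustrative.

```python
import math

def rescale_significance(z0, lumi0, lumi):
    """Significance at luminosity `lumi`, given z0 at lumi0 (same energy spread)."""
    return z0 * math.sqrt(lumi / lumi0)

def lumi_for_significance(z_target, z0, lumi0):
    """Luminosity needed to reach z_target, given z0 at lumi0."""
    return lumi0 * (z_target / z0) ** 2

def rescale_yukawa_limit(kappa0, lumi0, lumi):
    """Limit on |y_e / y_e^SM|: sigma limit ~ 1/sqrt(L) and sigma ~ y_e^2."""
    return kappa0 * (lumi0 / lumi) ** 0.25

# Baseline from the text: 1.3 sigma and |y_e| < 1.6 |y_e^SM| for 10 ab^-1
z_fourfold = rescale_significance(1.3, 10.0, 40.0)
lumi_5sigma = lumi_for_significance(5.0, 1.3, 10.0)
kappa_sixteenfold = rescale_yukawa_limit(1.6, 10.0, 160.0)
```

Under these assumptions, four times the baseline luminosity doubles the significance, whereas halving the coupling limit requires sixteen times more data.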

Table 6 Individual significances (in std. deviations \(\sigma \)) expected per decay channel for s-channel Higgs boson production in \(\mathrm {e^+e^-}\) collisions at FCC-ee for \(\mathcal {L}_\mathrm {\tiny {int}}= 10\) ab\(^{-1}\) and \(\delta _{\sqrt{s}}=4.1\) MeV. The last column quotes the combined significance
Fig. 3 Left: Significance contours (in std. dev. units \(\sigma \)) in the c.m. energy spread vs. integrated luminosity plane for the resonant \(\sigma _{\mathrm {e^+e^-}\rightarrow \mathrm {H}}\) cross section at \(\sqrt{s}= m_\mathrm {H}\). Right: Associated upper limits contours (95% CL) on the electron Yukawa \(y_\mathrm {e}\). The red curves show the range of parameters presently reached in FCC-ee monochromatization studies [20, 21]. The red star indicates the best signal strength monochromatization point in the plane (the pink star over the \(\delta _{\sqrt{s}} = \varGamma _\mathrm {H} =4.1\) MeV dashed line indicates the ideal baseline point assumed in our default analysis). All results are given per IP and per year

The red curves in Fig. 3 show the current expectations for the range of \((\delta _{\sqrt{s}},\mathcal {L}_\mathrm {\tiny {int}})\) values achievable at FCC-ee with the investigated monochromatization schemes [20, 21]. Without monochromatization, the FCC-ee natural collision-energy spread at \(\sqrt{s}= 125\) GeV is about \(\delta _{\sqrt{s}} = 46\) MeV due to synchrotron radiation. Its reduction to the few-MeV level desired for the s-channel Higgs run can be accomplished by means of monochromatization, e.g. by introducing nonzero horizontal dispersions at the IP (\(D_x^*\)) of opposite sign for the two beams in collisions without a crossing angle. The beam energy spread reduction factor is given by \(\lambda = \sqrt{({D_x^*}^2\sigma _{\delta }^{2})/(\varepsilon _x\beta _x^*)+1}\), where \(\beta _{x(y)}^*\) denotes the horizontal (vertical) beta function at the IP and \(\varepsilon _{x(y)}\) the corresponding emittance. The need to generate a significant IP dispersion implies a change of beamline geometry in the interaction region and the use of crab cavities to compensate for the existing, or remaining, crossing angle. A nonzero IP dispersion leads to an increase of the transverse horizontal emittance from beamstrahlung, thereby impacting the beam luminosity. Optimization of the IP optics parameters (\(D_x^*\), \(\beta _{x,y}^*\), etc.) yields the corresponding red curves of Fig. 3. For the lowest collision-energy spread achieved of \(\delta _{\sqrt{s}} = 6\) MeV, the anticipated monochromatized luminosity per IP exceeds \(10^{35}\,\mathrm {cm}^{-2}\mathrm {s}^{-1}\) [21]. This translates into an integrated luminosity (see footnote 4) of at least 1.2 ab\(^{-1}\) per IP per year. One can reach larger integrated luminosities at the expense of a worse beam energy spread.
The point (red star) over the red curves that has the highest signal strength today corresponds to \((\delta _{\sqrt{s}},\mathcal {L}_\mathrm {\tiny {int}}) \approx (7\,\mathrm {MeV}, 2\,\mathrm {ab}^{-1})\), to be compared to our original baseline point (pink star) over the \(\delta _{\sqrt{s}} = \varGamma _\mathrm {H} =4.1\) MeV dashed line. For such a 7-MeV c.m. energy spread, the peak of the relativistic Voigtian distribution describing the s-channel cross section is located at about 1 MeV above the mass of the Higgs boson (Fig. 1, right). Therefore, the optimal c.m. energy of the dedicated \(\mathrm {e^+e^-}\) run also needs to be carefully chosen to maximize the resonant cross section for any given monochromatization point.
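The luminosity and energy-spread arithmetic of this discussion can be sketched in Python as follows. The sketch rests on two assumptions that are not stated in the text above: an effective running time of \(1.2\cdot 10^{7}\) s per year, chosen here because it reproduces the quoted 1.2 ab\(^{-1}\) per IP per year, and the reading of \(\lambda \) as the factor by which the natural collision-energy spread is divided. All function names are illustrative.

```python
import math

CM2_PER_INVERSE_AB = 1e42  # 1 ab^-1 = 1e42 cm^-2, since 1 b = 1e-24 cm^2

def integrated_luminosity_ab(inst_lumi_cm2s, seconds_per_year):
    """Integrated luminosity in ab^-1 per year from an instantaneous luminosity."""
    return inst_lumi_cm2s * seconds_per_year / CM2_PER_INVERSE_AB

def monochromatization_factor(d_x, sigma_delta, eps_x, beta_x):
    """lambda = sqrt(D_x*^2 sigma_delta^2 / (eps_x beta_x*) + 1), consistent units assumed."""
    return math.sqrt((d_x ** 2 * sigma_delta ** 2) / (eps_x * beta_x) + 1.0)

def required_factor(natural_spread_mev, target_spread_mev):
    """Reduction factor needed to go from the natural to the target spread (assumed reading)."""
    return natural_spread_mev / target_spread_mev

# 1e35 cm^-2 s^-1 with an ASSUMED 1.2e7 s of effective running per year
print(round(integrated_luminosity_ab(1e35, 1.2e7), 2))

# Reduction factors needed from the 46 MeV natural spread
for target_mev in (7.0, 6.0, 4.1):
    print(target_mev, round(required_factor(46.0, target_mev), 1))
```

With zero IP dispersion, `monochromatization_factor` returns 1, i.e. no reduction, as expected from the formula.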

Compared to our baseline values (pink stars on the plots), the signal significance for the currently best monochromatization settings, \((\delta _{\sqrt{s}},\mathcal {L}_\mathrm {\tiny {int}}) \approx (7\,\mathrm {MeV}, 2\,\mathrm {ab}^{-1})\), drops to \(\mathcal {S}\approx 0.4\sigma \)/year/IP, and the corresponding upper bound on the \(\mathrm {e}^\pm \) Yukawa becomes \(y_\mathrm {e}\lesssim 2.5 y^\mathrm {\textsc {sm}}_\mathrm {e}\) (95% CL) per year and per IP. Assuming 2 years of FCC-ee operation at the Higgs pole and combining four detectors/IPs, this would translate into a \(1.2\sigma \) significance and a \(y_\mathrm {e}\lesssim 1.6 y^\mathrm {\textsc {sm}}_\mathrm {e}\) limit. Such a result, although clearly short of evidence for s-channel Higgs production, is still about 100 (30) times better [53] than that reachable at HL-LHC (FCC-hh [54]) and would imply setting a constraint on new physics affecting the electron-Higgs coupling at \(\varLambda _{\textsc {bsm}} \gtrsim 110\) TeV.
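The quoted new-physics scale can be cross-checked numerically with the relation \(\varLambda _{\textsc {bsm}} \approx v^{3/2}(\sqrt{2}m_\mathrm {e}\cdot (y_\mathrm {e}/y^\mathrm {\textsc {sm}}_\mathrm {e}))^{-1/2}\) used in this paper, with \(v = 246.22\) GeV and \(m_\mathrm {e}= 0.511\cdot 10^{-3}\) GeV. The following Python lines are a minimal sketch of that arithmetic; the function name `lambda_bsm_tev` is illustrative.

```python
import math

V_GEV = 246.22       # Higgs vacuum expectation value (GeV)
M_E_GEV = 0.511e-3   # electron mass (GeV)

def lambda_bsm_tev(kappa_e):
    """BSM scale in TeV for a limit kappa_e = y_e / y_e^SM.

    Implements Lambda ≈ v^{3/2} * (sqrt(2) * m_e * kappa_e)^{-1/2}.
    """
    if kappa_e <= 0:
        raise ValueError("kappa_e must be positive")
    lam_gev = V_GEV ** 1.5 / math.sqrt(math.sqrt(2.0) * M_E_GEV * kappa_e)
    return lam_gev / 1000.0

# Yukawa limits in units of the SM value: HL-LHC projection and this analysis
for kappa in (120.0, 1.6):
    print(kappa, round(lambda_bsm_tev(kappa), 1))
```

For a limit of 1.6 times the SM coupling this gives a scale slightly above 110 TeV, in line with the bound quoted above; since \(\varLambda _{\textsc {bsm}}\) scales as the inverse square root of the coupling limit, a roughly 75 times tighter Yukawa bound raises the scale by a factor of about 9.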

Given that any improved analysis of the Higgs decay channels is unlikely to increase the final signal significance by much, alternative paths need to be considered in order to measure the electron Yukawa coupling more precisely at FCC-ee. The possibility of introducing beam longitudinal polarizations (\(P_\mathrm {L}\)) would enhance the signal by \((1+P_\mathrm {L}^2)\) and suppress backgrounds by \((1-P_\mathrm {L}^2)\), i.e. running with \(P_\mathrm {L} = 68\%\) (90%) would increase the statistical significance of the signal by a factor of two (four). However, for realistic longitudinal polarizations reachable at FCC-ee (\(P_\mathrm {L}=20\)–30%) the gain would be insufficient, and higher polarizations would significantly reduce the luminosity. The only approach seemingly left to carry out an \(\mathrm {e^+e^-}\rightarrow \mathrm {H}\) measurement with a sensitivity reaching the SM electron Yukawa level requires improving the beam monochromatization beyond the current state-of-the-art [20, 21]. Alternative or modified monochromatization scenarios [55,56,57] are being explored, but for now they do not improve on the results of the red curves shown in Fig. 3.
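The polarization gains quoted above follow from combining the two scalings: if the signal grows as \((1+P_\mathrm {L}^2)\) and the background shrinks as \((1-P_\mathrm {L}^2)\), a significance of the \(S/\sqrt{B}\) type changes by \((1+P_\mathrm {L}^2)/\sqrt{1-P_\mathrm {L}^2}\). The Python sketch below evaluates this factor, assuming a purely statistical \(S/\sqrt{B}\) significance and no luminosity loss from polarization; the function name is illustrative.

```python
import math

def polarization_gain(p_l):
    """Gain in S/sqrt(B) for longitudinal polarization p_l (0 <= p_l < 1).

    Signal scales as (1 + p_l^2), background as (1 - p_l^2).
    """
    if not 0.0 <= p_l < 1.0:
        raise ValueError("p_l must be in [0, 1)")
    return (1.0 + p_l ** 2) / math.sqrt(1.0 - p_l ** 2)

for p_l in (0.20, 0.30, 0.68, 0.90):
    print(f"P_L = {p_l:.2f} -> gain = {polarization_gain(p_l):.2f}")
```

The gain is close to 2 at \(P_\mathrm {L} = 68\%\) and close to 4 at 90%, whereas it stays below about 1.15 for \(P_\mathrm {L}=20\)–30%, which is why realistic polarizations do not help much.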

6 Summary and outlook

The prospects for a potential FCC-ee measurement of the direct s-channel Higgs boson production in \(\mathrm {e^+e^-}\) collisions at \(\sqrt{s}= m_\mathrm {H}\) have been studied as a means to determine the Higgs Yukawa coupling of the electron (\(y_\mathrm {e}\)). The three main challenges of such a measurement have been discussed: (i) the need to know beforehand, to within a few MeV, the value of the Higgs boson mass at which to operate the collider, (ii) the smallness of the resonant Higgs boson cross section (few hundred ab) due to ISR and beam energy spread (\(\delta _{\sqrt{s}}\)), which requires monochromatizing the beams, i.e. reducing \(\delta _{\sqrt{s}}\) to the few-MeV scale, while still delivering large (few ab\(^{-1}\)) integrated luminosities \(\mathcal {L}_\mathrm {\tiny {int}}\), and (iii) the existence of multiple backgrounds with cross sections orders of magnitude larger than those of the Higgs signal decay channels themselves. The knowledge of \(m_\mathrm {H}\) with a few MeV accuracy seems feasible at FCC-ee as per dedicated studies reported in Ref. [23]. The present work has focused on points (ii) and (iii) above, by performing a generator-level study that has chosen as benchmark point a baseline monochromatization scheme leading to \((\delta _{\sqrt{s}},\mathcal {L}_\mathrm {\tiny {int}})=(4.1\,\mathrm {MeV},10\,\mathrm {ab}^{-1})\), corresponding to a peak s-channel cross section of \(\sigma _{\mathrm {e^+e^-}\rightarrow \mathrm {H}} = 280\) ab.

Large simulated event samples of signal and associated backgrounds have been generated with the pythia 8 Monte Carlo (MC) code for 11 Higgs boson decay channels. A simplified description of the expected experimental performances has been assumed for the reconstruction and (mis)tagging of heavy-quark (c, b) and light-quark and gluon (udsg) jets, photons, electrons and hadronically decaying tau leptons. Generic preselection criteria have been defined that target the 11 Higgs boson channels, suppressing reducible backgrounds while keeping the largest fraction of the signal events. A subsequent multivariate analysis of \(\mathcal {O}(50)\) kinematic and global topological variables, defined for each event, has been carried out. Boosted-Decision-Tree (BDT) classifiers have been trained on signal and background events to maximize the signal significances for each individual channel. The most significant Higgs decay channels are found to be \(\mathrm {H}\rightarrow gg\) (for a gluon efficiency of 70% and a uds-for-g jet mistagging rate of 1%) and \(\mathrm {H}\rightarrow \mathrm {W}\mathrm {W}^*\rightarrow \ell \nu 2j\). Combining all results, a \(1.3\sigma \) signal significance can be achieved, corresponding to an upper limit on the e\(^\pm \) Yukawa coupling at 1.6 times the SM value: \(|y_\mathrm {e}|<1.6|y^\mathrm {\textsc {sm}}_\mathrm {e}|\) at 95% confidence level (CL), per FCC-ee interaction point (IP) and per year. Such a bound is about 100 (30) times better than that reachable at HL-LHC (FCC-hh) and can be translated into a lower limit on the energy scale of any physics beyond the SM (BSM) affecting the electron Yukawa coupling, of \(\varLambda _{\textsc {bsm}} \approx v^{3/2}(\sqrt{2}m_\mathrm {e}\cdot (y_\mathrm {e}/y^\mathrm {\textsc {sm}}_\mathrm {e}))^{-1/2} \gtrsim 110\) TeV.

Details on the status of ongoing FCC-ee monochromatization studies have been provided. The current monochromatization settings with the largest Higgs signal strength correspond to \((\delta _{\sqrt{s}},\mathcal {L}_\mathrm {\tiny {int}})\approx (7\,\mathrm {MeV},2\,\mathrm {ab}^{-1})\) and translate into a \(0.4\sigma \) significance on the Higgs boson cross section, or correspondingly a \(|y_\mathrm {e}|<2.6|y^\mathrm {\textsc {sm}}_\mathrm {e}|\) (95% CL) upper bound, per IP and per year. Forthcoming extension and consolidation of this work, in the context of the anticipated FCC feasibility study, require at least the following activities:

  (i) Confirming the current signal significances with alternative MC event generators for the Higgs diboson backgrounds, in particular for the promising \(\mathrm {H}\rightarrow \mathrm {W}\mathrm {W}^*\rightarrow \ell \nu 2j\) channel.

  (ii) Studying the improvements of the FCC-ee detectors design needed in order to achieve the required accuracy and precision in key aspects of the analysis, such as the small light-quark-for-gluon mistagging efficiency of 1% assumed in the key \(\mathrm {H}\rightarrow gg\) channel.

  (iii) Redoing the analysis using a more realistic (parametrized or full simulation) description of the detector response to more accurately assess the impact on the final signal significances of the reconstruction and selection efficiencies expected at FCC-ee.

  (iv) Continuing and extending the accelerator monochromatization studies to improve the currently best FCC-ee working point of \((\delta _{\sqrt{s}},\mathcal {L}_\mathrm {\tiny {int}})\approx (7\,\mathrm {MeV},2\,\mathrm {ab}^{-1})\), aiming at further reducing \(\delta _{\sqrt{s}}\) while increasing \(\mathcal {L}\), and developing the corresponding optical lattices for the required beam optics parameters at the IP.

It is worth noting that running FCC-ee at \(\sqrt{s}= m_\mathrm {H}\) for a couple (or more) years can provide many more scientific outputs than the direct s-channel measurement considered here. Indeed, integrating tens of ab\(^{-1}\) in \(\mathrm {e^+e^-}\) collisions at 125 GeV provides useful means to accurately determine the number of light neutrino families via \(Z(\nu \nu )\gamma \) radiative return [58], search for weakly coupled BSM physics between the Z and Higgs mass poles [59], and carry out other luminosity-demanding SM studies not accessible at the Z pole.

In summary, the results presented in this work demonstrate that FCC-ee is the best-suited (if not, arguably, the unique) collider that can aim at a measurement of the electron Yukawa coupling via direct s-channel Higgs boson production. Such a measurement has many fundamental physics motivations and implications, among which: (i) it will explore the (so far hypothetical) Higgs mass generation mechanism for elementary particles of the first family of fermions that form the stable matter of the visible universe, (ii) it will scrutinize the electron’s Yukawa coupling that, through its impact on the electron mass, sets the size of atoms and their energy levels (the Bohr radius is proportional to \(1/m_\mathrm {e}\)), (iii) it can access BSM scalar physics connected to the electron above the \(\sim \)100 TeV scale, and (iv) it can directly probe the potential presence of any new particle that is quasi-degenerate (at the MeV level) with the Higgs boson mass.