1 Introduction

In the early days of computational chemistry, it was routine to check the stability of the optimized Hartree-Fock (HF) wave function. As the field of computational chemistry has grown to include more neophytes utilizing a theoretical approach in the course of research, however, this practice has fallen into disuse. This can be problematic in the prediction of the structural and energetic properties of the systems under investigation, especially when determination of the correct electronic state is essential to achieve results that are even qualitatively correct.

It is known that there is more than one solution to the Hartree-Fock equations [1–3]. In fact, within a finite basis there can be O(3^N) solutions for a closed-shell system, where N is the number of basis functions used [1]. Most HF algorithms populate the initial orbitals according to the aufbau principle, wherein the lowest energy orbitals obtained from the initial guess of LCAO coefficients are populated to determine the lowest energy solution and thus the ground state of the molecule [4–6]. There are cases, however, in which the algorithm produces an excited state determinant rather than the ground state, as has been noted in the literature [3, 7, 8]. Optimization to an excited state can happen when there is a small HOMO-LUMO gap, when there are nearly degenerate determinants, or in other cases where a multireference treatment is more appropriate, such as when bonds are broken or formed. Even multireference calculations depend upon single-reference methods as the source of the initial orbitals from which an active space is chosen, and this is often reflected in the rate of convergence of the multireference wave function.

Optimization of an excited state determinant with Hartree-Fock orbitals forms the basis of extended Hartree-Fock theory for excited states [9]. However, it has been shown that in the case of a closed shell system, an electron in a virtual orbital does not experience the full interaction of the 2N electrons, and so a modified Fock operator should be employed to obtain a well-defined excited state in terms of a single determinant of Hartree-Fock orbitals. When the excited states are obtained unintentionally through population based on orbital energies, the Fock operator is not modified to account for the missing electronic interactions. Therefore, while the excited states obtained may be representative of the dominant configuration of a multiconfigurational excited state wave function, they do not include the entire mean field correlation.

It is imperative that the optimized Hartree-Fock wave function be scrutinized to ensure that the correct state has been determined. While there are some systems for which optimization to an excited state determinant is not detrimental to geometry optimizations (e.g. situations in which the potential energy curves (PECs) are mostly parallel, such as with Ln(III)-halide bonds) [10], this frequently is not the case. The optimized orbitals need to be inspected to ensure that the appropriate orbitals are populated, a process that generally can rely on chemical intuition.

A condition of Hartree-Fock convergence is that the orbital gradient is zero, i.e. ∂E/∂C_i = 0. However, as this condition can be met at several points on the orbital potential energy surface, the stability of the solution may need to be tested through the calculation of the orbital Hessian matrix [11]. Negative eigenvalues of the orbital Hessian indicate that a lower energy solution can still be reached; that is, a saddle point on the orbital potential energy surface has been found rather than a minimum. Such a situation could be remedied in a black-box manner by performing a Hartree-Fock instability test. However, this is only beneficial when the Hartree-Fock solution is a saddle point on the orbital potential energy surface.
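The eigenvalue test described above can be illustrated with a minimal sketch. It assumes a toy orbital Hessian over only two rotation parameters, so that the eigenvalues of the symmetric 2×2 matrix are available in closed form; the matrices, function names, and tolerance are hypothetical and are not taken from any electronic structure package.

```python
import math

def symmetric_2x2_eigenvalues(h):
    """Eigenvalues of a real symmetric 2x2 matrix [[a, b], [b, c]] in closed form."""
    a, b, c = h[0][0], h[0][1], h[1][1]
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return mean - radius, mean + radius

def classify_stationary_point(hessian, tol=1e-8):
    """Classify a zero-gradient point from the signs of the Hessian eigenvalues."""
    low, high = symmetric_2x2_eigenvalues(hessian)
    if low > tol:
        return "minimum"
    if high < -tol:
        return "maximum"
    if low < -tol and high > tol:
        return "saddle point"
    return "undetermined (near-zero eigenvalue)"

# Hypothetical Hessians for two orbital-rotation parameters.
stable = [[2.0, 0.5], [0.5, 1.0]]     # both eigenvalues positive
unstable = [[2.0, 0.0], [0.0, -2.0]]  # one negative eigenvalue

print(classify_stationary_point(stable))
print(classify_stationary_point(unstable))
```

In a real stability analysis the Hessian has one row per orbital rotation, so a general eigensolver would replace the closed-form expression used here.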

Often testing the stability of the wave function based on the orbital Hessian matrix can be unnecessary due to the optimization technique employed. One such situation is when a Newton-Raphson (NR) optimization technique is used. This technique depends on the calculation of both the orbital gradient and orbital Hessian at each step as shown in Eq. 1.

$$ C_{i,n + 1} = C_{i,n} - \frac{{\frac{\partial E}{{\partial C_{i,n} }}}}{{\frac{{\partial^{2} E}}{{\partial C_{i,n}^{2} }}}} $$
(1)

The orbital Hessian appears in the denominator of the second term of Eq. 1, and thus the NR method ensures that a local minimum on the orbital potential energy surface is located rather than a saddle point. However, the NR method only determines the nearest minimum rather than the global minimum. There exists a basin of attraction that is bounded by saddle points on the orbital potential energy surface. Only the minimum in this region is located, and there is no knowledge of any other minima outside the basin of attraction. Each local minimum on the orbital potential energy surface corresponds to a unique single determinant solution of the Hartree-Fock equations, i.e. a unique electronic state. Thus, the NR method guarantees convergence to a local minimum as long as ∂E/∂C_i ≠ 0 at the initial guess.
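The dependence on the basin of attraction can be sketched with a one-dimensional toy model. The quartic "energy" below is hypothetical and has nothing to do with a real SCF energy; it simply has two minima separated by a maximum, and the update inside the loop is the one-parameter form of Eq. 1. Both starting points were deliberately chosen in regions where the second derivative is positive.

```python
def energy(c):
    """Hypothetical double-well energy as a function of one coefficient."""
    return c**4 - 2.0 * c**2 + 0.5 * c

def gradient(c):
    return 4.0 * c**3 - 4.0 * c + 0.5

def hessian(c):
    return 12.0 * c**2 - 4.0

def newton_raphson(c0, tol=1e-12, max_iter=100):
    """Iterate c <- c - (dE/dc) / (d2E/dc2) until the step is negligible."""
    c = c0
    for _ in range(max_iter):
        step = gradient(c) / hessian(c)
        c -= step
        if abs(step) < tol:
            return c
    raise RuntimeError("Newton-Raphson did not converge")

left = newton_raphson(-1.5)   # initial guess in the left basin
right = newton_raphson(1.5)   # initial guess in the right basin

for label, c in (("left", left), ("right", right)):
    print(f"{label}: c = {c:.4f}, E = {energy(c):.4f}, Hessian = {hessian(c):.3f}")
```

Both runs end at a point with zero gradient and positive curvature, yet only the left one is the lower-energy solution; the right-hand guess never learns that it exists.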

Due to the expense involved in the computation of the full orbital Hessian, other convergence aids such as an approximate NR method or direct inversion of the iterative subspace (DIIS) are frequently employed [12]. Approximate NR methods frequently use an exact orbital gradient with an approximate orbital Hessian. Even the approximate orbital Hessian is often sufficient to ensure both convergence and convergence to a minimum, but the minimum will be the nearest minimum in the same basin of attraction. DIIS achieves convergence in an entirely different manner. An error vector is used to determine convergence as shown in Eq. 2.

$$ {\mathbf{e}} = {\mathbf{FDS}} - {\mathbf{SDF}} $$
(2)

The error vector is constructed from the Fock matrices (F), density matrices (D), and overlap matrices (S) from previous SCF iterations. Convergence is reached when the DIIS error goes to zero. However, the error vector is related to the orbital gradient and not the orbital Hessian. So with DIIS, there may occasionally be a need to test the stability of the Hartree-Fock solution since it converges to the nearest stationary point rather than the nearest minimum on the orbital potential energy surface.
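As a rough illustration of Eq. 2, the sketch below evaluates e = FDS − SDF for a two-function toy problem. It assumes an orthonormal basis (S is the identity), a fixed diagonal "Fock" matrix, and a one-electron density built from a single orbital rotated by an angle theta; all of these are simplifying assumptions made for the example, and the helper names are hypothetical.

```python
import math

def matmul(a, b):
    """Plain-Python matrix product for small square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def diis_error(f, d, s):
    """Error matrix e = FDS - SDF (Eq. 2)."""
    fds = matmul(matmul(f, d), s)
    sdf = matmul(matmul(s, d), f)
    n = len(f)
    return [[fds[i][j] - sdf[i][j] for j in range(n)] for i in range(n)]

def density_from_angle(theta):
    """Density of one occupied orbital (cos(theta), sin(theta))."""
    c = [math.cos(theta), math.sin(theta)]
    return [[c[i] * c[j] for j in range(2)] for i in range(2)]

F = [[-1.0, 0.0], [0.0, 0.5]]  # toy Fock matrix, held fixed for simplicity
S = [[1.0, 0.0], [0.0, 1.0]]   # orthonormal basis assumed

e_converged = diis_error(F, density_from_angle(0.0), S)  # orbital is an eigenvector of F
e_rotated = diis_error(F, density_from_angle(0.3), S)    # orbital mixed with the virtual

print(e_converged)
print(e_rotated)
```

The error vanishes when the occupied orbital is an eigenvector of F, whether that eigenvector is the lower or the upper one, which is the sense in which the DIIS criterion identifies stationary points but not minima.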

For ease of both convergence and the determination of the correct ground state, a good initial guess of the LCAO coefficients used to construct the molecular orbitals is essential. The initial guess is the only control the user has over which solution is obtained, since it determines the basin of attraction. There are several different options for the initial guess within each commonly available software package. One popular method is the diagonalization of a Fock matrix that contains only the one-electron terms, referred to as the core Hamiltonian matrix. Within this paper, this is referred to as the Hcore guess, following the nomenclature in GAMESS [13]. While this initial guess is usually fairly poor, it has the advantage of being implemented in most computational codes [5]. Another common approach is to use the guess generated by a semi-empirical procedure such as Extended Hückel Theory (referred to in this paper as the Hückel guess), projected onto the current basis. The initial guess provided by Hückel theory is generally superior to that provided by Hcore, yet it can still have difficulty assigning the initial electronic configuration based on the orbital population [13]. Another approach that can be useful, especially when calculating a potential energy curve, is to use the optimized orbitals from a nearby point on the potential energy surface, as is done automatically during a gradient-driven geometry optimization.

As Hartree-Fock does not account for electron correlation beyond mean field correlation, a myriad of correlated electron methods have been developed that are based upon a Hartree-Fock reference wave function. Configuration interaction (CI), coupled cluster (CC), and many body perturbation theory are examples of such theories designed to recover electron correlation energy. Full CI calculations do not depend on the quality of the reference wave function, yet full CI with an appropriately large basis set quickly becomes computationally intractable. Because of this increase in computational cost (in terms of memory and CPUs required for the calculation), more approximate methods are used. An example of a more approximate method would be coupled cluster including single and double excitations, with perturbative triples [CCSD(T)]. Although truncated, CCSD(T) has been shown to calculate ground state properties such as heats of formation, at times achieving more accurate results (i.e. closer to results achieved by full CI) than less approximate methods such as CCSDT [14].

However, the accuracy of truncated correlated methods can be sensitive to the choice of the reference wave function. CCSD(T) is a single reference post-Hartree-Fock method, generally based upon a HF wave function. If the reference wave function as determined by HF corresponds to an excited state determinant, truncated correlated methods are not necessarily able to produce the correct ground state. Even single reference theories such as the completely renormalized coupled cluster method, including singles, doubles, and perturbative triples [CR-CC(2,3)] that have been shown to treat some multireference problems [15, 16] (i.e. bond breaking, singlet-triplet gaps in biradical systems, and other systems with strong static correlation) still are subject to the limitations of the reference wave function. Furthermore, while it is possible to converge to a ground state wave function while using an excited state reference, the amplitudes can be much more challenging to converge.

There are other single-reference methodologies that use more than one determinant within their formulation, for example the spin flip method [17, 18]. The spin flip method and its variants use multiple determinants (the reference state and additional excited states that result from a spin flip of an electron) and are able to better describe events such as bond breaking. The reference state can be described with higher-level correlated methods, yet this increases the computational cost of the calculation. Additionally, for the simplest version of the spin flip method, Hartree-Fock is used to describe the reference system, so there is still some dependence on HF being able to determine the correct electronic configuration.

Multireference methods such as the multiconfigurational self-consistent field (MCSCF) [19] and complete active space perturbation theory, second order (CASPT2) [20], among many others, are designed specifically to recover non-dynamical correlation energy. However, there are drawbacks to these methods [21, 22]. The scaling of these methods is such that they are generally limited to systems of no more than sixteen active electrons in sixteen active orbitals [19]. Furthermore, the selection of an appropriate active space can be system dependent, which ensures that these methods are by no means “black box”. The HF wave function generally serves as the reference wave function for these methods as well and can be used to help determine the orbitals that should be included within the active space. A wave function that has converged to the wrong state can lead to very slow convergence or to completely inappropriate orbitals being included within the active space.
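The practical limit on the active space can be made concrete by counting Slater determinants. The sketch below assumes the standard combinatorial count for a complete active space with a fixed number of alpha and beta electrons (taking M_S = S); it counts determinants only and says nothing about the cost model of any particular MCSCF code. The function name is hypothetical.

```python
from math import comb

def n_determinants(n_electrons, n_orbitals, multiplicity=1):
    """Number of determinants with M_S = S in a complete active space."""
    n_unpaired = multiplicity - 1
    if n_electrons < n_unpaired or (n_electrons - n_unpaired) % 2:
        raise ValueError("electron count and multiplicity are inconsistent")
    n_beta = (n_electrons - n_unpaired) // 2
    n_alpha = n_electrons - n_beta
    return comb(n_orbitals, n_alpha) * comb(n_orbitals, n_beta)

# Singlet active spaces of increasing size.
for n in (8, 12, 16):
    print(f"({n},{n}): {n_determinants(n, n):,} determinants")
```

Going from an (8,8) to a (16,16) singlet active space raises the count from 4,900 to 165,636,900 determinants, which is why the choice of active orbitals cannot simply be made generous.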

Another popular approach to incorporate some degree of electron correlation is density functional theory (DFT). It has found use in the calculation of ground state properties for organic and inorganic molecules [23]. DFT has the benefit of including correlation within the calculation beyond the mean field correlation that is included within Hartree-Fock, yet the computational cost is on par with a Hartree-Fock calculation. DFT generally achieves a reasonable balance between computational cost and accuracy [23]. However, as a single reference method, it may suffer from some of the same limitations as Hartree-Fock theory.

The molecules included in this study were chosen for illustrative purposes only. While the molecules all have been the focus of prior extensive theoretical and experimental studies, our goal is not to provide a broad review of the literature, but rather to illustrate some of the problems that can manifest when the optimized orbital occupancy is not considered.

2 Computational Methods

The diatomic molecules chosen for this study were O2, F2, Cl2, Br2, LiF, NaCl, CaO, MgO, ScO, FeO, TiO, YO, and ZrO. These molecules cover many parts of the periodic table including main group diatomics with light and heavy atoms, diatomics containing s-block elements, and diatomics containing transition metals. Additionally, both closed shell and open shell species are included. All molecules studied were the neutral species. The experimental bond lengths were taken from the NIST Chemistry WebBook (http://www.webbook.nist.gov) [24].

To gauge the possible multireference character of the molecules, T1 and D1 diagnostic values were calculated [25–27]. The T1 and D1 diagnostic values are measures of the magnitude of the single-excitation amplitudes in the coupled cluster wave function and thus are frequently used to estimate the multireference character of a molecule. A T1 value of 0.02 and a D1 value of 0.05 are considered the multireference thresholds for main group and s-block containing molecules, while the more recently proposed thresholds of T1 greater than 0.05 and D1 greater than 0.15 are used for transition metal-containing compounds [28].
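The thresholds above can be written as a small screening helper. This is only a sketch: the dictionary keys and function name are hypothetical, and treating a molecule as flagged when either diagnostic exceeds its threshold (rather than requiring both) is an assumption made here, not something stated in the text. The example values are the T1 and D1 reported later in this paper for YO at 2.6 Å.

```python
# Thresholds quoted in the text; the grouping into two classes follows the paragraph above.
THRESHOLDS = {
    "main_group_or_s_block": {"T1": 0.02, "D1": 0.05},
    "transition_metal": {"T1": 0.05, "D1": 0.15},
}

def flags_multireference(t1, d1, kind="main_group_or_s_block"):
    """Return True if either diagnostic exceeds its threshold (assumed 'either' rule)."""
    limits = THRESHOLDS[kind]
    return t1 > limits["T1"] or d1 > limits["D1"]

# YO at 2.6 A: T1 = 0.088, D1 = 0.175
print(flags_multireference(0.088, 0.175, kind="transition_metal"))
```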

Calculations were performed using GAMESS [13]. Restricted Hartree-Fock (RHF or ROHF) calculations were performed for all of the molecules. Unrestricted HF (UHF) calculations also were performed for O2 and FeO, as UHF can describe multireference character arising from bond breaking, albeit with the disadvantage of producing wave functions that are not spin eigenfunctions. CR-CC(2,3) calculations also were performed to examine the impact of the Hartree-Fock reference on a correlated wave function method. The Sapporo-2012 triple-ζ all-electron basis sets were used for all calculations [29]. The Sapporo-2012 basis sets were chosen because they cover most of the periodic table and are generally more compact than other correlation consistent basis sets. Potential energy curves (PECs) for each diatomic were calculated from about 1.4 Å to about 3.5 Å in 0.1 Å increments; where possible, the increment was then reduced to 0.01 Å and finally to 0.001 Å around the minimum of the curve in order to determine the equilibrium bond length to within 0.001 Å. Restricted open-shell (RO) DFT also was used to calculate PECs for FeO using four popular DFT functionals: B3LYP [30, 31], PBE0 [32, 33], M06 [34], and M11 [35]. All calculations were restricted to C2v symmetry.
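The scan-and-refine procedure for locating the equilibrium bond length can be sketched as follows. A Morse function with made-up parameters stands in for the single point energy calculation; it is not fitted to any molecule in this study, and the function names are hypothetical. Each pass scans a grid, keeps the lowest point, and rescans one step either side of it with a ten-fold smaller increment.

```python
import math

def morse(r, d_e=0.2, a=1.9, r_e=1.616):
    """Hypothetical Morse potential standing in for a single point energy."""
    return d_e * (1.0 - math.exp(-a * (r - r_e))) ** 2 - d_e

def refine_minimum(energy, start=1.4, stop=3.5, steps=(0.1, 0.01, 0.001)):
    """Locate the grid minimum with successively finer increments (in Angstrom)."""
    lo, hi = start, stop
    best = None
    for step in steps:
        n = int(round((hi - lo) / step))
        grid = [lo + i * step for i in range(n + 1)]
        best = min(grid, key=energy)
        # Next pass: one coarse step either side of the current best point.
        lo, hi = max(start, best - step), min(stop, best + step)
    return round(best, 3)

print(refine_minimum(morse))
```

This only works when every point on the grid belongs to the same electronic state; if neighbouring points converge to different states, the "minimum" found this way is an artifact of the state switching.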

When single point energy calculations resulted in an excited state for FeO, the full excited state potential energy curves were constructed using the Maximum Overlap Method (MOM) of Gilbert, Besley, and Gill [2]. In this approach, excited state solutions to the Hartree-Fock equations are determined by populating the orbitals that overlap the most with the previously occupied orbitals in contrast to occupation according to the aufbau principle, in which the lowest energy orbitals are always populated first. Using this approach keeps the wave function from collapsing to the lowest energy solution.
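The difference between aufbau and maximum-overlap occupation can be sketched with a three-orbital toy. The sketch assumes an orthonormal basis (so the overlap matrix is the identity) and ranks the new orbitals by the sum of squared overlaps with the previously occupied orbitals; this is one common variant, and the exact projection formula used in [2] may differ. All coefficients, energies, and function names are hypothetical.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def mom_select(c_old_occ, c_new, n_occ):
    """Indices of the n_occ new orbitals with the largest projection onto the old occupied space."""
    projections = [sum(dot(old, new) ** 2 for old in c_old_occ) for new in c_new]
    ranked = sorted(range(len(c_new)), key=lambda j: projections[j], reverse=True)
    return sorted(ranked[:n_occ])

def aufbau_select(orbital_energies, n_occ):
    """Indices of the n_occ lowest-energy orbitals."""
    ranked = sorted(range(len(orbital_energies)), key=lambda j: orbital_energies[j])
    return sorted(ranked[:n_occ])

# Previous iteration: the first and third basis functions were occupied (an excited configuration).
old_occupied = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

# New orbitals, listed in order of increasing energy; the first two are slightly mixed.
t = 0.1
new_orbitals = [
    [math.cos(t), math.sin(t), 0.0],
    [-math.sin(t), math.cos(t), 0.0],
    [0.0, 0.0, 1.0],
]
new_energies = [-1.0, -0.4, -0.2]

print("aufbau:", aufbau_select(new_energies, 2))
print("MOM:   ", mom_select(old_occupied, new_orbitals, 2))
```

Aufbau occupation would fill the two lowest orbitals and collapse to the lower configuration, whereas the overlap criterion keeps the electrons in the orbitals that resemble the previously occupied ones.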

3 Results and Discussion

The calculated ground state of each molecule, the experimental bond lengths, and the equilibrium bond lengths based on the computed potential energy curves at both the HF and CR-CC(2,3) levels of theory are shown in Table 1. Bond lengths computed using both the Hcore and Hückel guesses are reported, as well as the mean signed deviation (MSD), mean absolute deviation (MAD), and root-mean-square deviation (RMSD).

Table 1 Equilibrium bond distances (Å), determined from experiment (req) and from the potential energy curves (rPEC)

On average, HF underestimates the bond lengths by 0.014 and 0.020 Å, while CR-CC(2,3) overestimates the bond lengths by 0.033 and 0.039 Å for the Hcore and Hückel guesses, respectively, as accounting for electron correlation tends to make the electron density more diffuse [36, 37]. Two notable exceptions to this trend are NaCl and FeO. The HF calculated equilibrium bond length for NaCl is 0.027 Å too long using either guess. In the case of FeO, the HF calculated bond length using the Hcore guess is 0.054 Å too long, while the Hückel guess is unable to produce a smooth curve in the region of equilibrium, so an estimate of the equilibrium bond length cannot be made. The bond length for CaO calculated with CR-CC(2,3) is significantly too long, with a deviation greater than 0.1 Å. Finally, HF produces the same equilibrium bond lengths for each molecule, regardless of initial guess. Overall, the equilibrium bond lengths calculated with CR-CC(2,3) vary between the different initial guesses, as each guess populates the orbitals in a slightly different manner. Generally, the deviation between guesses is not large (e.g. 0.001 Å between the Hcore and Hückel guesses for Cl2). However, in some cases, the deviation can be quite large, as for ZrO with a deviation of 0.023 Å between the two guesses.
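For reference, the three summary statistics can be computed as in the sketch below. It assumes the usual definitions (deviation = calculated − experimental, so a negative MSD means the bond lengths are underestimated on average); the bond lengths in the example are placeholders and are not the values from Table 1.

```python
import math

def deviation_stats(calculated, reference):
    """Return (MSD, MAD, RMSD) of calculated values relative to reference values."""
    devs = [c - r for c, r in zip(calculated, reference)]
    n = len(devs)
    msd = sum(devs) / n
    mad = sum(abs(d) for d in devs) / n
    rmsd = math.sqrt(sum(d * d for d in devs) / n)
    return msd, mad, rmsd

# Placeholder bond lengths in Angstrom (hypothetical, for illustration only).
calculated = [1.0, 2.0, 3.0]
experimental = [1.1, 1.9, 3.2]

msd, mad, rmsd = deviation_stats(calculated, experimental)
print(f"MSD = {msd:.3f}, MAD = {mad:.3f}, RMSD = {rmsd:.3f}")
```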

3.1 Main Group Diatomics

The main group diatomics that were included within this study are O2, F2, Cl2, and Br2. The T1/D1 diagnostic values are shown in Table 2. The halide diatomics are well-behaved systems, in that they have low multireference character and the curves that are calculated are smooth and continuous at all points considered. This is to be expected, given that they are all closed shell singlets. The multireference character of the molecule increases as the bond is stretched, yet ROHF still optimizes to a single state wave function for each molecule. Of the main group diatomics, triplet O2 exhibits the most multireference character throughout the entire calculated PEC. Even at equilibrium, the T1/D1 values for triplet O2 approach the multireference threshold, while the CCSD amplitudes do not even converge further towards dissociation (i.e. 3.5 Å). This is indicative of significant multireference character in this area of the curve due to the breaking of the bonds and demonstrates that the single reference wave function determined with HF is not a suitable reference. The multireference character is manifested by the oscillation between different states throughout the entire curve (see Fig. 1).

Table 2 T1/D1 diagnostics for the main group molecules

Fig. 1 Points calculated for O2 at the ROHF level of theory, using the Hcore and Hückel initial guesses. The point at 2.7 Å using the Hcore guess did not converge

The two initial guesses produce vastly different results at large internuclear distances. Near the equilibrium region, a 3A2 state is found, which is a direct product of B1 and B2 orbitals in the C2v point group. These orbitals correspond to the π* antibonding orbitals that result from the linear combination of 2px atomic orbitals and 2py atomic orbitals, respectively. In the region of intermediate internuclear distances (IID, the area of the potential energy curve between equilibrium and dissociation), the points fall on a curve belonging to a 3A1 state. This state results from the direct product of two singly occupied B1 orbitals corresponding to the π and π* orbitals from the overlap of px atomic orbitals. This gives an overall bond order of 1, which is clearly on a different PEC than the ground state. As the internuclear distance grows, the curve resulting from the Hückel initial guess begins to oscillate drastically between different states. The upper state produced by the Hückel guess leads to homolytic dissociation, while the lower state leads to heterolytic dissociation.

The triplet O2 PEC also was calculated via UHF, as the unrestricted formalism should be able to better describe the bond breaking region of the curve (see Fig. 2).

Fig. 2 Points calculated for O2 at the UHF level of theory, using the Hcore and Hückel initial guesses. The difference between the calculated <S2> and the <S2> for the pure spin state (<S2>calc − <S2>exact) is inset

It is important to note that the ROHF wave function is an eigenfunction of the S2 operator while the UHF wave function is not [38]. This means that UHF suffers from spin contamination, as contributions from higher spin states are included in the wave function. The <S2>calc expectation value is used as a measure of the spin contamination within an unrestricted calculation. The <S2>exact for a spin pure state is S(S + 1), where S is the total spin quantum number. As increasingly higher spin states are mixed in, <S2>calc increases. The values of <S2>calc − <S2>exact for each curve are compiled in Table 3.
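The reference value for a spin pure state follows directly from the multiplicity, 2S + 1. A minimal sketch, with hypothetical function names and a made-up <S2>calc value in the last line:

```python
def exact_s2(multiplicity):
    """<S2> for a pure spin state: S(S + 1) with S = (multiplicity - 1) / 2."""
    s = (multiplicity - 1) / 2.0
    return s * (s + 1.0)

def spin_contamination(s2_calc, multiplicity):
    """Deviation <S2>calc - <S2>exact; zero for a spin-pure wave function."""
    return s2_calc - exact_s2(multiplicity)

print(exact_s2(3))  # triplet O2, S = 1 -> 2.0
print(exact_s2(5))  # quintet FeO, S = 2 -> 6.0
print(round(spin_contamination(2.05, 3), 3))  # 2.05 is a made-up <S2>calc value
```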

Table 3 <S2>calc − <S2>exact at 0.1 Å increments along the UHF curve for triplet O2

The UHF curve calculated from the Hückel guess is one smooth, continuous curve in the 3A1 state. <S2>calc indicates that, although there is slight mixing of higher spin states, particularly in the IID region, the same state is found throughout the entire curve. The Hcore guess, however, does not remain in one state throughout the curve. At 2.3 Å, a lower energy state is found, with an even lower energy state appearing at 2.8 Å. While each state has 3A1 symmetry, the <S2>calc indicates increasing spin contamination.

3.2 S-Block Diatomics

The s-block diatomics included in this study are LiF, NaCl, CaO, and MgO. The T1/D1 diagnostic values are listed in Table 4.

Table 4 T1/D1 diagnostics for the s-block-containing molecules

As expected, the multireference character increases as the bond breaks when the internuclear separation increases. LiF and NaCl are well-behaved with both Hcore and Hückel guesses producing the same smooth PECs. MgO and CaO both exhibit some multireference character throughout the entire potential energy curve as indicated by the T1/D1 diagnostics, even near the equilibrium bond length. However, with the exception of the large internuclear distances for MgO (see Fig. 3), ROHF optimizes to a single state wave function for each of these molecules.

Fig. 3 Points calculated for MgO at the ROHF level of theory, using the Hcore and Hückel initial guesses

Although MgO is significantly multireference throughout the entire curve with T1 and D1 values greater than 0.02 and 0.05, respectively, ROHF produces a smooth, continuous curve describing both the equilibrium region and most of the non-equilibrium region quite well. There is only a small discrepancy between the results from the two initial guesses as the bond length is increased.

3.3 Transition Metal Diatomics

The transition metal diatomics in the test set are ScO, FeO, TiO, YO, and ZrO. The T1/D1 diagnostic values are listed in Table 5. All of the molecules exhibit some multireference character throughout the PECs, although with the exception of FeO, all of the molecules have low diagnostic values around the minimum of the curve. As will become evident throughout the discussion of transition metal diatomics, the possible multireference character and the initial guess of the bond length can affect the convergence of geometry optimizations.

Table 5 T1/D1 diagnostics for the transition metal-containing molecules

3.3.1 ROHF Results

Both ScO and YO are doublets with one unpaired electron (2A1 ground states). For these diatomics, the calculated PECs are well-behaved around the minimum of the curves (see Figs. 4 and 5, respectively).

Fig. 4 Points calculated for ScO at the ROHF level of theory, using both the Hcore and Hückel initial guesses

Fig. 5 Points calculated for YO at the ROHF level of theory, using both the Hcore and Hückel initial guesses

Problems arise as the bond distance increases to the intermediate internuclear distances (IID) between equilibrium and dissociation. While the lowest energy state for ScO is still a 2A1 state, it is 21.4 kcal mol−1 lower in energy than the rest of the curve (see Fig. 4). The PECs for YO exhibit state switching, beginning at 2.6 Å (see Fig. 5). This is the area of the PEC that has the highest multireference character, with a T1 value of 0.088 and a D1 value of 0.175, and the multireference character manifests itself in the inability of ROHF calculations to find one ground state at each point of the curve.

The initial guess for the wave function, either the Hcore or the Hückel guess, had little impact on the smoothness (or lack thereof) of the PECs for ScO. The Hückel guess for YO was able to determine a single state as the bond length was increased, whereas the Hcore guess produced oscillations between two distinct states near dissociation.

The ground state for ZrO is 3A1 (see Fig. 6) along with a degenerate 3A2 state, yet at 2.6 Å, the lowest energy state is found to be 3B2 when the Hcore initial guess is used. When using the Hückel guess, at 2.4 Å, the PEC jumps to a 3B1 state, which is the direct product of A1 and B1 singly occupied orbitals. Then at 2.6 Å the calculation settles to a 3A1 state. However, this A1 state results from the direct product of two B1 orbitals instead of two A1 orbitals, as with the ground state around the minimum of the curve. This is a direct result of different d-orbital occupation. TiO is also a triplet and the PECs show that the initial guess for the wave function can have an effect as well (see Fig. 7).

Fig. 6 Points calculated for ZrO at the ROHF level of theory, using both the Hcore and Hückel initial guesses

Fig. 7 Points calculated for TiO at the ROHF level of theory, using the Hcore and Hückel initial guesses. The points between 2.1 and 2.3 Å on the Hückel curve do not converge

With either initial guess, the ground state near equilibrium is 3A1, which results from the direct product of two A1 orbitals (an s and a dx2 orbital) localized on Ti. Yet at 2.1 Å, the Hcore guess finds the lowest energy state for TiO to be 3B1, resulting from the combination of a px orbital of B1 symmetry localized on O and an s orbital of A1 symmetry localized on Ti. The tail of the PEC oscillates between the 3A1 and 3B1 states. As is shown in Fig. 7, when using the Hcore initial guess, it is not obvious that the wave function has changed states. The Hückel guess, however, falls into the higher energy 3B1 state around the minimum of the curve, which can be problematic for gradient-driven optimizations. Even given an initial geometry close to the equilibrium bond length, HF could still optimize to an excited state and thus to an incorrect geometry, especially if the ground and excited state PECs are not parallel. In this case the two different initial guesses produce PECs with two different equilibrium bond lengths, 1.595 Å for the Hcore guess and 1.600 Å for the Hückel guess, indicating that the curves resulting from the two guesses are not parallel. While this is only a 0.005 Å difference in the calculated bond length, there could be a greater difference for other molecules. Fortuitously, the initial guesses had no effect on the calculated bond length for the rest of the molecules investigated in the present work.

FeO is an open shell quintet. Unsurprisingly, the T1/D1 diagnostics indicate that FeO exhibits multireference character at all internuclear distances. Notably, the area that shows the least multireference character according to the T1/D1 diagnostic is at 2.4 Å, firmly in the IID region of the curve, an area that is typically the most multireference due to bond breaking. The ROHF reference curves using the Hcore and Hückel initial guesses demonstrate this quite clearly (see Fig. 8).

Fig. 8 Points calculated for FeO at the ROHF level of theory, using the Hcore and Hückel initial guesses. The points at 3.4 Å on the Hcore curve and 3.2 Å on the Hückel curve do not converge

The Hcore guess is able to produce a smooth curve close to the minimum, yet an excited state is found at distances just short of equilibrium. The curve also quickly degenerates into several different states in the IID region. The Hückel guess, however, is unable to produce a smooth curve around the minimum. The lowest energy state between 1.4 and 1.66 Å is 5A2, while the lowest energy state from 1.67 Å onward is 5A1. The orbital occupation for the 5A2 state has some mixing between the Fe 3d and O 2p orbitals, with the 3d and 4p orbitals singly occupied, while only the 3d orbitals are singly occupied in the 5A1 state.

An attempt was made to calculate the separate ROHF curves for FeO using the Maximum Overlap Method (MOM) approach [2] beginning from an Hcore initial guess, yet even this was unable to isolate the different states towards dissociation (see Fig. 9). Around equilibrium, two distinct states are produced, yet the IID region shows several different states. The different curves have different equilibrium bond lengths as well. The 5A1 ground state produces an equilibrium bond length of 1.68 Å, while the higher 5A2 state has a bond length of 1.79 Å.

Fig. 9 Ground state (5A1) and excited state (5A2) curves for FeO, using the maximum overlap method (MOM), calculated at the ROHF level of theory. The optimized orbitals from a point on the excited state curve were used as the initial guess for subsequent points. Orbital rotations were restricted to attempt to stay in the same electronic state. The curves are smooth and continuous around the minima, yet are unable to stay on the same curve as the bond length increases. These curves also demonstrate that the ground state and excited state curves are not always parallel, as the equilibrium bond distance for the ground state is 1.680 Å and for the excited state is 1.790 Å

The inability to produce a smooth potential energy curve is a basis set independent phenomenon. The Dunning-style correlation consistent polarized valence triple-ζ basis sets [28, 29] were also utilized with ROHF to calculate the potential energy curves for FeO as shown in Fig. 10.

Fig. 10 Points calculated for FeO at the ROHF level of theory, using the Hcore and Hückel initial guesses and cc-pVTZ basis sets. The points at 2.4 and 3.1 Å on the Hcore curve and 2.5, 2.9, 3.2, and 3.5 Å on the Hückel curve do not converge

As ROHF tends to enforce heterolytic dissociation, UHF was also utilized to calculate points on the potential energy curve for FeO (see Fig. 11). The Hückel initial guess identifies the ground state as 5A1. Notably, the curve is much smoother around the minimum, which is the region that ROHF was unable to describe. At 2.0 Å, the calculation converges to an excited 5A1 state. The Hcore initial guess in conjunction with UHF is unable to produce a smooth curve at equilibrium. There is oscillation between the 5B1 ground state and an excited 5B1 that is noticeably not parallel to the ground state curve. This demonstrates that even a reasonable initial guess of the geometry for FeO can result in the incorrect state and a different equilibrium bond length.

Fig. 11 Points calculated for FeO at the UHF level of theory, using the Hcore and Hückel initial guesses. The difference between the calculated <S2> and the <S2> for the pure spin state (<S2>calc − <S2>exact) is inset

The spin contamination for the points on the calculated PEC is shown in the inset graph of Fig. 11 and in Table 6. When the calculation converges to an excited state curve, there is an accompanying increase in the <S2> value.

Table 6 <S2>calc − <S2>exact at 0.1 Å increments along the UHF curve for FeO

It is clear that calculation of the FeO potential energy curve will benefit from a multireference treatment. This is a result of near degeneracies arising from the 4s and 3d orbitals in addition to the breaking of the Fe–O bond at long internuclear distance. Thus, any active space employed must account for both sources of multireference character. A sensible active space would then include the Fe 4s and 3d orbitals and the O 2p orbital set. While only two O 2p orbitals may take part in bonding to Fe, the full set should initially be included due to the degeneracy of the O 2p orbital set. Given this initial (12,9) active space, it may be possible to truncate the active space if it can be shown that the occupation of any orbital remains constant (doubly occupied or unoccupied) for all internuclear distances. It is known that the ground state term of FeO is 5∆ with A2 symmetry [21], while there also is a known low-lying 5Σ+ state with A1 symmetry [39, 40]. If it is assumed that FeO exists as Fe2+O2−, then Fe exists in the 5D state with a valence configuration of 4s⁰3d⁶. Thus, it may be possible to remove the 4s orbital from the active space if it is shown to have negligible occupation at all internuclear distances, i.e. Fe is Fe(II) at all internuclear distances. This would allow for a reduced active space of (12,8). The smooth potential energy curve with this active space is shown in Fig. 12.
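The (12,9) and (12,8) labels follow from a simple tally of electrons and orbitals. The sketch below assumes the ionic Fe2+O2− picture described above, with each shell written as a pair (number of spatial orbitals, number of electrons); the variable and function names are hypothetical.

```python
# (spatial orbitals, electrons) per shell in the assumed Fe2+ O2- picture.
FE_4S = (1, 0)  # empty in Fe2+
FE_3D = (5, 6)  # 3d6
O_2P = (3, 6)   # filled 2p shell in O2-

def active_space(*shells):
    """Return (n_electrons, n_orbitals) for a set of shells."""
    n_orbitals = sum(orbitals for orbitals, _ in shells)
    n_electrons = sum(electrons for _, electrons in shells)
    return n_electrons, n_orbitals

print(active_space(FE_4S, FE_3D, O_2P))  # initial active space
print(active_space(FE_3D, O_2P))         # 4s orbital removed
```

Dropping the 4s orbital removes an orbital but no electrons, which is only justified if its occupation is negligible along the whole curve.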

Fig. 12

Points calculated for FeO at the CASSCF level of theory, using a (12,8) active space

3.3.2 CR-CC(2,3) Results

If the ROHF wave function providing the reference for correlated calculations is not correct, then the correlated calculations may be incorrect as well. The exceptions are full CI calculations, and coupled cluster or many-body perturbation theory calculations that converge to full CI in the full expansion. These results are independent of the reference wave function, although a good reference will speed up convergence. The sensitivity of truncated correlated methods to the reference is demonstrated by the PECs for O2, YO, and FeO (see Figs. 13, 14 and 15). The PECs for O2 show the same oscillating behavior between the homolytic and heterolytic dissociation states, yet the Hcore guess produces only the heterolytic curve (see Fig. 13). The ROHF PEC for YO using the Hcore initial guess (Fig. 5) has some oscillation between states beginning in the IID region of the PEC, extending out towards dissociation. This same problem is evident for CR-CC(2,3) as shown in Fig. 14. The calculations based on the Hückel guess result in a smoother curve, although still not in one single state, as evidenced by the small “kink” in the curve at 2.7 Å.

Fig. 13

Points calculated for O2 at the CR-CC(2,3) level of theory, using the Hcore and Hückel initial guesses. The point at 2.7 Å using the Hcore guess did not converge

Fig. 14

Points calculated for YO at the CR-CC(2,3) level of theory, using the Hcore and Hückel initial guesses

Fig. 15

Points calculated for FeO at the CR-CC(2,3) level of theory, using the Hcore and Hückel initial guesses. The point at 2.6 Å using the Hcore guess and the points at 3.2, 3.4, and 3.5 Å using the Hückel guess do not converge

The PECs for FeO exhibit much less oscillation between two different states, yet the curve resulting from the Hückel guess has a discontinuity between 1.7 and 1.8 Å (see Fig. 15). From 1.4 to 1.7 Å, the curve calculated using the Hückel guess follows the higher energy 5A2 state, and then at 1.8 Å shifts to the lower energy 5A1 state.

3.3.3 DFT Results

PECs for FeO were calculated using RODFT with the B3LYP, PBE0, M06, and M11 functionals (Figs. 16, 17, 18 and 19, respectively). Using B3LYP and the Hcore initial guess, a smooth curve is computed for the region between 1.4 and 2.2 Å, with a ground electronic state of 5A1 resulting from the direct product of an A1 orbital, an A2 orbital, a B1 orbital, and a B2 orbital, all of which are d orbitals on iron. However, B3LYP calculations are unable to converge to any state after 2.2 Å (Fig. 16). The Hückel guess produces two electronic states close to equilibrium. The ground state is a 5A1 state that results from single occupation of an A1 orbital, an A2 orbital, a B1 orbital, and a B2 orbital, all of which are also iron d orbitals. The other electronic state is found at 1.7 Å, which is a 5A2 state that results from a combination of an s orbital and a dz² orbital localized on iron with A1 symmetry, a B2 orbital, and a B1 orbital, both of which are d orbitals. The 5A2 excited state is 83 kcal mol−1 higher than the ground state. Beyond 2.0 Å, the calculations that utilize the Hückel guess do not converge.
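The state symmetries quoted above follow from the direct product of the irreducible representations of the singly occupied orbitals in C2v. The Python sketch below reproduces that assignment from the standard C2v character table; it assumes that all doubly occupied orbitals contribute only the totally symmetric A1 representation, and the function name is illustrative.

```python
from functools import reduce

# C2v character table; columns are E, C2, sigma_v(xz), sigma_v'(yz).
C2V = {
    "A1": (1, 1, 1, 1),
    "A2": (1, 1, -1, -1),
    "B1": (1, -1, 1, -1),
    "B2": (1, -1, -1, 1),
}
# Reverse lookup from a character tuple to its irrep label.
LABEL = {chars: name for name, chars in C2V.items()}


def direct_product(*irreps):
    """Return the C2v irrep obtained by multiplying characters elementwise."""
    chars = reduce(
        lambda a, b: tuple(x * y for x, y in zip(a, b)),
        (C2V[name] for name in irreps),
    )
    return LABEL[chars]


# Singly occupied orbitals of the states described in the text.
print(direct_product("A1", "A2", "B1", "B2"))  # A1
print(direct_product("A1", "A1", "B2", "B1"))  # A2
```

Because every irreducible representation of C2v is one-dimensional, the product of any set of singly occupied orbitals is a single irrep, so inspecting the symmetry labels of the open-shell orbitals is enough to identify which state a calculation has converged to.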

Fig. 16

Points calculated for FeO at the B3LYP level of theory, using the Hcore and Hückel initial guesses. The calculations do not converge past 2.0 Å using the Hückel guess or 2.2 Å using the Hcore guess

Fig. 17

Points calculated for FeO at the PBE0 level of theory, using the Hcore and Hückel initial guesses. The calculations do not converge past 2.0 Å using the Hückel guess or 2.1 Å using the Hcore guess

Fig. 18

Points calculated for FeO at the M06 level of theory, using the Hcore initial guess. The calculations do not converge past 2.2 Å using the Hcore guess. Only four points on the curve converged using the Hückel guess

Fig. 19

Points calculated for FeO at the M11 level of theory, using the Hcore and Hückel initial guesses

PBE0 does not perform much better than B3LYP (Fig. 17). Around the minimum of the curve, the Hcore guess produces a smooth curve corresponding to a 5A2 state. This state is a combination of a B1 orbital, a B2 orbital, and two A1 orbitals, all of which are d orbitals. At 2.1 Å the calculation optimizes to an excited 5A1 state, which results from a combination of two d orbitals localized on iron with B1 and B2 symmetry, and the 2px and 2py orbitals, with B1 and B2 symmetry respectively, localized on oxygen. Past 2.1 Å with the Hcore guess and 2.0 Å with the Hückel guess, PBE0 is also unable to converge to any state. The Hückel guess once again lands in an excited state at 1.7 Å that is 86 kcal mol−1 above the ground state. The ground state is a 5A1 state and the excited state found at 1.7 Å is a 5A2 state. The ground 5A1 state is a combination of an A2 orbital, an A1 orbital, a B1 orbital, and a B2 orbital, all of which are d orbitals on iron. The excited 5A2 state results from the occupation of two A1 orbitals, a B2 orbital, and a B1 orbital, similar to the population of the excited state calculated with B3LYP.

Although the M06 functional was fairly broadly parameterized in the course of its construction, problems still arise when calculating a potential energy curve for FeO (Fig. 18). Using the Hcore initial guess, only the points between 1.7 and 2.0 Å converged successfully. From this point, however, the optimized orbitals from a converged point were used as the initial guess for the next point on the curve with orbital rotations restricted. Similar to the MOM method, this does ensure that the calculated curve is constrained to one electronic state. However, with the exception of a converged point at 2.6 Å, the points for the remainder of the curve past 2.2 Å do not converge. Using the Hückel initial guess results in only two points that converged, at 1.8 and 2.0 Å. With the optimized orbitals from these two points, points at 1.7 and 1.9 Å were converged. For the remainder of the curve, however, even utilizing the optimized orbitals as an initial guess did not provide a good enough starting guess for the calculations to converge.

The M11 functional performed by far the best of the tested functionals in calculating the potential energy curve for FeO (Fig. 19). Using the Hcore initial guess, a curve was produced that was mostly smooth around the minimum, with the exception of a point at 1.55 Å. The calculated electronic state is 5A1, a state that results from a combination of A2, A1, B2, and B1 orbitals, all of which are iron d orbitals. The Hcore initial guess was sufficient for the points to converge from 1.4 to 1.64 Å. At 1.65 Å, the optimized orbitals were used for the starting guess, but it was not necessary to restrict orbital rotations to reach convergence until points in the IID region of the curve (specifically, 2.2, 2.3, 2.5, 3.2, and 3.3 Å). The point at 2.5 Å is a 5A2 electronic state, from the occupation of a B1, a B2, and two A1 orbitals, all of which are still iron d orbitals. At 2.6 Å, there is a discontinuity in the curve and a lower energy electronic state is followed for the remainder of the curve. This electronic state is 5B2, from the occupation of a B1 orbital that is a combination of an iron d orbital and an oxygen p orbital, and A1, B2, and B1 orbitals, all of which are iron d orbitals. This is a limitation inherent in using a restricted open shell formulation to describe dissociation.

The Hückel initial guess is sufficient for points to converge from 1.4 to 2.1 Å; however, the curve is not continuous, especially beginning at 2.0 Å. From 2.2 to 2.5 Å, the optimized orbitals had to provide the initial guess in order for the calculation to converge, and from 2.6 Å orbital rotations had to be restricted as well. The calculated electronic ground state at the minimum is a 5A1 state. The occupation is in iron d orbitals of symmetry A2, A1, B2, and B1. At 2.0 Å, the electronic state changes to 5B1 (the direct product of B2, A1, A2, and A1 orbitals), with an oxygen px orbital included within the singly occupied orbitals. The excited state at 2.2 Å results from the occupation of an iron s orbital as well as an oxygen px orbital and is a 5A1 state. At 3.0 Å, the electronic state is still a 5A1 state, yet this is a result of a combination of oxygen px and py orbitals being populated. The potential energy curve generated by the Hückel guess demonstrates that, although the calculations converged to an answer fairly easily, inspection of the singly occupied orbitals shows that an excited state has been determined to be the ground state.

In addition to determining the equilibrium bond length from the minimum of the calculated potential energy curve, gradient-driven geometry optimizations were performed using the M11 functional as well, beginning with both the Hcore and Hückel initial guesses. The optimized bond length determined using the Hcore initial guess was 1.607 Å, the same as the bond length determined from the minimum of the potential energy curve. The optimized bond length resulting from the Hückel initial guess was 1.643 Å, 0.036 Å longer than the bond length at the bottom of the potential energy curve (also 1.607 Å). The experimental bond length is 1.626 Å, as seen in Table 1. The Hcore-optimized bond length and the PEC-determined bond lengths are all 0.019 Å too short, while the optimized bond length from the Hückel guess is 0.017 Å too long.
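The deviations quoted above follow directly from the reported bond lengths, as the short Python check below shows. The bond lengths are those given in the text; the variable and function names are illustrative.

```python
EXPERIMENT = 1.626  # Å, experimental FeO bond length quoted in the text

# M11 bond lengths reported in the text, in Å.
bond_lengths = {
    "PEC minimum": 1.607,
    "Hcore optimization": 1.607,
    "Hückel optimization": 1.643,
}


def deviation(r_calc, r_ref=EXPERIMENT):
    """Return the signed difference r_calc - r_ref, rounded to 0.001 Å."""
    return round(r_calc - r_ref, 3)


for label, r in bond_lengths.items():
    print(f"{label}: {deviation(r):+.3f} Å relative to experiment")

# Difference between the two gradient-driven optimizations.
print(deviation(bond_lengths["Hückel optimization"],
                bond_lengths["Hcore optimization"]))
```

The two guesses therefore bracket the experimental value, even though the structure obtained from the Hückel guess belongs to a different electronic state.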

These calculated curves demonstrate that DFT is not immune to the excited state optimization problems either. Using a combination of DFT and the Hcore initial guess, along with an initial bond length close to equilibrium, a correct bond distance can be computed. However, using the Hückel guess can result in optimization to the wrong state, and neither initial guess is able to produce good initial orbitals at large internuclear distances.

4 Conclusion

Potential energy curves for a set of diatomic molecules were calculated using Hartree-Fock, CR-CC(2,3), and DFT with the B3LYP and PBE0 functionals. For some systems, HF erroneously converged to an excited state instead of the ground state of the molecule. Inspection of the optimized orbitals is imperative to determine that the calculation has converged to the intended state. While optimization techniques such as the Newton-Raphson method will converge to a local minimum with respect to the orbital coefficients, the initial orbital guess must be correctly populated as the Newton-Raphson method finds the nearest minimum rather than the global minimum. The initial guess can be generated through several commonly used options within computational chemistry software packages, but care must still be taken that the correct orbitals are being populated. This requires analysis of the converged orbitals.
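The dependence of a Newton-Raphson search on its starting point can be illustrated with a one-dimensional toy model. In the Python sketch below, the asymmetric double-well function is a hypothetical stand-in for the SCF energy as a function of a single orbital rotation parameter; it is not derived from any of the systems studied here.

```python
def energy(x):
    """Toy asymmetric double well with two minima of different depth."""
    return x**4 - 2 * x**2 + 0.5 * x


def gradient(x):
    return 4 * x**3 - 4 * x + 0.5


def hessian(x):
    return 12 * x**2 - 4


def newton_raphson(x0, tol=1e-12, max_iter=50):
    """Iterate x -> x - g/H until the step is below tol; return the final x."""
    x = x0
    for _ in range(max_iter):
        step = gradient(x) / hessian(x)
        x -= step
        if abs(step) < tol:
            break
    return x


# Two starting guesses, one on each side of the barrier.
for guess in (-1.2, 1.2):
    x_min = newton_raphson(guess)
    print(f"guess {guess:+.1f} -> x = {x_min:+.4f}, E = {energy(x_min):+.4f}")
```

Both runs converge cleanly, but only the guess on the negative side reaches the deeper well; the other stops in the higher-lying minimum with no indication from the convergence criterion itself that a lower solution exists.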

In situations that contain parallel potential energy curves, convergence to an excited state during a geometry optimization will not impact the optimized geometry. Not every system has parallel potential energy curves, however, and large deviations in the optimized geometries can be observed.

Even in cases of relatively small multireference character, such as for TiO, HF can have difficulty converging to a single state. For molecules with significant multireference character such as FeO, it is clearly more reasonable to use multireference methods of calculation. However, the SCF wave function provides a basis for a multireference calculation as well, and the convergence rate of the multireference calculation may depend on the quality of the SCF orbitals used. For this reason, it is of utmost importance to review the optimized wave function and ensure that the calculation has converged to the correct state.

It is notable that all molecules examined here possess a high degree of spatial symmetry. Issues with optimization to any of the multiple low-lying excited states may be less severe if less symmetry or no symmetry were present, as this lifts many of the degeneracies that exacerbate the problem of multiple stable solutions to the Hartree-Fock equations (i.e. Löwdin’s symmetry dilemma) [41].