Abstract
Checking that a scalar potential is bounded from below (BFB) is an ubiquitous and notoriously difficult task in many models with extended scalar sectors. Exact analytic BFB conditions are known only in simple cases. In this work, we present a novel approach to algorithmically establish the BFB conditions for any polynomial scalar potential. The method relies on elements of multivariate algebra, in particular, on resultants and on the spectral theory of tensors, which is being developed by the mathematical community. We give first a pedagogical introduction to this approach, illustrate it with elementary examples, and then present the working Mathematica implementation publicly available at GitHub. Due to the rapidly increasing complexity of the problem, we have not yet produced ready-to-use analytical BFB conditions for new multi-scalar cases. But we are confident that the present implementation can be dramatically improved and may eventually lead to such results.
Similar content being viewed by others
1 Introduction
1.1 The problem
Dealing with scalar potentials is one of the ubiquitous tasks one faces when building models beyond the Standard Model (SM). Since the discovery of the Higgs boson in 2012 [1, 2], we know that the Higgs mechanism, in some form, is at work. What we do not know is whether it is as minimal as in the SM or if the SM-like 125GeV Higgs boson is the tip of the iceberg of a sophisticated scalar sector [3].
When working with multiple interacting scalar fields, one usually builds a scalar potential and then finds its minimum to determine the vacuum expectation value configuration. Before minimizing the potential, it has to be made sure that a global minimum exists in the first place. Thus, one must verify that the potential is bounded from below (BFB).Footnote 1
At tree level, the scalar potential is written as a polynomial in scalar fields. If one keeps the scalar interactions renormalizable, the polynomial degree of the potential is four. By denoting the real scalar fields generically as \(\phi _i\), \(i = 1, \dots , n\), one can represent such a scalar potential as
where \(V_0\) includes all lower-degree monomials and a summation over repeated indices is assumed. At large quasiclassical values of the scalar fields, the quartic term dominates over the lower-degree terms. Therefore, the condition for the potential V to be bounded from below in the strong sense is equivalent to the requirement that
Since the scalar potential depends on several free parameters, which we collectively denote \(\{\Lambda _a\}\), the BFB condition (2) carves out a region in the \(\{\Lambda _a\}\)-space. If one wishes to build a model based on the potential, one must make sure the selected parameters correspond to a point inside it. Thus, the task is to efficiently describe this region, preferably in terms of inequalities on the parameters \(\{\Lambda _a\}\).
It is this task, in the general setting, that we want to attack in this work in an algorithmic fashion.
Before we move on, let us make a few clarifying comments. First, a potential can be bounded from below even if there exist some flat directions of the quartic potential, that is, subspaces of \(\mathbb {R}^n\) in which the quartic term in (2) is exactly zero. In this case, one needs to require that, within these subspaces, the lower-degree terms in the scalar potential grow and not decrease at large values of the fields. This situation was called in [4] stability in the weak sense. Geometrically, it corresponds to the boundary of the BFB region in the \(\{\Lambda _a\}\)-space, which we have just described. The solution to the BFB problem in the strong sense, Eq. (2), is a prerequisite to establishing stability in the weak sense. Therefore, from now on, we focus only on the BFB problem in the strong sense.
Second, one can distinguish necessary BFB conditions and sufficient BFB conditions. Necessary BFB conditions are the ones, which are truly unavoidable: their violation immediately drives the potential to be unbounded from below. However, satisfying a set of necessary conditions does not automatically imply that the potential is BFB: the necessary conditions may be too weak for that. Conversely, sufficient BFB conditions are safe: if a parameter set satisfies them, the potential is guaranteed to be BFB. However, they may be overly restrictive: not satisfying a set of sufficient conditions does not automatically rule out a given parameter set. So, although a set of sufficient BFB conditions may be easy to establish and implement in numerical scans, it will miss potentially interesting parts of the available parameter space.
What we are looking for is a set of BFB conditions which are, simultaneously, necessary and sufficient. They are more difficult to establish than just a set of necessary and a different set of sufficient conditions, but they incorporate the full information on the allowed parameter space in a given class of models.
Third, in quantum fields theory, quantum corrections can destabilize a potential that would be stable in the classical approximation. Finding the quantum corrections to the classical potential and checking their effect on stability is a separate issue, which we do not address in this work. We stress, however, that it is an important problem for various popular multi-Higgs extensions of the Standard Model [5,6,7] and that novel elaborate methods are being proposed to address it in generic settings [8]. Fortunately, in many cases, the main effect of quantum corrections can be absorbed into running parameters of the renormalization-group-improved potential \(\{\Lambda _a\}\) without changing the polynomial structure of the potential. In these cases, the mathematical task of establishing the BFB conditions remains unchanged.
1.2 Overview of the approaches to BFB conditions
Establishing the necessary and sufficient BFB conditions is a technical, but notoriously difficult problem in any sophisticated multi-scalar theory. There is no general, ready-to-use solution to this problem, and various approaches have been proposed for particular scalar sectors. Although our work does not rely on them, we find it instructive to give a brief overview of these approaches. We will explicitly give the potentials and denote their coefficients by \(\Lambda _a\) instead of the more traditional notation \(\lambda _a\) because \(\lambda \)’s will be reserved for the eigenvalues in the remainder of the text.
In models with few degrees of freedom or few interaction terms, the exact BFB conditions can be established with straightforward algebra. The convenient approach is to split the degrees of freedom in the scalar field space into “radial” and “angular” ones, factor out the radial dependence of the quartic potential and explore the full domain of the angular coordinates. For instance, if the quartic scalar potential depends on two fields \(\phi _{1}\) and \(\phi _{2}\), irrespective of their gauge quantum numbers, via the portal-type coupling
then one can parametrize \(|\phi _{1}|^2 = r\cos \theta \), \(|\phi _{2}|^2 = r\sin \theta \), with \(0 \le \theta \le \pi /2\), and rewrite the potential as
which must be positive definite for all values of \(\theta \). Since the angular dependence can be written via the sine and cosine of the single angle \(2\theta \), this requirement immediately leads to \(\Lambda _{1} > 0\), \(\Lambda _{2} > 0\), and \(\Lambda _{3} + 2\sqrt{\Lambda _{1} \Lambda _{2}} > 0\).
This approach was used, for example, back in 1978 [9] to establish the BFB conditions for the two-Higgs-doublet model (2HDM) with unbroken \(\mathbb {Z}_{2}\) symmetry, which was later dubbed the Inert Doublet Model (IDM). This model uses two electroweak Higgs doublets \(\phi _{1}\) and \(\phi _{2}\), and its quartic potential has five terms:
with all parameters being real. The BFB conditions are
In the most general 2HDM, which includes such interaction terms as \((\phi _{1}^\dagger \phi _{2})(\phi _{1}^\dagger \phi _{1})\), this method runs into the difficulty of dealing with several competing angular functions of different periods. It was only after the 2HDM potential was rewritten in the space of gauge-invariant bilinears [4, 10,11,12], that the BFB conditions could be established. They were first presented in the form of an algebraic algorithm [4] and later written in compact closed form [13] as inequalities imposed not on the parameters \(\Lambda _a\) themselves but on four eigenvalues \(\hat{\Lambda }_i\) of a real symmetric \(4\times 4\) matrix \(\Lambda _{ij}\) which encodes all quartic interaction terms. The form of these conditions is very simple and basis-invariant,
but checking them within a specific 2HDM requires first finding these eigenvalues, though this step can be easily implemented in numerical scans of the parameter space.
A somewhat similar systematic method of deriving the exact BFB conditions exists for models, in which the Higgs potential can be written in terms of independent positive-definite field bilinears \(r_i\). In this case, the quartic potential can again be rewritten as a quadratic form \(V = \Lambda _{ij} r_i r_j\), but its positive definiteness must be insured only in the first orthant \(r_i \ge 0\). These conditions are known as copositivity (conditional positivity) criteria. They were developed in [14,15,16] and applied to such cases as some 2HDMs, singlet-doublet models, models with \(\mathbb {Z}_{3}\) symmetric scalar dark matter, and left-right symmetric models.
Beyond two Higgs doublets, in the general N-Higgs-doublet model (NHDM), the exact BFB conditions in closed form are still not known. Several attempts to attack the problem with the bilinear space formalism [4, 17,18,19,20] did not culminate in a closed set of inequalities. The technical challenge is that, with N Higgs doublets, the space of bilinears \(r_a\), \(a = 1, \dots , N^2-1\), does not span the entire \(\mathbb {R}^{N^2-1}\) space but only a lower-dimensional algebraic manifold, which is described with a series of polynomial constraints. Positive-definiteness of a quadratic form on a complicated algebraic manifold cannot be decided with linear algebra and requires algebraic-geometric tools, that have not been found yet.
For larger gauge symmetries and for scalars in higher-dimensional representations, it is appropriate to analyze the scalar potential not in the scalar fields space but in the space of gauge orbits. This approach flourished in 1980’s with the advent of Grand Unification models, see, for example [21,22,23,24], and a short historical overview in [16].
In specific multi-Higgs models, in which large continuous or discrete symmetry groups dramatically simplify the potential, the exact conditions can be established [25,26,27,28,29,30]. We mention, in particular, the method developed in [28, 30] to rewrite the Higgs potential as a linear combination of new variables, the group-invariant quartic field combinations, and to determine the exact shape of the space spanned by these variables. This method is similar to the so-called linear programing, and it gives the BFB constraints directly from the description of the shape of the space available.
In certain cases, when the exact necessary and sufficient conditions are not known but a parameter scan still needs to be performed, it may be enough to write down a set of sufficient conditions. They may be overly restrictive, but if a point satisfies them, the potential is guaranteed to be positive-definite. An example of such conditions was given for a specific 3HDM in [31]. The idea is to pick up all terms with “angular” dependence in the scalar field space and find a lower bound for each term separately. For example, if the potential contains a term \((\phi _{1}^\dagger \phi _{2})(\phi _{1}^\dagger \phi _{3})\) with real coefficient \(\Lambda \), one can place the following lower boundary on it in the \(r_i \equiv |\phi _i|^2\) space:
In this way, the original potential V can be limited from below by another potential \(\tilde{V}\), which is a quadratic form in terms of \(r_i\) and for which the copositivity criteria are applicable.
In this work we present an algorithm, which in principle solves the problem in a generic setting. The algorithm uses elements of the theory of resultants and of the recently developed spectral theory of tensors. However, solving the problem in principle is quite different from solving it in practice. To our best knowledge, the approach was only briefly mentioned in [16] but was not developed any further nor implemented in any code. We have implemented the method in a computer-algebra code, which is available at GitHub [32], and tested it in cases, in which analytical solutions already exist. The complexity of the algorithm implementation grows so fast that, with limited computer resources, we could not apply it to cases where the results are not yet known.
This does not imply, of course, that this direction is a dead-end. The method itself is innovative but the specific algorithm we propose is clearly not optimal. We believe that with additional efforts, it can be seriously improved and may eventually produce a ready-to-use solution in various popular classes of multi-scalar models, such as the general 3HDM.
The structure of the paper is the following. In the next Section, we present our strategy and formulate the algorithm. Section 3 contains an introduction to the spectral theory of tensors, its application to the BFB problem, and describes a practical algorithm to calculate the characteristic polynomial of a symmetric tensor. In Sect. 4, we show how this method works. We first do it with two elementary examples, in which all calculations can be performed manually, and then apply the computer-algebra package to the case of a \(\mathbb {Z}_{2}\)-symmetric 2HDM, where the BFB conditions are known. We find agreement of the results, which serves as a check of the validity of our algorithm. We end with a discussion in Sect. 5 of how the algorithm can be improved in the future and draw conclusions. The Appendix contains a pedagogical introduction to polynomial rings and polynomial division with an application to the theory of resultants.
2 Algorithmic path to BFB conditions
The BFB condition (2) is formulated in terms of positive-definiteness of the real fully symmetric order-four tensor \(Q_{i j k l}\) in the entire space of real non-zero vectors \(\phi _i\), \(i = 1, \dots , n\). If the order of the tensor were not four but two, \(M_{ij}\), then its positive definiteness in the entire space of non-zero vectors \(\phi _i\) could be easily established with elementary linear algebra. One first views the tensor \(M_{i j}\) as a linear operator acting in the space \(\mathbb {R}^n\) of vectors \(\phi _i\), and asks for its eigenvalues and eigenvectors:
For a real symmetric \(M_{ij}\), there are n real eigenvalues, which can be found from the characteristic equation
The tensor \(M_{ij}\) is positive definite if and only if all \(\lambda _i\) are positive: \(\lambda _i > 0\). The calculation of the determinant is done through well-known algorithms, and it produces a polynomial for \(\lambda \) of degree n, whose coefficients are multi-linear functions of each individual entry of the matrix \(M_{ij}\).
The critical complication of recasting the BFB condition (2) into constraints on the order-four tensor \(Q_{ijkl}\) lies precisely in the fact that it has higher order. Linear algebra is of no use anymore. One needs to develop a theory that generalizes the above chain “characteristic equation \(\rightarrow \) determinant \(\rightarrow \) eigenvalues \(\rightarrow \) positivity” to the case of higher-order tensors, and to supplement the general theory with efficient algorithms.
This theory exists and is known as the spectral theory of tensors. Although the issue must have been discussed earlier, it was only in 2005 that Lim [33] and Qi [34], independently from each other, constructed fruitful generalizations of spectral theory to higher-order tensors. These and subsequent works gave a huge boost to the field, resulting in numerous applications in various branches of pure and applied mathematics, for a brief review and a pedagogical introduction, see [35] and the very recent book [36]. We will also provide an introduction in the following section. For the moment, we outline the general strategy.
There indeed exists a way—in fact, several ways—of generalizing eigenvalues and eigenvectors to tensors \(Q_{i_{1} i_{2} \ldots i_m}\) of order m (in our case, \(m=4\)). They can be written as a system not of linear but of polynomial equations of degree \(m-1\). The eigenvalues \(\lambda \) are again determined by a characteristic equation \(\mathrm {Char}(Q,\lambda ) = 0\). However it is calculated not via the determinant but via the resultant of a system of equations. The resultant is a polynomial in \(\lambda \), whose coefficients are polynomial—and not just linear—functions of the entries of the tensor Q. It is much more complicated than the determinant; in particular, its degree can be much larger than n. However, there are algorithms for calculating resultants, that can be implemented in computer-algebra codes.
Once the resultant is found, its roots give all the eigenvalues \(\lambda \). It may happen that some of these real eigenvalues may correspond to complex eigenvectors only. It just so happens that such eigenvalues can be disregarded with respect to positive definiteness. Hence, we focus only on those eigenvalues that produce real eigenvectors. The tensor Q is positive definite if and only if all of these remaining eigenvalues are positive.
From a computational point of view, the most challenging and computer-time consuming step is calculating the resultant for a given model. Just like for determinants, there exists a recursive algorithm, but for non-linear equations its complexity grows dramatically with the number of equations, variables and the degree of the polynomials. The coefficient of the characteristic polynomial may easily become so large that usual computer packages are incapable of manipulating such coefficients. Specialized algebraic–geometric packages are needed for this purpose.
Once the resultant is found in its analytic form, it can be used for any set of parameters. Checking the positivity of those of its roots that correspond to real eigenvectors can be done numerically in short time. In this way, even if the BFB conditions cannot be written in a nice closed form, they can easily be implemented in numerical scans of the parameter space.
3 Elements of the spectral theory of tensors
In this section, we introduce the basics of the spectral theory of tensors, which will be needed to describe the algorithm we implemented. The presentation is based on the theory developed by Qi [34]. A much more detailed introduction can be found in the review [35] and the book [36].
3.1 Eigenvalues and positive definiteness
Let Q be a real, fully symmetric tensor of order m over the vector space \(\mathbb {C}^n\). The elements of this vector space are denoted by \(\mathbf {x}\). Although we will eventually be interested in this tensor over the real vector space \(\mathbb {R}^n\), we need the complex space for the intermediate steps.
We call \(\lambda \in \mathbb {C}\) an eigenvalue of Q if the system of equations
has non-trivial solutions \(\mathbf {x} \in \mathbb {C}^n {\setminus } \{0\}\). These solutions \(\mathbf {x}\) are then called eigenvectors. Notice that in Eq. (11) all indices apart from \(i_{1}\) are summed over. The index \(i_{1} = 1, \dots , n\) is an open index; it labels the \(i_{1}\)-th equation. Thus, Eq. (11) represents a system of n homogeneous polynomial equations of degree \(m-1\) in n variables \(x_i\). The total number of eigenvalues, including multiplicity, is [34]
For \(m = 2\), the definition of Eq. (11) reduces to the eigensystem of square matrices, and the total number of eigenvalues is equal to n.
Even if the tensor Q is real and symmetric, its eigenvalues and eigenvectors can be complex. A real eigenvector is called an H-eigenvector; its associated eigenvalue—which is unavoidably real—is called an H-eigenvalue. Do note that in general there can be real eigenvalues which are not H-eigenvalues, i.e. eigenvalues whose corresponding eigenvectors are all complex. The existence of a real eigenvector is necessary and sufficient for the existence of an H-eigenvalue of a real, symmetric tensor. If one restricts the vector space from \(\mathbb {C}^n\) to \(\mathbb {R}^n\), then only the H-eigenvalues and H-eigenvectors survive. A key theorem that links the spectral theory of tensors with the BFB conditions is due to Qi [34]. Suppose the real symmetric tensor Q is of even order: \(m = 2 k\). Then H-eigenvalues exist, and Q is positive definite,
if and only if all of its H-eigenvalues are positive. The task of establishing the BFB conditions reduces to finding the H-eigenvalues of the tensor Q. An alternative but equivalent criterion is to check that none of the real negative eigenvalues, if any, corresponds to any real-valued eigenvector. Checking this last condition may be less time consuming since it requires eigenvector check only for a subset of all real eigenvalues found.
We remark here that, in contrast to the eigenvalues of matrices, the eigenvalues of tensors defined according to (11) are not invariant under general basis rotations. In particular, the H-eigenvalues are not invariant under generic O(n) rotations. It turns out, however, that the property of all H-eigenvalues being positive is O(n)-invariant. It is this property that makes them a useful indicator of positive-definiteness in any basis.
We note in passing that in certain problems, where the O(n) invariance of eigenvalues is crucial, one can adopt another definition of eigenvalues, which is manifestly basis-change invariant [33]. The problem of positive definiteness of the tensor Q can also be formulated in terms of positivity of these new eigenvalues. In this work, we prefer to stick to the H-eigenvalues, as their application seems to be more straightforward.
3.2 Characteristic polynomial and resultant
In order to find eigenvalues of the tensor Q, let us rewrite the system of coupled homogeneous polynomial Eq. (11) in the following form:
In that way, we simply ask for non-trivial (\(\mathbf {x} \not = 0\)) solutions to n coupled, homogeneous polynomial equations in n variables. In particular, we want to know for which values of \(\lambda \) such solutions exist. For any system of homogeneous polynomials \(f_{1}, \dots , f_n\) of n variables \(x_{1}, \dots , x_n\), there always exists a polynomial in the coefficients of \(f_{1}, \dots , f_n\), called the resultant \(\mathrm {Res}(f_{1}, \dots , f_n)\), with the following property [34]: non-zero solutions to \(f_{1} = 0, \dots , f_n = 0\) exist if and only if \(\mathrm {Res}(f_{1}, \dots , f_n) = 0\). In the case of Eq. (14), the coefficients of \(f_i\) contain \(\lambda \). The resultant \(\mathrm {Res}(f_{1}, f_{2}, \dots , f_n)\) can then be viewed as a single polynomial in \(\lambda \) whose coefficients depend on the entries of the tensor Q. It is called the characteristic polynomial \(\mathrm {Char}(Q,\lambda )\), and its roots give all the eigenvalues of the tensor Q. Just as for determinants, the value of \(\mathrm {Char}(Q,\lambda )\) at \(\lambda =0\) is equal to the product of all eigenvalues.
Resultants are much more difficult to calculate than determinants. In fact, for the fields \(\mathbb {Q}\), \(\mathbb {R}\) and \(\mathbb {C}\) the calculation is at least NP-hard [37]. Every NP-problem has an algorithm for which the execution time scales exponentially with the input. The calculation time is thus extremely sensitive to the number n of polynomials and to their respective degrees.
Multivariate resultants were first studied by Macaulay [38]. Due to him there is an algorithm that expresses the resultant as a quotient of the determinants of two matrices. The size of these two matrices grows rapidly with the number n and the polynomial degrees \(\deg (f_i)\), which renders this algorithm not very space efficient. A more economical algorithm can be found in [39], Theorem 3.4. It uses a recursive approach that we present now. Readers wishing to refresh their knowledge about the ring of polynomials and polynomial division can consult the Appendix 1.
3.3 An explicit resultant algorithm
Given homogeneous polynomials \(f_{1}, \dots , f_n \in \mathbb {C}[x_{1}, \dots , x_n]\) with degrees \(d_i := \deg (f_i)\), we define two sets of new polynomials
The polynomials \(\bar{f}_i\) are again homogeneous and of the same degrees \(d_i\) but of \(n-1\) variables. One can use \(n-1\) of them to define the smaller resultant \(\mathrm {Res}(\bar{f}_{2}, \dots , \bar{f}_n)\). If \(\text {Res}(\bar{f}_{2}, \dots , \bar{f}_n) \ne 0\), one has
Here, \(d_{1}\) is the degree of the eliminated polynomial \(f_{1}\), and the matrix \(M_{1}\) is defined by the map
with the quotient ring \(\mathbb {C}[x_{2}, \dots , x_n] / \langle F_{2}, \dots , F_n \rangle \) viewed as a complex vector space of dimension \(D = d_{2} \times \cdots \times d_n\) with elements [r] and \([F_{1}]\).
Let us explain the last statement in simple terms. It says that we need to consider the remainders r which we get after dividing all possible polynomials in \(x_{2}, \dots , x_n\) by the ring ideal constructed with the generating polynomials \(F_{2}, \dots , F_n\). These remainders form a vector space, and a basis for this vector space must be found. The basis vectors (independent remainders) can be further multiplied by the polynomial \(F_{1}\)—the one dropped in the construction of the ideal—and the results can be again reduced to the remainders and expanded in the same basis. Thus, \(F_{1}\) acts as a linear map in this space, and we describe it with the matrix \(M_{1}\), whose determinant we calculate.
In technical terms, we first build the monomial basis of this vector space by scanning through all possible monomials \(m_a = x_{2}^{e_{2}} \dots x_n^{e_n}\) of ascending total degree \(\deg (m_a) = \sum _{i=2}^n e_i = 0, 1, 2,\) etc. Then we divide all monomials by the ideal \(\langle F_{2}, \dots , F_n \rangle \), for which we first need to find the Gröbner basis \(G_i\) (see the brief introduction in Appendix 1). At the end, we obtain \(D = d_{2} \times \cdots \times d_n\) unique, non-zero, linear independent monomial remainders \(r_a\) which serve as basis vectors \([r_a]\) of the quotient ring viewed as vector space. The same division is repeated for the polynomials \(r_a \cdot F_{1}\) whose remainders \([r_a \cdot F_{1}]\) can be expanded in this basis
In this way we obtain the desired square matrix \(M_{1}\) and calculate its determinant.
One can recursively repeat the procedure \(n-1\) times to end up with the resultant of a single homogeneous polynomial \(\tilde{f}_n\) in one variable \(x_n\) of degree \(d_n\). The only possible form for this polynomial is
with some \(\alpha \in \mathbb {C}\). By definition, the resultant \(\text {Res}(\tilde{f}_n)\) is zero if and only if there are non-trivial solutions to \(\tilde{f}_n = 0\). Therefore,
Hence, after \(n - 1\) steps the calculation of the resultant terminates with a trivial relation.
If it happens that one of \(\bar{f}_i \equiv 0\), \(i = 1, \dots , n\), then we must eliminate it instead of \(\bar{f}_{1}\) and proceed further. But it may also happen that two or more among \(\bar{f}_i \equiv 0\). In this case, we get no more than \(n-2\) polynomial conditions \(\bar{f}_i=0\) on \(n-1\) variables, so that the system becomes underdetermined, and non-trivial solutions always exist, which implies that \(\text {Res}(\bar{f}_{2}, \dots , \bar{f}_n) = 0\).
In the following section and in the appendix, we give a few examples of how this algorithm works.
4 Applications
4.1 Elementary example 1
We start with the simplest possible example: the quadratic potential in two variables
The eigenvalues are defined according to
These polynomials are of degrees \(d_{1} = d_{2} = 1\). According to the algorithm, we build two other polynomial sets:
and
and calculate the resultant as
In the ring of all polynomials in \(x_{2}\), we define the ideal \(\langle F_{2}\rangle \), and need to describe the space of remainders r of polynomial division by the ideal \(\langle F_{2}\rangle \). This ideal is generated by the single polynomial, so there is no need to search for the Gröbner basis. This space is one-dimensional, \(D=d_{2}=1\), and the real unit 1 can serve as the basis vector in this space. The polynomial \(F_{1}\) can be divided by this ideal giving the following remainder r:
Thus, the matrix \(M_{1}\) is just a single number describing the linear map \([1] \rightarrow [1 \cdot F_{1}] = [r]\), giving \(M_{1} = a-\lambda - b^2/(c-\lambda )\). Finally, according to (19), the resultant \(\mathrm {Res}(\bar{f}_{2}) = c-\lambda \). Therefore, the total resultant in Eq. (25) is
which coincides with the usual determinant of the matrix \((Q - \lambda \cdot 1\!\!1)\). By setting this resultant to zero, we obtain the characteristic equation, whose roots give the eigenvalues \(\lambda \):
These roots are real and correspond to real eigenvectors, therefore they qualify as H-eigenvalues. The BFB conditions for the potential (21) are \(\lambda _{1} > 0\), \(\lambda _{2} > 0\). One can recast these conditions into \(\lambda _{1} + \lambda _{2} > 0\) and \(\lambda _{1} \lambda _{2} > 0\), which are then translated into the usual expressions \(a > 0\), \(c> 0\), and \(ac - b^2 > 0\).
4.2 Elementary example 2
The previous calculation was so simple because (1) we needed just one iteration, (2) the vector space of the remainders was one-dimensional, (3) the polynomial equations were of degree 1. Let us now consider a slightly more elaborate example:
The standard treatment of this potential resorts to the so-called copositivity criteria [14]. One defines new variables \(z_{1} = x_{1}^2\), \(z_{2}=x_{2}^2\), and rewrites the potential as a quadratic form in terms of \(z_{1}\) and \(z_{2}\). Then one asks for the positive definiteness of this quadratic form not on the entire \((z_{1},z_{2})\) real plane but only in the first quadrant, \(z_{1}, z_{2} \ge 0\). The final result is similar to the previous case with the third condition being more relaxed:
which implies that b can now be arbitrarily large provided it is positive.
Let us rederive these results via resultants. The eigenvalues are defined according to
These polynomials are of degrees \(d_{1} = d_{2} = 3\). The two auxiliary polynomial sets are
and
The ideal \(\langle F_{2} \rangle \) is again generated by a single polynomial in one variable, and we do not need to search for the Gröbner basis. The vector space of remainders of the polynomial division of all polynomials in \(x_{2}\) by this ideal is three-dimensional. The basis vectors can be chosen \(r_{1} = 1\), \(r_{2} = x_{2}\), \(r_{3} = x_{2}^2\). Higher powers of \(x_{2}\) can be divided giving remainders in this space; for example
which is equivalent to \(-b/(c-\lambda )\cdot r_{2}\). We can then calculate the action of \(F_{1}\) in this space:
The matrix \(M_{1}\) is
Knowing that \(\mathrm {Res}(\bar{f}_{2}) = c-\lambda \), we can calculate the full resultant as
Solving \(\text {Char}(Q,\lambda ) = 0\) yields six eigenvalues in accordance with Eq. (12):
all of which are always real. In order to find which of them are relevant for the BFB check, we need to find their eigenvectors. This can be done by substituting eigenvalues back into the original Eq. (31). We find that \(\lambda _{1}\) corresponds to \(\mathbf {x} \propto (1,0)\), \(\lambda _{2}\) corresponds to \(\mathbf {x} \propto (0,1)\). Thus, they qualify for H-eigenvalues and produce conditions \(a> 0\) and \(c> 0\).
For the remaining eigenvalues, the discussion requires some care. If \(b=0\), no additional eigenvalues appear; thus, we can safely consider \(b \not = 0\). In this case, the eigenvectors lie on the rays \(x_{2} = k \cdot x_{1}\) with the proportionality coefficient defined by
where the ± sign is the same as in (37). Since the square root is always larger than \(|c-a|\), we always get one positive and one negative expressions for \(k^2\). Since we are looking for the real solutions, k must be real, and we always keep only one \(k^2\), depending on the sign of b. Thus, we get the additional H-eigenvalue:
where \(\sqrt{(\cdot )}\) denotes \(\sqrt{(a-c)^2 + 4b^2}\). This additional H-eigenvalue must also be positive for the potential to satisfy BFB conditions. However, in the former case, \(b>0\), the conditions we have already established \(a >0\), \(c>0\), guarantee that this extra \(\lambda \) is positive. No extra constraint is needed in this case. In the latter case, \(b < 0\), the condition \(\lambda >0\) is a new one and it restricts the absolute value of the negative parameter b: \(|b| < \sqrt{ac}\). In this way, we recover the copositivity result (30).
We can draw several observations from this example. First, we see that the degree of the characteristic polynomial quickly grows for non-linear equations. Fortunately, we had to perform only one iteration in this example, and the degree stopped at six. In more elaborate situations, even with two iterations, the degree will grow very fast. At each iteration, the resultant is factorized into a secondary resultant and a determinant of a matrix M. However it does not imply that the final expression for the resultant could be easily factorized into these blocks. We saw that \(\det M_{1}\) was not a polynomial in \(\lambda \) on its own because it contained \(\lambda \) in the denominator. It required two extra powers of \(c-\lambda \) to become a polynomial. Therefore, for situations slightly more sophisticated than the elementary examples considered, we may easily run into higher-order polynomials in \(\lambda \) whose solutions cannot be written in closed algebraic form.
This leads us to the conclusion that one should abandon the hope to represent the BFB conditions in such elaborate situations in terms of explicit inequalities for the parameters of the potential. The final analytical form of the exact BFB conditions will be \(\mathrm {Char}(Q, \lambda ) = 0\), and one would need to resort to numerical methods to find all real solutions of the characteristic equation. Fortunately, numerically solving polynomial equations in a single variable can be done in short time even for very high-degree polynomials.
Another observation is that eigenvalues themselves do not provide the final answer; one also needs to check the corresponding eigenvectors. Whether a given real eigenvalue is an H-eigenvalue or not depends on the numerical values of the tensor entries. This is an additional complication for the fully analytic treatment of the problem but it can be resolved in reasonable time with numerical methods.
We wrap up this example by noticing that even if one considers, instead of Eq. (29), the most general quartic polynomial in two real variables, the resultant can still be found analytically with the same strategy. This case, however, has also been studied previously, [16].
4.3 Implementation
The above two elementary examples were simple enough to be done by hand. Although the calculations become much more involved in less trivial examples, the algorithm remains unchanged and can be implemented in a computer-algebra code. We did it within the Mathematica [40] and Macaulay2 [41] platforms, and our Mathematica package BFB [32] is publicly available at GitHub. In this subsection we describe its implementation and the challenges we had to tackle.
The algorithm for testing BFB conditions of a given scalar potential V includes the following steps:
-
1.
Rewrite the potential V in terms of real scalar fields \(\mathbf {x} \in \mathbb {R}^n\), extract the tensor of quartic couplings \(Q_{i j k l}\), and set up the polynomials \(f_i = Q_{i j k l} \cdot x_j x_k x_l - \lambda \cdot x_i^3\).
-
2.
Calculate \(\text {Char}(Q, \lambda ) = \text {Res}(f_{1}, f_{2}, \dots , f_n)\).
-
3.
Find all real roots \(\lambda \in \mathbb {R}\) of \(\text {Char}(Q, \lambda ) = 0\).
-
4.
Check all non-positive roots \(\lambda \le 0\) for non-trivial, real solutions \(\mathbf {x} \in \mathbb {R}^n {\setminus } \{0\}\) to the equations \(f_{1} = f_{2} = \dots = f_n = 0\).
-
5.
The potential V is bounded from below if and only if there are no such real solutions for the non-positive roots.
In step 1, it is important to make sure that all real fields \(x_i\) can span the entire real space and not just a subset of it. If this condition is not met, the algorithm may only yield sufficient but not necessary constraints on the scalar potential parameters, simply because positive definiteness of Q in the entire space may be too restrictive. It is this requirement that impedes its application in the space of gauge-invariant bilinears in multi-Higgs-doublet models.
Step 2 is the key step of the algorithm and it is more complicated. The usual computer-algebra packages such as Mathematica and Maple have implementations for the calculation of resultants for two polynomials in at most two variables. They do not have a general implementation for the calculation of multivariate resultants. One way to proceed would be to implement the resultant algorithm presented in Section 3.3 within Mathematica or Maple, relying on their support for polynomial division algorithms such as finding Gröbner bases etc. An alternative procedure is to use a more specialized computer algebra system such as Macaulay2 [41], which is designed for problems in algebraic geometry. It allows for the symbolic manipulation of polynomials and the calculation within quotient rings and ideals over the field of integers or rational numbers. The implementation of multivariate resultants is provided as a package called Resultants [42]. The currently tested version of BFB [32] uses this package.
At step 3, for analytic Higgs potential parameters, it is not clear if it is in general possible to decide whether a root is real or not. Hence, most of the time this has to be decided after numeric values have been chosen.
Similarly to the calculation of resultants, performing step 4 can be rather involved. It reduces to a proof of existence of real solutions for a given set of polynomial equations. In the univariate case this problem can be tackled by the Sturm sequence. For the more interesting multivariate case, the decision problem of real solutions has been solved by the Tarski-Seidenberg theorem [43]. The implementation of BFB uses Mathematica’s function called FindInstance to construct a real solution if possible.
In practice there are two different scenarios for which one would apply this algorithm. Firstly, to have a numerical check of boundedness for a given point in the parameters space of the Higgs potential. The Higgs potential will have numeric coefficients from the beginning.
Secondly, one would want to derive analytic constraints that can be later evaluated numerically. The algorithm in step 2 can in principle produce the characteristic polynomial \(\text {Char}(Q, \lambda )\) in analytic form for any model. However, because calculating resultants is NP-hard [37], this step can be very challenging. We saw that calculating the resultant in analytic form within the IDM, which is discussed below, easily exceeds the time scale of several weeks with the current implementation of BFB. We are confident that this implementation is not the most optimal one, and we hope that more efficient algorithms can be applied.
Next, even if the characteristic polynomial \(\text {Char}(Q, \lambda )\) is known in analytic form, its degree can easily grow far above four, which may preclude expressing its roots in an analytic way. Thus, at this stage, one would need to resort to numerical methods and explore the parameter space with numerical scans. Fortunately, numerically solving a polynomial equation in a single real variable can be done in relatively short time. We found that, for the IDM case, numerical calculations in step 3 and 4 take at most a few seconds.
4.4 Inert Doublet Model
The quartic part of the Higgs potential of the Inert Doublet Model, which makes use of two Higgs electroweak doublets \(\phi _{1}, \phi _{2} \in \mathbb {C}^2\), is given by Eq. (5). The analytic BFB conditions were first derived in [9] and are given in (6).
We treated this problem with the BFB package [32]. The scan of the parameter space was done numerically. This means that we did not attempt to derive the analytical expression of the characteristic polynomial but, for each point in the scan, the Higgs potential parameters were assigned numerical values before running the algorithm. To reduce the complexity of the problem, the \(\mathrm {SU}(2)\) symmetry of the Higgs potential was exploited. A given potential value V at a certain point \(\mathbf {x} \in \mathbb {R}^8\) of variables can be equally expressed by a different point \(\mathbf {x}' \in \mathbb {R}^8\) if the two points are connected through an \(\mathrm {SU}(2)\) transformation of the two Higgs doublets. Hence by an appropriate choice of transformation, one can make three of the eight variables vanish. This eliminates flat directions of the potential and corresponds to the calculation of constraints in unitary gauge.
In Fig. 1 we show the exclusion plots in a selection of two parameter planes; additional plots can be found in [44].
The green region is excluded by the analytic constraints of Eq. (6). The yellow region is allowed. Black dots are those points from the numerical scan which were approved by the package BFB. They perfectly agree with the analytical conditions.
It is worth mentioning that Macaulay2 [41] allows one to calculate the resultant not only over a field but also over the ring of integers. This is on average faster because the intermediate polynomial division steps require the division of the coefficients. Within the ring of integers this is effectively done by a modulo operation which is much faster than an actual division. The computation time varied from 3 to 8 h per parameter space point. Almost the whole time was spent for the calculation of the resultant. We also observed a strong dependence of the calculation time on the complexity of the input parameters: simpler coefficients such as 1/10 would result in a faster calculation than coefficients like 743/999.
According to (12), for the five-variable version of the IDM, the degree of the characteristic polynomial is equal to \(N_Q = 5 \times 3^4 = 405\). Hence the initial parameters will approximately be raised to this total power making the resulting numerator and denominator a huge number which cannot be stored in CPU registers. For instance, with \(\Lambda _{1} = \Lambda _{2} = \Lambda _{3} = 1\), \(\Lambda _{4} = 9.01587 \) and \(\Lambda _5 = - 10.2132\) the largest coefficient of the characteristic polynomial is \(\approx 3.452 \times 10^{1137}\). Thus, special libraries for integer manipulation, which emulate the CPU’s arithmetic logic unit, have to be used. The runtime of the remaining algorithm, after the calculation of the characteristic polynomial, is negligible. For the above parameter point, calculating and testing all H-eigenvalues takes no longer than 3 s.
5 Discussion
5.1 The present situation
Checking that scalar potentials are BFB is a notoriously difficult problem, which impedes efficient exploration of many models with extended scalar sectors. Analytical BFB conditions are known only in special and rather simple cases. For example, in models with three Higgs doublets the BFB conditions remain unknown beyond the few cases with large symmetry groups.
In this work we presented and developed a novel approach to establishing the BFB conditions of generic polynomial scalar potentials, which, to our best knowledge, was briefly mentioned only in [16] and was not pursued any further by the HEP community. The method relies on certain unconventional mathematical methods such as the theory of resultants and the spectral theory of tensors. In this approach, the BFB conditions are equivalent to calculating a well defined characteristic polynomial and checking that its real roots satisfy certain conditions. We described an explicit algorithm of calculating the characteristic polynomial and illustrated it with two elementary cases, where all calculations can be done manually. We also implemented the algorithm in a Mathematica package BFB [32] which is publicly available at GitHub. We validated its performance with the case of the Inert Doublet Model, for which the conditions are known analytically, and we found perfect agreement.
Unfortunately, we have not yet produced ready-to-use analytical results for other, more complicated cases, where the BFB conditions are at present unknown. This is in part due to the intrinsic complexity of the problem: it is NP-hard and the computation time grows exponentially with the input information. However, we also believe that our current implementation is not the most optimal one, and we hope that it can be dramatically improved in the future. Since the approach is novel, we call for a community effort in optimizing this approach.
5.2 Directions for future work
The algorithm presented in this work is capable of constructing BFB constraints for any Higgs potential. The bottleneck of runtime is the calculation of the characteristic polynomial. We see four possible improvements that may increase the speed drastically.
First, the current implementation of BFB [32] uses no parallelization even though there is great potential to do so. This is mainly because the whole calculation of the resultant is outsourced to the computer algebra system Macaulay2 [41]. There are two critical algorithms that may be subject to improvement: the calculation of the Gröbner bases and the calculation of the resultant. Both of them are under steady investigation of the mathematical community. For Gröbner bases, there are Faugére’s algorithms F4 [45] and F5 [46] both of which are highly parallelizable. Macaulay2 includes already four different algorithms for the calculation of the resultant. The algorithm presented in Sect. 3.3 from [39], Theorem 3.4, is one of them. Part of it is the calculation of the intermediate matrices \(M_i\). Currently, the elements are obtained in a linear way on one CPU only. However, each row can be calculated independently. For the IDM test of Sect. 4.4, \(M_{1}\) already has 81 rows, so here is a huge potential for parallelization. Also, the calculation of the basis of the quotient ring is a simple scan through low degree polynomials and can be distributed over any number of cores. Macaulay2 implements also the classic algorithm by Macaulay [38]. It is less space efficient but may be more time efficient when it comes to the calculation of resultants of polynomials with many variables. Furthermore, Macaulay2 implements a variation of these two algorithms that makes use of polynomial interpolation (see for instance [47, 48]).
Second, as one can probably already conclude, not only the possibility of parallelization may speed up the process of resultant calculations, but also the choice of the respective algorithm. There is a multitude of publications on this topic. Depending on the specific form of the input polynomials there might exist much faster algorithms than the presented one. For instance, Macaulay proposed a modified version of his algorithm that can be used if all polynomials share the same degree [49]. This is applicable to the current case of Higgs potential boundedness and should definitely be tested. It is this approach that might bypass the NP-hardness [37] of resultant calculations for Higgs potential boundedness.
Third, the scalar potentials we encounter are gauge invariant, and this implies a certain redundancy when writing them in terms of real fields. For example, for the IDM test of Sect. 4.4, we used the \(\mathrm {SU}(2)\) symmetry of the Higgs potential to reduce the number of variables from 8 to 5. It is plausible that additional symmetries of other multi-Higgs models can be exploited in a similar way. Furthermore, since the BFB check can be performed in any basis, one may take advantage of the basis-change freedom to switch to a basis that is more convenient. This may result in a further reduction of variables or parameters. Another symmetry driven approach is the usage of E-eigenvalues [35], which are, unlike H-eigenvalues, invariant under orthogonal transformations. A short discussion of the implications with respect to Higgs potential boundedness can be found in [44].
Lastly, when performing the scans of the parameter space, one can use the ring of integers instead of a field of numbers for the polynomial coefficients. As we saw with the IDM example in Sect. 4.4, this option changes the runtime. Rational numbers \(\mathbb {Q}\) might be the worst choice because they incorporate an inefficient division algorithm (finding greatest common divisors etc.) and have a bad scaling with powers (numerator and denominator can get very large). Integers \(\mathbb {Z}\) are more efficient when it comes to the used division operations (modulo operations) but still possess a bad scaling with powers. Macaulay2 only allows for these two options. The field of real numbers \(\mathbb {R}\) may be an intermediate solution that trades accuracy for runtime. The division algorithm is not as fast as for integers but calculations of powers are faster and more space efficient (floating point numbers store powers separately). Currently there exists no implementation of resultant algorithms that work with both analytic parameters and real numbers. There is a working framework called MARS [50] that can handle the calculation of the resultant numerically. It is possible to perform a scan over a bounded range of values for the eigenvalues \(\lambda \) and test for the numerical vanishing of the resultant. This is numerically unstable though, since the resultant is in general a high-degree polynomial in \(\lambda \) and accuracy will play an important role here. Nevertheless, this is a feasible approach.
The long term goal is to have an algorithm that can produce the analytic form of the characteristic polynomial for various Higgs potentials. It is true that computing this polynomial in a specific model, for example, in 3HDM, even after parallelization and optimizaiton may require much computer time. However, once the characteristic polynomial is calculated in its full analytic form, it can be published and distributed, and it can be readily used for all subsequent checks of BFB conditions in this model. Such “mining of characteristic polynomials” is definitely worthy of extra efforts.
Notes
To be precise, boundedness from below is a necessary but not sufficient condition for a minimum to exist. Consider, for example, the following function of two real variables x and y:
$$\begin{aligned} V(x,y) = (xy-1)^2 + y^4 \text { .} \end{aligned}$$It is clearly bounded from below, as both terms are strictly non-negative, but it does not possess a global minimum. As one moves along the hyperbole \(xy = 1\) to large x values, \(V\rightarrow 0\) but never reaches zero. However, we know of no multi-scalar example which makes use of this mathematical peculiarity. Therefore, in this paper, the BFB conditions will be understood as equivalent to the existence of a minimum.
References
G. Aad et al., Phys. Lett. B 716, 1 (2012). https://doi.org/10.1016/j.physletb.2012.08.020
S. Chatrchyan et al., Phys. Lett. B 716, 30 (2012). https://doi.org/10.1016/j.physletb.2012.08.021
I.P. Ivanov, Prog. Part. Nucl. Phys. 95, 160 (2017). https://doi.org/10.1016/j.ppnp.2017.03.001
M. Maniatis, A. von Manteuffel, O. Nachtmann, F. Nagel, Eur. Phys. J. C 48, 805 (2006). https://doi.org/10.1140/epjc/s10052-006-0016-6
P. Basler, M. Krause, M. Muhlleitner, J. Wittbrodt, A. Wlotzka, JHEP 02, 121 (2017). https://doi.org/10.1007/JHEP02(2017)121
P. Basler, M. Mühlleitner, J. Wittbrodt, JHEP 03, 061 (2018). https://doi.org/10.1007/JHEP03(2018)061
P. Basler, M. Muhlleitner, BSMPT—Beyond the Standard Model Phase Transitions—A Tool for the Electroweak Phase Transition in Extended Higgs Sectors (2018)
L. Chataignier, T. Prokopec, M.G. Schmidt, B. Swiezewska, JHEP 03, 014 (2018). https://doi.org/10.1007/JHEP03(2018)014
N.G. Deshpande, E. Ma, Phys. Rev. D 18, 2574 (1978). https://doi.org/10.1103/PhysRevD.18.2574
F. Nagel, New Aspects of Gauge–Boson Couplings and the Higgs Sector. Ph.D. Thesis, Heidelberg U (2004). http://www.ub.uni-heidelberg.de/archiv/4803. Accessed 23 May 2018
I.P. Ivanov, Phys. Lett. B 632, 360 (2006). https://doi.org/10.1016/j.physletb.2005.10.015
C.C. Nishi, Phys. Rev. D 74, 036003 (2006). https://doi.org/10.1103/PhysRevD.76.119901, https://doi.org/10.1103/PhysRevD.74.036003 [Erratum: Phys. Rev. D76,119901(2007)]
I.P. Ivanov, Phys. Rev. D 75, 035001 (2007). https://doi.org/10.1103/PhysRevD.76.039902, https://doi.org/10.1103/PhysRevD.75.035001 [Erratum: Phys. Rev. D76,039902(2007)]
K. Kannike, Eur. Phys. J. C 72, 2093 (2012). https://doi.org/10.1140/epjc/s10052-012-2093-z
J. Chakrabortty, P. Konar, T. Mondal, Phys. Rev. D 89(9), 095008 (2014). https://doi.org/10.1103/PhysRevD.89.095008
K. Kannike, Eur. Phys. J. C 76(6), 324 (2016). https://doi.org/10.1140/epjc/s10052-016-4160-3
C.C. Nishi, Phys. Rev. D 76, 055013 (2007). https://doi.org/10.1103/PhysRevD.76.055013
I.P. Ivanov, JHEP 07, 020 (2010). https://doi.org/10.1007/JHEP07(2010)020
M. Maniatis, O. Nachtmann, JHEP 02, 058 (2015). https://doi.org/10.1007/JHEP10(2015)149, https://doi.org/10.1007/JHEP02(2015)058 [Erratum: JHEP10,149(2015)]
M. Maniatis, O. Nachtmann, Phys. Rev. D 92(7), 075017 (2015). https://doi.org/10.1103/PhysRevD.92.075017
J. Kim, Nucl. Phys. B 196, 285 (1982). https://doi.org/10.1016/0550-3213(82)90040-2
J. Kim, Nucl. Phys. B 197, 174 (1982). https://doi.org/10.1016/0550-3213(82)90160-2
M. Abud, G. Sartori, Phys. Lett. 104B, 147 (1981). https://doi.org/10.1016/0370-2693(81)90578-5
M. Abud, G. Sartori, Ann. Phys. 150, 307 (1983). https://doi.org/10.1016/0003-4916(83)90017-9
G.C. Branco, J.M. Gerard, W. Grimus, Phys. Lett. 136B, 383 (1984). https://doi.org/10.1016/0370-2693(84)92024-0
K.G. Klimenko, Theor. Math. Phys. 62, 58 (1985). https://doi.org/10.1007/BF01034825 [Teor. Mat. Fiz. 62,87(1985)]
R. de Adelhart Toorop, F. Bazzocchi, L. Merlo, A. Paris, JHEP 03, 035 (2011). https://doi.org/10.1007/JHEP03(2011)035, https://doi.org/10.1007/JHEP01(2013)098. [Erratum: JHEP01,098(2013)]
A. Degee, I.P. Ivanov, V. Keus, JHEP 02, 125 (2013). https://doi.org/10.1007/JHEP02(2013)125
M. Maniatis, D. Mehta, C.M. Reyes, Phys. Rev. D 92(3), 035017 (2015). https://doi.org/10.1103/PhysRevD.92.035017
M. Heikinheimo, K. Kannike, F. Lyonnet, M. Raidal, K. Tuominen, H. Veermäe, JHEP 10, 014 (2017). https://doi.org/10.1007/JHEP10(2017)014
P.M. Ferreira, I.P. Ivanov, E. Jiménez, R. Pasechnik, H. Serôdio, JHEP 01, 065 (2018). https://doi.org/10.1007/JHEP01(2018)065
M. Köpke. BFB, A Mathematica Package to Check Boundedness of General Higgs Potentials. https://git.io/vFQvi (2017)
L.H. Lim, Singular Values and Eigenvalues of Tensors: A Variational Approach (2006)
L. Qi, J. Symb. Comput. 40(6), 1302 (2005). https://doi.org/10.1016/j.jsc.2005.05.007
L. Qi, The Spectral Theory of Tensors (Rough Version) (2012)
L. Qi, Z. Luo, Tensor Analysis: Spectral Theory and Special Tensors. Other Titles in Applied Mathematics (Society for Industrial and Applied Mathematics, 2017)
B. Grenet, P. Koiran, N. Portier, Lect. Notes Comput. Sci. 6281, 477 (2010). https://doi.org/10.1007/978-3-642-15155-2-42
F.S. Macaulay, Proc. Lond. Math. Soc. s1—-35(1), 3 (1902). https://doi.org/10.1112/plms/s1-35.1.3
D.A. Cox, J.B. Little, D. O’Shea, Using Algebraic Geometry. Graduate Texts in Mathematics, vol. 185 (Springer, New York, 1998)
Wolfram Research, Inc. Mathematica, Version 11.2. Champaign, IL (2017)
D.R. Grayson, M.E. Stillman. Macaulay2, a software system for research in algebraic geometry. http://www.math.uiuc.edu/Macaulay2/. Accessed 23 May 2018
G. Staglianò. Macaulay2 Package Resultants: Resultants, Discriminants, and Chow Forms. Version 1.0. http://www2.macaulay2.com/Macaulay2/doc/Macaulay2-1.10/share/doc/Macaulay2/Resultants/html/ (2017)
T. Zell, Mathoverflow—existence of a real-valued solution to system of multivariate polynomial equations. https://mathoverflow.net/a/66870/69288 (2017)
M. Köpke. Investigation of the GCP Structure of Three-Higgs–Doublet Models and a General Method to Derive Boundedness Constraints for Multi-Higgs Potentials. Master’s Thesis, Karlsruhe Institute of Technology. http://www.teco.edu/~koepke/mastersthesis.pdf (2018)
J.C. Faugére, J. Pure Appl. Algebra 139(1), 61 (1999). https://doi.org/10.1016/S0022-4049(99)00005-5
J.C. Faugére, in Proceedings of the 2002 International Symposium on Symbolic and Algebraic Computation (ACM, New York, NY, USA, 2002), ISSAC ’02, pp. 75–83. https://doi.org/10.1145/780506.780516
D. Manocha, Comput. J. 36(5), 485 (1993). https://doi.org/10.1093/comjnl/36.5.485
D. Manocha, J.F. Canny, J. Symb. Comput. 15(2), 99 (1993). https://doi.org/10.1006/jsco.1993.1009
F.S. Macaulay, Proc. Lond. Math. Soc. s2–21(1), 14 (1923). https://doi.org/10.1112/plms/s2-21.1.14
A. Wallack, D. Manocha, I. Emiris. Mars, maple/matlab/c resultant-based solver. http://gamma.cs.unc.edu/MARS/ (2017)
Acknowledgements
We are grateful to Kristjan Kannike for his valuable comments. We also want to thank Sven Caspart for the insightful discussions about algebraic geometry. I.P.I. was supported by the Portuguese Fundação para a Ciência e a Tecnologia (FCT) through the Investigator contract IF/00989/2014/CP1214/CT0004 under the IF2014 Program and in part by contracts UID/FIS/00777/2013 and CERN/FIS-NUC/0010/2015, which are partially funded through POCTI, COMPETE, QREN, and the European Union. M.K. and M.M. acknowledge financial support from the DFG project “Precision Calculations in the Higgs Sector—Paving the Way to the New Physics Landscape” (ID: MU 3138/1-1).
Author information
Authors and Affiliations
Corresponding author
Appendix A: Polynomial rings, division and resultants
Appendix A: Polynomial rings, division and resultants
In this section, we remind the reader of basic notions on the ring of polynomials, their division, and the theory of resultants, the objects which indicate when a system of polynomial equations has non-trivial solutions. To keep the material easily readable, we will expose it in plain language and reduce the formal notation to the bare minimum.
1.1 Appendix A.1: Polynomial rings
A polynomial in one variable x is an expression of the form
with the coefficients \(a_i\) belonging to some field \(\mathbb {F}\), e.g. \(\mathbb {Q}\), \(\mathbb {R}\), or \(\mathbb {C}\). The non-negative integer m is called the degree of the polynomial. The collection of all polynomials of all degrees forms an algebraic structure called the polynomial ring \(\mathbb {F}[x]\). A ring has richer structure than a vector space, because its elements, in addition to summation and multiplication by another number from the field \(\mathbb {F}\), can also be multiplied among themselves, with the result staying inside the ring. However, unlike fields, polynomial rings may lack multiplicative inverses.
Similarly, one can define a polynomial ring in several variables \(x_{1}, \dots , x_n\) over the same field; it is denoted as \(\mathbb {F}[x_{1}, \dots , x_n]\). The polynomial \(f \in \mathbb {F}[x_{1}, \dots , x_n]\) is called multivariate, while \(g \in \mathbb {F}[x]\) is univariate. A monomial in \(\mathbb {F}[x_{1}, \dots , x_n]\) is a product of powers of the form \(x_{1}^{d_{1}} x_{2}^{d_{2}} \dots x_n^{d_n}\). Its (total) degree is the sum of all individual powers: \(d = \sum _i d_i\). Clearly, any polynomial is a linear combination of monomials. The degree of a multivariate polynomial is the highest degree among all of its monomials.
1.2 Appendix A.2: Ideals and polynomial division
Just as for groups or vector spaces, rings can have subrings, which are subsets closed under all of its operations. However, rings also contain another important substructures called ideals. A polynomial ideal is a subring \(I \subseteq \mathbb {F}[x]\) such that I is closed under multiplication by the whole ring: if \(i \in I\) and \(r \in \mathbb {F}[x]\), then \(r \cdot i \in I\).
Ideals are closely related to polynomial division. Consider first univariate polynomials. The following theorem holds: for all \(f, g \in \mathbb {F}[x]\), there are unique \(q, r \in \mathbb {F}[x]\) such that
with either \(r = 0\) or \(\deg (r) < \deg (g)\). One calls q the quotient and r the remainder of the (Euclidean) polynomial division of f by g. If \(r=0\), then f is divisible by g. The set of all f that are divisible by g forms an ideal I, which is denoted as \(I := \langle g \rangle \). By Hilbert’s basis theorem, every univariate ideal has this form, see, for example, [39], p. 4.
For a multivariate polynomial ring \(R := \mathbb {F}[x_{1}, \dots , x_n]\), every ideal is also of the form
for some \(m \in \mathbb {N}\). The polynomials \(g_i\) are called generators of I. Polynomial ideals are always finitely generated. However, the relation of ideals to polynomial division, that is, representing f as
with some remainder r becomes more subtle.
Consider, for example, \(\mathbb {F}[x_{1}, x_{2}]\) and try to divide \(f = x_{1}^2 x_{2} + x_{1} x_{2}^2 + x_{2}^2\) by \(g_{1} = x_{1} x_{2} - 1\) and \(g_{2} = x_{2}^2 - 1\). Then, the decomposition (A.4) is not unique:
The uniqueness of the \(q_i\) and r is lost compared to the univariate case. To understand why this happened one has to look at the explicit algorithm used to derive these results. In the first example of Eq. (A.5), the term \((x _{1}+ x_{2}) \cdot g_{1}\) cancels both of the terms of highest degree in f, while in the second example \(x _{1}\cdot g_{1}\) only cancels the first one. For univariate polynomials there is no ambiguity in deciding what the leading order term is.
To restore the uniqueness for the multivariate case, one first has to introduce a monomial ordering, which would uniquely identify a leading term for every polynomial. Several options are possible. For example, the lexicographic order starts by ordering the variables themselves, \(x_{1}> x_{2}> \dots > x_n\), and then demands that
if in the difference \((d_{1},\dots ,d_n) - (e_{1},\dots ,e_n)\) the left-most non-zero entry is positive. This is analogous to the ordering of words in dictionaries.
Still, choosing the lexicographic ordering does not completely restore the uniqueness in the above example. However, instead of thinking of Eq. (A.4) as a division of f by a given set of polynomials \(g_i\), one can think of it as a division by the ideal \(\langle g_{1}, \dots , g_m\rangle \). In this way, one can define an equivalent set of polynomials, which span the same ideal and for which uniqueness is restored. That is, for every set of \(g_i\), there exists a set of \(\tilde{g}_i\) called a Gröbner basis such that
and it holds that for any \(f \in I\) there exists a \(\tilde{g}_i\) such that the leading term of f is divisible by the leading term of \(\tilde{g}_i\). In a sense, a Gröbner basis is the smallest generating set for I; it is a convenient choice to make division unique. There exists an algorithm due to Buchberger (see for example [39], p. 15), which allows one to find a Gröbner basis algorithmically. Most computer algebra systems like Mathematica (function call: GroebnerBasis) and Maple (package: Groebner, function call: Basis) implement this or similar algorithms.
In the above example (A.5) with the lexicographic ordering, the Gröbner basis for \(\langle g_{1}, g_{2}\rangle \) is given by \(\tilde{g}_{1} = x_{1} - x_{2}\), \(\tilde{g}_{2} = g_{2} = x_{2}^2 -1\). The ideals spanned by both pairs are equal because \(g_{1} = x_{2} \tilde{g}_{1} + \tilde{g}_{2}\) and, conversely, \(\tilde{g}_{1} = x_{2} g_{1} - x_{1} g_{2}\). The division of f by \(\langle g_{1}, g_{2}\rangle \) can be performed as a division by polynomials \(\tilde{g}_i\) using the lexicographic ordering, resulting in
A different ordering may lead to a different Gröbner basis and a different remainder.
1.3 Appendix A.3: Quotient ring as a vector space
Consider a polynomial ring \(R = \mathbb {F}[x_{1}, x_{2}, \dots , x_n]\) and an ideal \(I = \langle g_{1}, g_{2}, \dots , g_m \rangle \), where \(g_i\) already indicate its Gröbner basis. Every polynomial \(f \in R\) can be uniquely divided by the ideal I producing a remainder r, see Eq. (A.4). Different polynomials \(f_{1}\) and \(f_{2}\) can produce the same remainder r, if \(f_{1} - f_{2} \in I\). Therefore, one can consider the remainder r as the smallest representative of an equivalence class of remainders, denoted by [r], which represents all polynomials \(f \in R\) such that their division by the ideal I gives r.
The collection of all such equivalence classes also forms a ring called the quotient ring
Calculations in it are the same as the ones in \(\mathbb {F}[x_{1}, \dots , x_n]\) modulo the ideal I. Hence, \([r_{1}] + [r_{2}] = [r_{1} + r_{2}]\) and \([r_{1}] \cdot [r_{2}] = [r_{1} \cdot r_{2}]\). Practically, one takes two representatives \(f_{1} \in [r_{1}]\) and \(f_{2} \in [r_{2}]\), performs the calculations, and divides the outcome by I.
In certain situations, one can also view the quotient ring Q as a \(\mathbb {C}\) vector space spanned by a finite monomial basis. This is the case for the quotient ring used in (17), i.e. for calculating resultants. Let \(D := \dim Q\) and consider now any polynomial \(f \in \mathbb {F}[x_{1}, x_{2}, \dots , x_n]\), which is a representative element of the equivalence class [f]. By multiplying elements \([r] \in Q\) by [f], one obtains other elements \([r \cdot f]\), which also belong to the same vector space Q. Thus, f induces a linear map \(M_f\) in the vector space Q. If a basis \(\{[r_a]\}\) is chosen in the vector space Q, one can describe this map with a matrix M acting on the basis vectors according to
1.4 Appendix A.4: Working with resultants
To get some practice with quotient space calculations, which are needed for the algorithm presented in Sect. 3.3, let us consider the following system of three homogeneous polynomials
We construct two other sets of polynomials:
We then build the ideal \(\langle F_{2}, F_{3} \rangle \) and, adopting the lexicographic ordering, find its Gröbner basis:
The quotient ring and vector space \(\mathbb {C}[y,z] / \langle F_{2}, F_{3} \rangle \) has dimension \(D = {\deg (f_{2}) \cdot \deg (f_{3})} = 4\). Table 1 shows a list of remainders r for a division of low degree monomials.
It allows us to select the four basis vectors \([r_{1}] = [1]\), \([r_{2}] = [z]\), \([r_{3}] = [z^2]\) and \([r_{4}] = [z^3]\). Now consider the products \(r_a \cdot F_{1}\) and perform their polynomial division by \(G_{2}, G_{3}\):
and expand the remainders in the same basis vectors \([r_a]\):
Thus, \(F_{1}\) acts in this vector space with the matrix \(M_{1}\) given by
with determinant \(\det (M_{1}) = 17\). Therefore, the first step of the algorithm of Sect. 3.3 gives
The second step starts with \(\bar{f}_{2}, \bar{f}_{3}\) in Eq. (A.12), from which we construct
The quotient ring \(\mathbb {C}[z] / \langle \tilde{F}_{3} \rangle \) viewed as a vector space has dimension \(\deg (\bar{f}_{3}) = 2\), and its basis vectors are [1] and [z]. In this space, \(\tilde{F}_{2}\) acts as a linear map with the following matrix:
We have \(\det (M_{2}) = 1\). Finally, using that \(\mathrm {Res}(\tilde{f}_{3}) = 1\), we obtain \(\mathrm {Res}(\bar{f}_{2},\bar{f_{3}}) = 1^2 \cdot 1 = 1\), so that the overall resultant of the system (A.11) is equal to \(\mathrm {Res}(f_{1},f_{2},f_{3}) = 17\). Since it is non-zero, the system of equations \(f_i = 0\) does not possess non-trivial solutions.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Funded by SCOAP3
About this article
Cite this article
Ivanov, I.P., Köpke, M. & Mühlleitner, M. Algorithmic boundedness-from-below conditions for generic scalar potentials. Eur. Phys. J. C 78, 413 (2018). https://doi.org/10.1140/epjc/s10052-018-5893-y
Received:
Accepted:
Published:
DOI: https://doi.org/10.1140/epjc/s10052-018-5893-y