1 Introduction

Quantization of gravity is one of the major fundamental problems in theoretical physics. The quantization of general relativity by the standard perturbative methods of quantum field theory fails due to non-renormalizable divergences. Various approaches have been proposed and are being studied to solve this fundamental problem, depending on the views of the authors. In one approach, general relativity (with higher derivative terms) is directly quantized as a quantum field theory with the modern technique of the functional renormalization group [1]. In other approaches, fundamental discreteness is introduced to represent spacetimes, which include (causal) dynamical triangulations [2], loop quantum gravity [3], causal sets [4], quantum graphity [5], matrix models [6,7,8,9,10], tensor models [11,12,13,14], and so on. In these discretized approaches, an important criterion for success is whether macroscopic spacetimes are generated, or in other words, whether there exist appropriate continuum limits that recover the usual continuum picture of spacetime with dynamics described by general relativity as a low-energy effective theory.

The criterion above can in principle be checked by studying the properties of a wave function of each theory. If the wave function has a peak at a configuration that can well be described by a macroscopic spacetime picture, then the model can be considered to be potentially successful. An indirect motivation for the present paper is to understand the properties of the wave function [15] that is an exact solution to a tensor model in the Hamilton formalism [16, 17] (see “Appendix A” for the tensor model). It has been argued and shown for some simple cases that the wave function has peaks at the tensor values that are invariant under Lie groups [18]. By using the correspondence developed in Ref. [19] between tensor values and spaces with geometries, this would imply that the spacetimes symmetric under Lie groups are favored quantum mechanically. However, the main difficulty in arguing this is that only a small part of the peak structure (often called landscape in such contexts) of the wave function is known, which is not enough to discuss “probabilities of spacetimes”.

To simplify the problem while keeping the main structure of the tensor model as much as possible, one of the authors of the present paper and his collaborators considered the following two simplifications in earlier papers. One is that they considered a toy wave function rather than the actual wave function of the tensor model [20]. The actual wave function is expressed by a certain power, say R-th, of a function expressed by an integration over \(N+1\) variables, but, in the toy wave function, the function is simplified to the one expressed by an integration over N variables by fixing a certain integration variable to a constant. While this substantially simplifies the analysis, the toy wave function keeps the most crucial property mentioned above: peaks appear at the tensor values that are symmetric under Lie groups, as they do for the actual wave function of the tensor model [20].

Though this toy wave function is simpler than the actual one, it is still difficult to perform thorough analyses, because the dimension of the argument (a symmetric tensor with three indices) of the wave function is very large, of the order of \(\sim N^3/6\). Therefore, as an additional simplification, the authors in Ref. [21] considered a model that can be obtained by integrating over the argument of the toy wave function. This gives a dynamical system of a matrix, say \(\phi _a^i\ (a=1,2,\ldots ,N,\ i=1,2,\ldots ,R)\), rectangular in general, where the lower indices are pairwise contracted, but the upper ones are not always so. While this model does not fall into the known solvable models such as the rectangular matrix models [22,23,24] or the vector models [25, 26], it has the same form as what appears when the replica trick is applied to the spherical p-spin model for spin glasses [27, 28], where R plays the role of the replica number. Although the form is exactly the same, the relevant ranges of the dynamical variables and the parameters differ between our model and the spin glass model, and it seems reasonable to reanalyze the matrix model with fresh eyes: (1) while the replica number R is taken to the vanishing limit in the replica trick, it takes a finite value \(R\sim N^2/2\) in the correspondence to the tensor model, and should rather be taken to infinity in the thermodynamic limit \(N\rightarrow \infty \); (2) a coupling constant (Footnote 1) takes the opposite sign in our model compared to the spin glass model; (3) there is a spherical constraint on the dynamical variable in the spherical p-spin model, but there is none in our model.

In this paper, we numerically study the matrix model by the Monte Carlo simulation with the standard Metropolis update method. This contrasts with the perturbative analytical computations performed in the previous paper [21]. We also perform some additional analytical computations to compare with the numerical results. We have obtained the following main results:

  • The expectation values of some observables are computed by the numerical simulations, and it is observed that there exists a transition region around \(R\sim N^2/2\). Intriguingly, the location is in good agreement with \(R=(N+2)(N+3)/2\), which is required by the consistency of the tensor model (namely, the hermiticity of the Hamiltonian constraints; see “Appendix A” for more details) [15, 18, 29]. Presently, this coincidence is mysterious, since there are no apparent connections between the transition and the consistency. The observables seem to change their behavior continuously but substantially in the transition region, but it has not been determined whether this transition is a phase transition or a crossover in the thermodynamic limit \(N\rightarrow \infty \). The method for the Monte Carlo simulations performed in this paper is not powerful enough for the determination because of an issue explained below.

  • The expectation values of some observables are analytically computed in the leading order, mostly based on the treatment in the previous paper [21], and are compared with the numerical results. Good agreement between them is obtained outside the vicinity of the transition region, while there exist deviations in the transition region. The deviations are such that they soften the transition to make it look more like a crossover. A next-leading order computation has also been performed, but this does not correct the deviations well.

  • The tensor model suggests the presence of topological characteristics for the dominant configurations of the matrix \(\phi _{a}^i\) (see Sect. 7 and “Appendix A”). Therefore, we have studied topological characteristics of the configurations that are generated in the simulations by using the modern technique called persistent homology [30] in topological data analysis. This technique extracts homology groups possessed by a data set, which is a value of the matrix \(\phi _{a}^i\) in our case. The dominant topology gradually changes from \(S^1\) to higher-dimensional cycles with the increase of R in the vicinity of the transition region.

  • The Monte Carlo simulation becomes substantially difficult in the region with \(R\gtrsim N^2/2\) and large \(\lambda /k^3\), where \(\lambda \) and k are the parameters of our model (1). In this region, the step sizes of the Metropolis updates chosen for reasonable acceptance rates become too small to reach thermal equilibrium in \(\sim 10^{10}\) Metropolis updates.

  • A characterization of the transition can be done by the sizes of the matrix elements, which take relatively large values in the region with \(R\gtrsim N^2/2\) and large \(\lambda /k^3\), but otherwise fluctuate around small values. In the former case, our model may behave like the spherical p-spin model, since the matrix elements are effectively constrained to non-zero sizes, which would approximately realize the spherical constraint in the spherical p-spin model. This may partly explain the bad performance of the Monte Carlo simulation in the region, seemingly reflecting the highly viscous nature of glassy fluids.

This paper is organized as follows. In Sect. 2, we define the model and summarize the previous results [21] that are relevant to the present paper. In Sect. 3, some observables are introduced and the analytical expressions of their expectation values are obtained in a leading order. In Sect. 4, the details of the computation in the leading order are given. The result of the next-leading order is also presented, while the details of the computation are given in “Appendix D”. In Sect. 5, we perform a saddle point analysis of the expectation values of the observables in the leading order. This describes the transition as a continuous phase transition in the large N limit, where the first derivatives of the expectation values of the observables with respect to R are discontinuous. In Sect. 6, we compare the Monte Carlo and the analytical results. They agree well outside the transition region. In the transition region, however, there exist deviations, which smoothen the transition to make it look more like a crossover. In Sect. 7, we analyze the homology structure of the configurations generated by the simulations. The preference changes from \(S^1\) to higher-dimensional cycles with the increase of R in the vicinity of the transition region. The last section is devoted to a summary and future prospects. In “Appendix A”, the motivation for the matrix model is explained from the viewpoint of the tensor model. In “Appendix B”, an instructive computation of the partition function for \(R=2\) is given. In “Appendix C”, “Appendix D”, and “Appendix E”, some equations used in the main text are explicitly derived. In “Appendix F”, a brief introduction to persistent homology is given.

2 The model

The partition function of our matrix model is given by

$$\begin{aligned} Z_{N,R}(\lambda ,k):= \int _{{\mathbb R}^{NR}} d\phi \ \exp \left( -\lambda U(\phi )- k \hbox {Tr}(\phi ^t \phi )\right) , \end{aligned}$$
(1)

where \(\lambda \) and k are the coupling constants assumed to be positive, \(\phi \) is a (generally rectangular) real matrix, \(\phi _a^i\ (a=1,2,\ldots ,N,\ i=1,2,\ldots ,R\)), and \(d\phi :=\prod _{a,i=1}^{N,R}d\phi _a^i\). The integration is over the NR-dimensional real space denoted by \({\mathbb R}^{NR}\). The coupling terms are defined by

$$\begin{aligned} \begin{aligned}&U(\phi ):=\sum _{i,j=1}^R \left( \phi _a^i \phi _a^j \right) ^3,\\&\mathrm{Tr}(\phi ^t \phi ):=\sum _{i=1}^R \phi _a^i \phi _a^i, \end{aligned} \end{aligned}$$
(2)

where the repeated lower indices are assumed to be summed over. We use this standard convention for the lower indices throughout this paper, unless otherwise stated. On the other hand, we do not use this convention for the upper indices: A sum over them must always be written explicitly (Footnote 2). The background motivation for the matrix model is explained in “Appendix A” from the viewpoint of the tensor model.

In (2), the lower indices are contracted pairwise, while the upper indices are not necessarily so. Therefore the model has the symmetry of the O(N) transformation on the lower indices, but only the permutation symmetry \(S_R\) of relabeling \(\{1,2,\ldots ,R\}\) on the upper indices. These symmetries are not enough to diagonalize \(\phi _a^i\) in general, and therefore this model cannot be solved in a similar way as the usual matrix model.
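As a minimal numerical illustration of (2) and of the symmetry statement above, the following sketch evaluates \(U(\phi )\) and \(\hbox {Tr}(\phi ^t \phi )\) through the \(R\times R\) Gram matrix \(\phi _a^i\phi _a^j\). The array layout `phi[a, i]`, the function names, and the test sizes are assumptions made only for this illustration. The sketch checks that \(U\) is unchanged by an O(N) rotation of the lower index and by a permutation of the upper index, while a generic orthogonal mixing of the upper index changes it.

```python
import numpy as np

def tr_phi(phi):
    """Tr(phi^t phi): sum of squares of all matrix elements, phi[a, i]."""
    return float(np.sum(phi**2))

def u_phi(phi):
    """U(phi) = sum_{i,j} (phi_a^i phi_a^j)^3 via the R x R Gram matrix."""
    gram = phi.T @ phi          # gram[i, j] = sum_a phi[a, i] * phi[a, j]
    return float(np.sum(gram**3))

rng = np.random.default_rng(0)
N, R = 3, 5                      # illustrative sizes (assumption)
phi = rng.normal(size=(N, R))

# Random O(N) matrix acting on the lower index.
q, _ = np.linalg.qr(rng.normal(size=(N, N)))
phi_rot = q @ phi

# Relabeling of the upper index (an element of S_R).
phi_perm = phi[:, rng.permutation(R)]

# Generic O(R) mixing of the upper index: not a symmetry of U.
w, _ = np.linalg.qr(rng.normal(size=(R, R)))
phi_mix = phi @ w

print(u_phi(phi), u_phi(phi_rot), u_phi(phi_perm), u_phi(phi_mix))
```

The quadratic term alone would be invariant under the full O(N) and O(R) actions; it is the cubic power of the Gram matrix entries in \(U\) that reduces the upper-index symmetry to \(S_R\).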

In the previous paper [21], we considered an expression that is obtained simply by separating the radial and angular parts of the integration in (1): By the change of variable, \(\phi _a^i=r \tilde{\phi }_a^i\), with \(r^2=\sum _{i=1}^R \phi _a^i \phi _a^i\) and \(\tilde{\phi }\) representing the angular coordinates, we obtain

$$\begin{aligned} Z_{N,R}(\lambda ,k)=V_{NR-1} \int _0^\infty dr\, r^{NR-1} f_{N,R}(\lambda \, r^6) \, e^{-k \, r^2}, \end{aligned}$$
(3)

where

$$\begin{aligned} f_{N,R}(t):=\frac{1}{V_{NR-1}}\int _{S^{NR-1}} d \tilde{\phi } \, e^{-t\, U(\tilde{\phi })} \end{aligned}$$
(4)

with \(V_{NR-1}=\int _{S^{NR-1}} d\tilde{\phi }\), the volume of the \((NR-1)\)-dimensional unit sphere.

This rather trivial change of expression is actually very useful, because \(f_{N,R}(t)\) can be shown to be an entire function of t and therefore has a Taylor expansion in t with an infinite radius of convergence around \(t=0\) (actually around arbitrary \(t\ne \infty \)). Therefore, in principle, the dynamics can be solved by obtaining the entire perturbative series of \(f_{N,R}(t)\). Note that the corresponding perturbative expansion of \(Z_{N,R}(\lambda ,k)\) in \(\lambda \) around \(\lambda =0\), often obtained by perturbative methods, is merely an asymptotic series, because \(Z_{N,R}(\lambda ,k)\) has an essential singularity at \(\lambda =0\). The function \(f_{N,R}(t)\) also has the property that it is a decreasing positive function of t with \(f_{N,R}(0)=1\) for real t. This property provides a good criterion for assessing the validity of an approximation of \(f_{N,R}(t)\). In the previous paper [21], \(f_{N,R}(t)\) in the leading order of 1/R has been determined by a Feynman diagrammatic method with the result,

$$\begin{aligned} f^{1/R,\mathrm{leading}}_{N,R}(t)&=\left( 1+\frac{12 t }{N^3 R^2}\right) ^{-\frac{ N(N-1)(N+4)}{12}}\nonumber \\&\quad \times \left( 1+\frac{6(N+4) t }{N^3 R^2}\right) ^{-\frac{N}{2}}. \end{aligned}$$
(5)

In particular, (5) indeed satisfies the properties above: It is a decreasing function for real t with \(f^{1/R,\mathrm{leading}}_{N,R}(0)=1\), and is almost an entire function, since the locations of the singularities are far away from the relevant region \(t\ge 0\) for large NR.
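The properties quoted above can be illustrated with a short sketch. It evaluates the leading-order expression (5) and, independently, a crude Monte Carlo estimate of the sphere average (4) obtained by normalizing Gaussian vectors. The function names, the sample size, and the parameter values are assumptions made for this illustration only, and no claim is made here about how closely the two estimates agree for given N and R.

```python
import numpy as np

def f_leading(N, R, t):
    """Leading-order expression (5) for f_{N,R}(t)."""
    a = N * (N - 1) * (N + 4) / 12.0
    c = t / (N**3 * R**2)
    return (1.0 + 12.0 * c) ** (-a) * (1.0 + 6.0 * (N + 4) * c) ** (-0.5 * N)

def f_monte_carlo(N, R, t, n_samples=20000, seed=0):
    """Crude estimate of (4): average of exp(-t U) over uniform points on S^{NR-1}."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n_samples, N, R))
    # Normalizing Gaussian vectors gives the uniform distribution on the sphere.
    x /= np.linalg.norm(x.reshape(n_samples, -1), axis=1)[:, None, None]
    gram = np.einsum('sai,saj->sij', x, x)      # Gram matrix per sample
    u = np.sum(gram**3, axis=(1, 2))            # U(phi~) per sample
    return float(np.mean(np.exp(-t * u)))

N, R = 3, 4                                     # illustrative sizes (assumption)
for t in (0.0, 10.0, 100.0):
    print(t, f_leading(N, R, t), f_monte_carlo(N, R, t))
```

Both functions return 1 at \(t=0\) and decrease for \(t>0\), which is the criterion stated above for a sensible approximation.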

Since there are two parameters, N and R, which can be taken large, the range of validity of (5), which was derived in the leading order of 1/R, is not obvious. However, in later sections, we will find that (5) (Footnote 3) gives results which agree well (Footnote 4) with those of the numerical simulations except in the transition region around \(R\sim N^2/2\).

3 Observables

The purpose of this section is to introduce some observables, say \(\mathcal{O}(\phi )\), and discuss the expectation values defined by

$$\begin{aligned} \langle {\mathcal{O}(\phi )}\rangle := \frac{1}{Z_{N,R}(\lambda ,k)} \int _{{\mathbb R}^{NR}} d\phi \ {\mathcal{O}}(\phi )\ e^{-\lambda U(\phi )- k \mathrm{Tr}(\phi ^t \phi )}. \end{aligned}$$
(6)

The observables must respect the symmetry \(O(N)\times S_R\) mentioned in Sect. 2. Among various possibilities, we consider \(\hbox {Tr}( \phi ^t \phi )\) and \(U(\phi )\) in (2), and also

$$\begin{aligned} U_d(\phi ):=\sum _{i=1}^R \left( \phi _a^i \phi _a^i \right) ^3. \end{aligned}$$
(7)

The last one is the diagonal part of the sum in \(U(\phi )\) in (2). Since these observables are some parts contained in the exponent of (1), they can be implemented in the numerical simulations with little additional computational costs.
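To make the remark on computational cost concrete, here is a minimal Metropolis sketch in which the three observables are read off from the same Gram matrix that is needed for the action. The proposal (a Gaussian shift of the whole matrix), the step size, the burn-in fraction, and all names are assumptions made for this illustration; they are not taken from the simulations reported in this paper.

```python
import numpy as np

def observables(phi):
    """Return (Tr(phi^t phi), U(phi), U_d(phi)) from the R x R Gram matrix."""
    gram = phi.T @ phi                          # gram[i, j] = phi_a^i phi_a^j
    return np.trace(gram), np.sum(gram**3), np.sum(np.diag(gram)**3)

def metropolis(N, R, lam, k, n_steps, step=0.5, seed=0):
    """Sample exp(-lam*U - k*Tr) and average the observables after burn-in."""
    rng = np.random.default_rng(seed)
    phi = np.zeros((N, R))
    tr, u, ud = observables(phi)
    s = lam * u + k * tr                        # current action
    burn_in = n_steps // 4                      # assumed burn-in fraction
    sums = np.zeros(3)
    accepted = 0
    for it in range(n_steps):
        trial = phi + step * rng.normal(size=phi.shape)
        obs_trial = observables(trial)
        s_trial = lam * obs_trial[1] + k * obs_trial[0]
        # Standard Metropolis acceptance with probability min(1, exp(-dS)).
        if rng.random() < np.exp(min(0.0, s - s_trial)):
            phi, s = trial, s_trial
            tr, u, ud = obs_trial
            accepted += 1
        if it >= burn_in:
            sums += (tr, u, ud)
    means = sums / (n_steps - burn_in)
    return {"Tr": means[0], "U": means[1], "U_d": means[2],
            "acceptance": accepted / n_steps}

# Free case lam = 0: the exact Gaussian value is <Tr> = N R / (2 k).
print(metropolis(N=2, R=2, lam=0.0, k=1.0, n_steps=40000, seed=1))
```

The free case \(\lambda =0\) provides a simple sanity check, since the Gaussian integral gives \(\langle \hbox {Tr}(\phi ^t \phi ) \rangle =NR/(2k)\) exactly.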

To compute the expectation values of these observables, it is convenient to extend (1) by introducing the coupling constant \(\lambda _d\) conjugate to \(U_d(\phi )\) as

$$\begin{aligned} Z_{N,R}(\lambda ,\lambda _d,k):=\int _{{\mathbb R}^{NR}} d\phi \ e^{-\lambda U(\phi )-\lambda _d U_d(\phi )- k \mathrm{Tr}(\phi ^t \phi )}. \end{aligned}$$
(8)

Then the expectation values can respectively be expressed as

$$\begin{aligned} \begin{aligned}&\langle \hbox {Tr}(\phi ^t \phi ) \rangle =\frac{\partial }{\partial k} F_{N,R}(\lambda ,\lambda _d=0,k), \\&\langle U(\phi ) \rangle =\frac{\partial }{\partial \lambda } F_{N,R}(\lambda ,\lambda _d=0,k), \\&\langle U_d(\phi ) \rangle =\left. \frac{\partial }{\partial \lambda _d} F_{N,R}(\lambda ,\lambda _d,k)\right| _{\lambda _d=0}, \end{aligned} \end{aligned}$$
(9)

where \(F_{N,R}(\lambda ,\lambda _d,k):=-\log Z_{N,R}(\lambda ,\lambda _d,k)\), which is the free energy of the model. Here we have put \(\lambda _d=0\) at last, since our interest is in (1) corresponding to \(\lambda _d=0\) of (8).

To compute the partition function (8), it is convenient to first separate the radial and the angular part as in (3):

$$\begin{aligned} Z_{N,R}(\lambda ,\lambda _d,k)= V_{NR-1} \int _0^\infty dr\ r^{NR-1} f_{N,R,\lambda ,\lambda _d}(r^6)\, e^{-k r^2}, \end{aligned}$$
(10)

where

$$\begin{aligned} f_{N,R,\lambda ,\lambda _d}(t):=\frac{1}{V_{NR-1}} \int _{S^{NR-1}}d\tilde{\phi } \ e^{-\lambda \, t\, U(\tilde{\phi })-\lambda _d\, t \, U_d(\tilde{\phi })}. \end{aligned}$$
(11)

The function \(f_{N,R,\lambda ,\lambda _d}(t)\) has properties similar to those of \(f_{N,R}(t)\) explained in Sect. 2: it is an entire function of t; for \(\lambda >0, \lambda _d\ge 0\), it is positive and decreasing in t for real t; and \(f_{N,R,\lambda ,\lambda _d}(0)=1\). With \(f_{N,R,\lambda ,\lambda _d}(t)\), the observables (9) can be expressed by

$$\begin{aligned} \begin{aligned}&\langle \hbox {Tr}(\phi ^t \phi ) \rangle =\frac{1}{{\mathcal{N}}_f}\int _0^\infty dr\ r^{NR+1} f_{N,R,\lambda ,0}(r^6)\, e^{-k r^2},\\&\langle U(\phi ) \rangle =-\frac{1}{{\mathcal{N}}_f}\int _0^\infty dr\ r^{NR-1} \frac{\partial }{\partial \lambda } f_{N,R,\lambda ,0}(r^6)\, e^{-k r^2}, \\&\langle U_d(\phi ) \rangle =-\frac{1}{{\mathcal{N}}_f}\int _0^\infty dr\ r^{NR-1} \left. \frac{\partial }{\partial \lambda _d}f_{N,R,\lambda ,\lambda _d}( r^6)\right| _{\lambda _d=0} \, e^{-k r^2}, \end{aligned} \end{aligned}$$
(12)

where the normalization is given by

$$\begin{aligned} {\mathcal{N}}_f=\int _0^\infty dr\ r^{NR-1} f_{N,R,\lambda ,0}(r^6)\, e^{-k r^2}. \end{aligned}$$
(13)

From the leading order computation, which is detailed in Sect. 4, we obtain

$$\begin{aligned} f^{leading}_{N,R,\lambda ,\lambda _d}(t)= h_{N,R}(\lambda R t +\lambda _d t) \ h_{N,R}(\lambda _d t )^{R-1} \end{aligned}$$
(14)

with

$$\begin{aligned} h_{N,R}(x)\!:=\!\left( 1+12 \gamma _3 x \right) ^{-\frac{ N(N-1)(N\!+\!4)}{12}} \left( 1+6(N\!+\!4) \gamma _{3} x \right) ^{-\frac{N}{2}}, \end{aligned}$$
(15)

where

$$\begin{aligned} \gamma _n:=\frac{\Gamma \left( \frac{NR}{2}\right) }{2^n\Gamma \left( \frac{NR}{2}+n\right) }. \end{aligned}$$
(16)

When \(\lambda _d=0\) is taken, (14) becomes

$$\begin{aligned} \begin{aligned}&f^{leading}_{N,R,\lambda ,0}(t)\\&=\left( 1+12 \gamma _3 \lambda R t \right) ^{-\frac{ N(N-1)(N+4)}{12}} \left( 1+6(N+4) \gamma _{3} \lambda R t \right) ^{-\frac{N}{2}}. \end{aligned} \end{aligned}$$
(17)

This could be expected to agree with (5), but there is a slight difference coming from (16) with \(n=3\). This difference originates from the fact that the strategy of the computation we take in Sect. 4 is different from the one taken previously in Ref. [21], and therefore the expressions of the leading orders are slightly different from each other. However, they agree with each other in the leading order of 1/R, since \(\gamma _n\sim (NR)^{-n}\) for \(NR \gg n\), as expected.
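The size of the difference between \(\gamma _3\) and its large-NR limit can be checked directly. The sketch below evaluates (16) with log-gamma functions to avoid overflow and prints the ratio \(\gamma _n (NR)^n\), which tends to 1 for \(NR \gg n\). The function names and the sample values of N and R are assumptions for illustration.

```python
import math

def gamma_n(N, R, n):
    """gamma_n of (16), evaluated with log-gamma for numerical stability."""
    x = 0.5 * N * R
    return math.exp(math.lgamma(x) - math.lgamma(x + n)) / 2.0**n

def ratio_to_asymptotic(N, R, n):
    """gamma_n * (N R)^n, which approaches 1 when N R >> n."""
    return gamma_n(N, R, n) * (N * R) ** n

for N, R in ((3, 4), (10, 50), (30, 450)):
    print(N, R, ratio_to_asymptotic(N, R, 3))
```

For \(n=3\) the ratio equals \(1/\big ((1+2/NR)(1+4/NR)\big )\), so the two leading-order expressions differ noticeably only when NR is not large.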

By putting these expressions into (12), we obtain

$$\begin{aligned}&\langle \hbox {Tr}(\phi ^t \phi ) \rangle _{leading}=\frac{1}{{\mathcal{N}}_f}\int _0^\infty dr\ r^{NR+1} f_{N,R,\lambda ,0}^{leading}(r^6)\, e^{-k r^2},\nonumber \\&\langle U(\phi ) \rangle _{leading} =-\frac{R}{{\mathcal{N}}_f}\int _0^\infty dr\ r^{NR+5} \nonumber \\&\quad \times \frac{h_{N,R}'(\lambda R r^6)}{h_{N,R}(\lambda R r^6)} f_{N,R,\lambda ,0}^{leading}( r^6)\, e^{-k r^2}, \nonumber \\&\langle U_d(\phi ) \rangle _{leading} =-\frac{1}{{\mathcal{N}}_f}\int _0^\infty dr\ r^{NR+5}\nonumber \\&\quad \times \left( \frac{h_{N,R}'(\lambda R r^6)}{h_{N,R}(\lambda R r^6)} +(R-1)\frac{h_{N,R}'(0)}{h_{N,R}(0)} \right) \nonumber \\&\quad \cdot f_{N,R,\lambda ,0}^{leading}( r^6)\, e^{-k r^2}, \end{aligned}$$
(18)

where \({\mathcal{N}}_f\) is given by (13) with \(f_{N,R,\lambda ,0}^{leading}(t)\). The actual values of these integrations can be obtained numerically.
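One possible way to carry out these one-dimensional integrations is sketched below for \(\langle \hbox {Tr}(\phi ^t \phi ) \rangle _{leading}\) and \(\langle U(\phi ) \rangle _{leading}\). The integrand is handled in logarithmic form to avoid overflow from \(r^{NR-1}\), and a uniform grid is used so that the grid spacing cancels in the ratios. The grid size, the cutoff `r_max`, and all names are assumptions made for this illustration. As a consistency check, integrating \(\frac{d}{dr}\left[ r^{NR} f(\lambda r^6) e^{-kr^2}\right] \) over r gives \(NR=2k\langle \hbox {Tr}(\phi ^t \phi ) \rangle +6\lambda \langle U(\phi ) \rangle \), which holds for any f depending on \(\lambda r^6\) and therefore also in the leading order.

```python
import math
import numpy as np

def _consts(N, R):
    """Return gamma_3 of (16) and the two exponents appearing in (15)."""
    x = 0.5 * N * R
    g3 = math.exp(math.lgamma(x) - math.lgamma(x + 3)) / 8.0
    return g3, N * (N - 1) * (N + 4) / 12.0, 0.5 * N

def log_h(N, R, x):
    """log h_{N,R}(x) from (15)."""
    g3, a, b = _consts(N, R)
    return -a * np.log1p(12.0 * g3 * x) - b * np.log1p(6.0 * (N + 4) * g3 * x)

def dlog_h(N, R, x):
    """h'_{N,R}(x) / h_{N,R}(x)."""
    g3, a, b = _consts(N, R)
    return (-a * 12.0 * g3 / (1.0 + 12.0 * g3 * x)
            - b * 6.0 * (N + 4) * g3 / (1.0 + 6.0 * (N + 4) * g3 * x))

def leading_observables(N, R, lam, k, n_grid=20001):
    """Return (<Tr>, <U>) of (18) by summation on a uniform radial grid."""
    NR = N * R
    r_max = math.sqrt(NR / (2.0 * k)) + 10.0 / math.sqrt(k)   # assumed cutoff
    r = np.linspace(r_max / n_grid, r_max, n_grid)
    x = lam * R * r**6
    log_w = (NR - 1) * np.log(r) + log_h(N, R, x) - k * r**2
    w = np.exp(log_w - log_w.max())             # rescaled weight, overflow-safe
    norm = w.sum()
    tr = float((w * r**2).sum() / norm)
    u = float(-R * (w * r**6 * dlog_h(N, R, x)).sum() / norm)
    return tr, u

N, R, lam, k = 4, 8, 2.0, 1.0                   # illustrative values (assumption)
tr, u = leading_observables(N, R, lam, k)
print(tr, u, 2 * k * tr + 6 * lam * u, N * R)
```

The last two printed numbers should coincide up to discretization error, which gives a quick test that the grid and the cutoff are adequate for the chosen parameters.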

4 Computations of \(f_{N,R,\lambda ,\lambda _d} (t)\) in the leading and the next-leading orders

In this section, we will compute \(f_{N,R,\lambda ,\lambda _d}(t)\) in the leading order of t, and will also show the result in the next-leading order, whose detailed computations are given in “Appendix D”. In Ref. [21], the computation in the leading order of 1/R has been performed by using the Feynman diagrams for the \(\phi _a^i\) variables. In this paper, however, we will take a different strategy. This is because the new strategy makes more transparent the rather complicated counting of combinatorics performed in Ref. [21], and makes it straightforward to include the extra coupling \(\lambda _d U_d(\phi )\) and also to consider the next order in t. For \(\lambda _d=0\), the new strategy gives essentially the same result as Ref. [21] in the leading order, as commented below (17).

Let us define

$$\begin{aligned} f_{N,R,\Lambda _{ij}}(t):=\frac{1}{V_{NR-1}}\int _{S^{NR-1}} d\tilde{\phi } \ e^{-t \, U_{\Lambda _{ij}}(\tilde{\phi })}, \end{aligned}$$
(19)

where

$$\begin{aligned} U_{\Lambda _{ij}}(\tilde{\phi }):=\sum _{i,j=1}^R \Lambda _{ij} \left( \tilde{\phi }_a^i \tilde{\phi }_a^j\right) ^3 \end{aligned}$$
(20)

with a real symmetric matrix \(\Lambda _{ij}\). The eigenvalues of the matrix \(\Lambda _{ij}\) are assumed to be non-negative for the convergence of the corresponding partition function. The \(f_{N,R,\Lambda _{ij}}(t)\) also has the same nice properties as \(f_{N,R}(t)\) that it is an entire function, which has a Taylor series expansion in t with the infinite convergence radius around \(t=0\), and is a decreasing positive function of t for real t and \(\Lambda \ne 0\) with \(f_{N,R,\Lambda _{ij}}(0)=1\). If we take \(\Lambda _{ij}=\lambda +\lambda _d \delta _{ij}\), (19) gives \(f_{N,R,\lambda ,\lambda _d}(t)\) in (11). By introducing a new variable \(P_{abc}^i\ (a,b,c=1,2,\ldots ,N,\ i=1,2,\ldots ,R)\), which is symmetric for the lower indices, one can rewrite (19) as

$$\begin{aligned} \begin{aligned}&f_{N,R,\Lambda _{ij}}(t) =const. \int _{S^{NR-1}} d\tilde{\phi } \int _{-\infty }^\infty \prod _{i=1}^R \prod _{a\le b \le c=1}^N dP_{abc}^i \\&\quad \exp \left( -\sum _{i=1}^R P_{abc}^i P_{abc}^i + 2 I \sqrt{t} \sum _{i,j=1}^R \tilde{\Lambda }_{ij} P^i_{abc} \tilde{\phi }_a^j \tilde{\phi }_b^j \tilde{\phi }_c^j \right) , \end{aligned} \end{aligned}$$
(21)

where I denotes the imaginary unit and \(\tilde{\Lambda }_{ij}\) is a symmetric matrix satisfying (Footnote 5)

$$\begin{aligned} \Lambda _{ij}=\sum _{k=1}^R \tilde{\Lambda }_{ik} \tilde{\Lambda }_{kj}. \end{aligned}$$
(22)

The constant prefactor in (21) can be determined by \(f_{N,R,\Lambda _{ij}}(0)=1\).

To compute (21), let us first integrate over \(\tilde{\phi }\). This change of the order of the integrations can be done, because the integration over \(P^i_{abc}\) with the infinite integration region converges uniformly for any \(\tilde{\phi } \in S^{NR-1}\). Then our task is to compute

$$\begin{aligned} \left\langle e^{ 2 I \sqrt{t} P\tilde{\phi }^3} \right\rangle _{\tilde{\phi }} = \frac{1}{V_{NR-1}} \int _{S^{NR-1}} d\tilde{\phi } \ e^{2 I \sqrt{t}\, P\tilde{\phi }^3}, \end{aligned}$$
(23)

where we have used a short-hand notation,

$$\begin{aligned} P\tilde{\phi }^3:= \sum _{i,j=1}^R \tilde{\Lambda }_{ij} P^i_{abc} \tilde{\phi }_a^j \tilde{\phi }_b^j \tilde{\phi }_c^j, \end{aligned}$$
(24)

and \(\langle \cdot \rangle _{\tilde{\phi }}\) denotes the expectation value for the uniform probability distribution on the unit sphere \(S^{NR-1}\).

For further computations, let us introduce the cumulants \(\langle {\mathcal{O}}^n \rangle ^c\) defined by (Footnote 6)

$$\begin{aligned} \log \langle e^{s \mathcal{O}} \rangle =\sum _{n=1}^\infty \frac{s^n}{n!} \langle {\mathcal{O}}^n \rangle ^c \end{aligned}$$
(25)

with arbitrary s. Then (21) can be rewritten as

$$\begin{aligned} f_{N,R,\Lambda _{ij}}(t)=const. \int _{-\infty }^\infty \prod _{i=1}^R \prod _{a\le b \le c=1}^N dP_{abc}^i \ e^{-S_{eff}(P)} \end{aligned}$$
(26)

with

$$\begin{aligned} S_{eff}(P)=\sum _{i=1}^R P_{abc}^i P_{abc}^i- \sum _{n=1}^\infty \frac{(2 I \sqrt{t})^n}{n!} \langle ( P\tilde{\phi }^3)^n \rangle _{\tilde{\phi }}^c, \end{aligned}$$
(27)

where \(S_{eff}(P)\) can be regarded as an effective action of \(P^i_{abc}\) after integrating out \(\tilde{\phi }\), and it is given in terms of the perturbative expansion in t. Due to the form (24), the n-th order cumulant gives the n-th order interaction term of \(P^i_{abc}\), and all the terms with odd n vanish because of the obvious invariance of the integration over \(\tilde{\phi }\) under \(\tilde{\phi } \rightarrow -\tilde{\phi }\).

Let us compute the quadratic term with \(n=2\) in (27). Since \(\langle P\tilde{\phi }^3\rangle _{\tilde{\phi }}=0\), we obtain

$$\begin{aligned} \begin{aligned}&\langle (P\tilde{\phi }^3)^2\rangle _{\tilde{\phi }}^c= \langle (P\tilde{\phi }^3)^2\rangle _{\tilde{\phi }}\\&\quad = \sum _{i,j,i',j'=1}^R \tilde{\Lambda }_{ij} \tilde{\Lambda }_{i'j'} P^i_{abc} P^{i'}_{a'b'c'} \langle \tilde{\phi }_a^j \tilde{\phi }_b^j \tilde{\phi }_c^j \tilde{\phi }_{a'}^{j'} \tilde{\phi }_{b'}^{j'} \tilde{\phi }_{c'}^{j'} \rangle _{\tilde{\phi }} \end{aligned} \end{aligned}$$
(28)

The integral over \(\tilde{\phi }\) on \(S^{NR-1}\), which is a curved compact space, is not easy to handle, so we use a formula which maps this integration to the Gaussian integral:

$$\begin{aligned} \langle \tilde{\phi }_{a_1}^{i_1}\tilde{\phi }_{a_2}^{i_2}\cdots \tilde{\phi }_{a_m}^{i_m} \rangle _{\tilde{\phi }}= (2\beta )^{\frac{m}{2}}\gamma _{\frac{m}{2}} \langle \phi _{a_1}^{i_1} \phi _{a_2}^{i_2}\cdots \phi _{a_m}^{i_m} \rangle _\phi , \end{aligned}$$
(29)

where \(\gamma _n\) is defined in (16), and

$$\begin{aligned} \begin{aligned}&\langle \phi _{a_1}^{i_1} \phi _{a_2}^{i_2}\cdots \phi _{a_m}^{i_m} \rangle _\phi \\&\quad := \frac{1}{\int _{{\mathbb R}^{NR}} d\phi \ e^{-\beta \mathrm{Tr}\phi ^t \phi }} \int _{{\mathbb R}^{NR}} d\phi \ \phi _{a_1}^{i_1} \phi _{a_2}^{i_2}\cdots \phi _{a_m}^{i_m} \ e^{-\beta \mathrm{Tr} \phi ^t \phi }. \end{aligned} \end{aligned}$$
(30)

Here \(\beta \) is a sort of dummy variable, which can be chosen freely with \(\beta >0\), and does not appear in the final expressions. In fact, as shown below, the factor \((2 \beta )^{\frac{m}{2}}\) in (29) is exactly canceled by the same factor from the Wick contraction (31). The formula (29) was previously used in [21], and is proven in “Appendix C” so that the present paper is self-contained.
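The formula (29) can be spot-checked for low moments. For a single component, the Gaussian averages with weight \(e^{-\beta \phi ^2}\) are \(\langle \phi ^2\rangle _\phi =1/(2\beta )\) and \(\langle \phi ^4\rangle _\phi =3/(2\beta )^2\), so (29) predicts \(\langle \tilde{\phi }^2\rangle _{\tilde{\phi }}=\gamma _1\) and \(\langle \tilde{\phi }^4\rangle _{\tilde{\phi }}=3\gamma _2\), independently of \(\beta \). The sketch below compares these with direct sampling on the sphere; the dimension d stands for NR, and the names, the sample size, and the choice \(d=6\) are assumptions for illustration.

```python
import math
import numpy as np

def gamma_n(d, n):
    """gamma_n of (16) with N R replaced by the total dimension d."""
    return math.exp(math.lgamma(0.5 * d) - math.lgamma(0.5 * d + n)) / 2.0**n

def sphere_moment_mc(d, power, n_samples=200000, seed=0):
    """Monte Carlo average of (first component)^power on the unit sphere S^{d-1}."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n_samples, d))
    x /= np.linalg.norm(x, axis=1, keepdims=True)   # uniform points on the sphere
    return float(np.mean(x[:, 0] ** power))

d = 6                                   # plays the role of N R (assumption)
second_from_29 = gamma_n(d, 1) * 1.0    # (2 beta) gamma_1 * 1/(2 beta)
fourth_from_29 = gamma_n(d, 2) * 3.0    # (2 beta)^2 gamma_2 * 3/(2 beta)^2
print(second_from_29, sphere_moment_mc(d, 2))
print(fourth_from_29, sphere_moment_mc(d, 4))
```

The predicted values reduce to the familiar sphere moments \(1/d\) and \(3/(d(d+2))\), and the sampled values should match them within statistical error.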

Fig. 1 Left: the diagram for the interaction vertex \(\sum _{i,j=1}^R \tilde{\Lambda }_{ij} P^i_{abc} \phi _a^j \phi _b^j \phi _c^j\). Right: the diagrams obtained by evaluating (28) through (29)

Through the replacement (29), (28) can be computed by the standard procedure using the Wick theorem and Feynman diagrams. A Wick contraction is performed by

$$\begin{aligned} \langle \phi _a^i \phi _b^j\rangle _\phi =\frac{1}{2\beta } \delta _{ab}\delta ^{ij}, \end{aligned}$$
(31)

which can be derived by an explicit computation of (30). The Feynman diagram for the vertex \(\sum _{i,j=1}^R \tilde{\Lambda }_{ij} P^i_{abc} \phi _a^j \phi _b^j \phi _c^j\) is shown in the left figure of Fig. 1. Each leg is supposed to bring the two indices of \(\phi _a^i\), and a Wick contraction connects two legs with the identification of their indices as in (31). A caution is that each leg on one vertex brings independent lower indices, but a common upper index.

Now let us apply the Wick contractions to what is obtained by replacing (28) with (29). We find the two diagrams shown in the right figure of Fig. 1. Their degeneracy factors are 6 and 9, respectively, by counting the numbers of the ways to connect the legs of the two vertices. Since j and \(j'\) in (28) are identified by the Wick contractions, we also get \(\sum _{j=1}^R \tilde{\Lambda }_{i j} \tilde{\Lambda }_{i'j}=\Lambda _{ii'}\) as a factor (see (22)). Thus, we obtain

$$\begin{aligned} \langle (P\tilde{\phi }^3)^2 \rangle ^c_{\tilde{\phi }}=\gamma _3 \sum _{i,j=1}^R \Lambda _{ij} \left( 6 P^i_{abc} P^j_{abc}+9 P^i_{aab}P^j_{bcc} \right) , \end{aligned}$$
(32)

where one notices that the factor \((2\beta )^3\) coming from the replacement (29) is exactly canceled by the factors of the three Wick contractions (31) performed for the evaluation. Putting the result (32) into (27), one obtains the effective action in the second order of \(P_{abc}^i\) as

$$\begin{aligned} \begin{aligned}&S_{eff}^{(2)}(P) =\sum _{i,j=1}^R\\&\quad \times \left( \delta _{ij} P^i_{abc} P^j_{abc} +2 \gamma _3 \Lambda _{ij} t \left( 6 P^i_{abc} P^j_{abc}+9 P^i_{aab}P^j_{bcc} \right) \right) . \end{aligned} \end{aligned}$$
(33)
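The degeneracy factors 6 and 9 entering (32) and (33) can be confirmed by brute-force enumeration. The sketch below lists all complete pairings of the six legs of two three-legged vertices and classifies them by the number of contractions that link the two vertices: three links correspond to the \(P^i_{abc}P^j_{abc}\) diagram and one link to the \(P^i_{aab}P^j_{bcc}\) diagram. The helper names are hypothetical and introduced only for this illustration.

```python
def perfect_matchings(items):
    """Yield all ways of pairing up the elements of a list of even length."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for matching in perfect_matchings(remaining):
            yield [(first, partner)] + matching

# Each leg is labeled by (vertex, position on the vertex).
legs = [(vertex, leg) for vertex in (0, 1) for leg in range(3)]

counts = {}
for matching in perfect_matchings(legs):
    # Number of Wick contractions connecting a leg of vertex 0 to vertex 1.
    links = sum(1 for p, q in matching if p[0] != q[0])
    counts[links] = counts.get(links, 0) + 1

print(counts)   # number of pairings with 3 links and with 1 link
```

The two classes exhaust the \(5!!=15\) pairings of six legs, since the number of linking contractions between two three-legged vertices is necessarily odd.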

The computation of (26) has now been reduced to diagonalizing the quadratic expression (33). The upper and lower indices can independently be diagonalized, because (33) has the form of the direct product with respect to these indices. More explicitly, since \(\Lambda _{ij}\) is real and symmetric, we can consider the following decomposition into the eigenspaces:

$$\begin{aligned} \Lambda _{ij}=\sum _{\lambda _{ev}} \lambda _{ev}\, v^{\lambda _{ev}}_i v^{\lambda _{ev}}_j, \end{aligned}$$
(34)

where \(v^{\lambda _{ev}}\) are the orthonormal eigenvectors, and the sum is over all the eigenvalues (with their degeneracies). By putting this and \(\sum _{\lambda _{ev}} v^{\lambda _{ev}}_i v^{\lambda _{ev}}_j=\delta _{ij}\) into (33), we obtain a decomposition,

$$\begin{aligned} \begin{aligned}&S_{eff}^{(2)}(P) =\sum _{\lambda _{ev}} \\&\quad \times \left( P^{\lambda _{ev}}_{abc} P^{\lambda _{ev}}_{abc} +2 \gamma _3 \lambda _{ev} t \left( 6 P^{\lambda _{ev}}_{abc} P^{\lambda _{ev}}_{abc} +9 P^{\lambda _{ev}}_{aab}P^{\lambda _{ev}}_{bcc} \right) \right) , \end{aligned} \end{aligned}$$
(35)

where \(P^{\lambda _{ev}}_{abc}:=\sum _{i=1}^R v^{\lambda _{ev}}_iP^{i}_{abc}\).

Next let us diagonalize the lower index part in (35),

$$\begin{aligned} P_{abc} P_{abc}+ 2 \gamma _3\lambda _{ev}\,t \left( 6 P_{abc} P_{abc}+9 P_{aab}P_{bcc} \right) , \end{aligned}$$
(36)

where for brevity we omit \(\lambda _{ev}\) from \(P^{\lambda _{ev}}_{abc}\). Let us separate \(P_{abc}\) into the tensor part \(P^T_{abc}\) and the vector part \(P^V_{abc}\), which are defined by (Footnote 7)

$$\begin{aligned} \begin{aligned} P_{abc}&=P^T_{abc}+P^V_{abc}, \\ P^V_{abc}&=\frac{1}{N+2} \left( P_{add}\delta _{bc}+P_{bdd}\delta _{ca}+P_{cdd}\delta _{ab} \right) . \end{aligned} \end{aligned}$$
(37)

It is easy to check that \(P^T_{abc}P^V_{abc}=0\) and \(P^V_{abc} P^V_{abc}=\frac{3}{N+2} P_{abb}P_{acc}\). In particular, the former identity implies that \(P^T_{abc}\) and \(P^V_{abc}\) are independent degrees of freedom. Then, by using (37) and the identities above, (36) can be expressed as

$$\begin{aligned} ( 1\!+\!12\gamma _3\lambda _{ev}\, t) P^T_{abc} P^T_{abc}\!+\! \left( 1\!+\!6(N+4) \gamma _3 \lambda _{ev}\, t\right) P^V_{abc} P^V_{abc}. \end{aligned}$$
(38)
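The identities used to pass from (36) to (38) can be verified numerically on a random symmetric tensor. In the sketch below, the constant `c` stands for the combination \(\gamma _3\lambda _{ev}\, t\); its value, the tensor size, and the helper names are assumptions for illustration.

```python
import itertools
import numpy as np

def symmetrize(p):
    """Average a rank-3 array over all permutations of its indices."""
    perms = list(itertools.permutations(range(3)))
    return sum(np.transpose(p, axes) for axes in perms) / len(perms)

def vector_part(p):
    """P^V of (37), built from the trace vector v_a = P_{add}."""
    n = p.shape[0]
    v = np.einsum('add->a', p)
    eye = np.eye(n)
    return (np.einsum('a,bc->abc', v, eye)
            + np.einsum('b,ca->abc', v, eye)
            + np.einsum('c,ab->abc', v, eye)) / (n + 2)

rng = np.random.default_rng(1)
N = 4                                   # illustrative size (assumption)
P = symmetrize(rng.normal(size=(N, N, N)))
PV = vector_part(P)
PT = P - PV
v = np.einsum('add->a', P)

c = 0.37                                # stands for gamma_3 * lambda_ev * t
expr_36 = np.sum(P * P) + 2 * c * (6 * np.sum(P * P) + 9 * np.dot(v, v))
expr_38 = (1 + 12 * c) * np.sum(PT * PT) + (1 + 6 * (N + 4) * c) * np.sum(PV * PV)

print(np.sum(PT * PV), np.sum(PV * PV), 3.0 / (N + 2) * np.dot(v, v))
print(expr_36, expr_38)
```

The first printed number vanishes up to rounding, the next two coincide, and the two expressions on the last line agree, which reproduces the diagonal form (38).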

The numbers of independent components contained in \(P^T_{abc}\) and \(P^V_{abc}\) are \(\# P^T=N(N+1)(N+2)/6-N=N(N+4)(N-1)/6\) and \(\# P^V=N\), respectively. Therefore, by putting this diagonal form into (26) and integrating over \(P^V\) and \(P^T\), we finally obtain the expression of \(f_{N,R,\Lambda _{ij}}(t)\) from the quadratic order \(S_{eff}^{(2)}(P)\) as

$$\begin{aligned}&f^{(2)}_{N,R,\Lambda _{ij}}(t)=const. \int dP e^{-S^{(2)}_{eff}(P)} \nonumber \\&\quad =\prod _{\lambda _{ev}} ( 1\!+\!12\gamma _3 \lambda _{ev} t)^{-\frac{N(N+4)(N-1)}{12}} \left( 1\!+\!6(N\!+\!4)\gamma _3 \lambda _{ev}\, t\right) ^{-\frac{N}{2}} \nonumber \\&\quad =\prod _{\lambda _{ev}} h_{N,R}(\lambda _{ev}t), \end{aligned}$$
(39)

where we have determined the overall factor by requiring \(f^{(2)}_{N,R,\Lambda _{ij}}(0)=1\), the product is over all the eigenvalues of the matrix \(\Lambda _{ij}\), and \(h_{N,R}(x)\) is defined in (15).

For the computation of the observables discussed in Sect. 3, we consider \(\Lambda _{ij}=\lambda +\lambda _d \delta _{ij}\). In this case, the matrix \(\Lambda _{ij}\) has one eigenvalue \(\lambda R+\lambda _d\) with the eigenvector \((1,1,\ldots ,1)\), and the eigenvalue \(\lambda _d\) with degeneracy \(R-1\) with any of the vectors transverse to \((1,1,\ldots ,1)\) as the eigenvectors. Therefore, from (39), we obtain

$$\begin{aligned} f^{(2)}_{N,R,\lambda +\lambda _d \delta _{ij}}(t)=h_{N,R}(\lambda R t +\lambda _d t) \ h_{N,R}(\lambda _d t )^{R-1}. \end{aligned}$$
(40)

This is the leading-order result shown in (14).
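The step from the eigenvalue product (39) to the closed form (40) can be illustrated numerically. The sketch below is only an illustration: the function names are hypothetical, `h` is written in the form read off from (39), and \(\gamma _3\) is passed in as a free parameter because its definition (16) is not reproduced in this section. The numerical values are arbitrary.

```python
import numpy as np


def h(x, n, gamma3):
    """Factor appearing in (39): one eigenvalue's contribution, with x = lambda_ev * t."""
    a0 = n * (n + 4) * (n - 1) / 12.0
    b0 = n / 2.0
    return (1 + 12 * gamma3 * x) ** (-a0) * (1 + 6 * (n + 4) * gamma3 * x) ** (-b0)


def f2_from_eigenvalues(Lambda, t, n, gamma3):
    """Evaluate (39) as a product over the eigenvalues of the symmetric matrix Lambda."""
    eigenvalues = np.linalg.eigvalsh(Lambda)
    return np.prod(h(eigenvalues * t, n, gamma3))


def f2_closed_form(lam, lam_d, r_rep, t, n, gamma3):
    """Evaluate (40) for Lambda_ij = lam + lam_d * delta_ij."""
    return h((lam * r_rep + lam_d) * t, n, gamma3) * h(lam_d * t, n, gamma3) ** (r_rep - 1)


# Arbitrary illustrative values (assumptions, not taken from the paper).
n, r_rep, lam, lam_d, t, gamma3 = 4, 6, 1.0, 0.3, 0.5, 1e-3
Lambda = lam * np.ones((r_rep, r_rep)) + lam_d * np.eye(r_rep)

product_value = f2_from_eigenvalues(Lambda, t, n, gamma3)
closed_value = f2_closed_form(lam, lam_d, r_rep, t, n, gamma3)
spectrum = np.sort(np.linalg.eigvalsh(Lambda))
```

The spectrum consists of \(R-1\) copies of \(\lambda _d\) and a single \(\lambda R+\lambda _d\), so the two evaluations should agree up to rounding.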

As we will see later in Sect. 6, there are some deviations between the leading-order result above and the numerical simulations. To see how the situation changes by adding some corrections, we have also computed the next-leading order. The details of the computation are given in “Appendix D”. The final result is

$$\begin{aligned} f^{next{-}leading}_{N,R,\lambda ,\lambda _d}(t) =f^{leading}_{N,R,\lambda ,\lambda _d}(t) \left( 1 - \langle S_{eff}^{(4)}(P) \rangle _P \right) , \end{aligned}$$
(41)

where \(f^{next{-}leading}_{N,R,\lambda ,\lambda _d}(t)\) is the sum of the leading and the next-leading orders, and

$$\begin{aligned}&\langle S_{eff}^{(4)}(P) \rangle _P =-\frac{4 t^2}{4!} \Big [ \gamma _6 R G_1 (x_1,y_1)\nonumber \\&\quad -\,3 (\gamma _3^2-\gamma _6) \big (G_2(x_2,y_2)+ (\lambda R+\lambda _d)^2 G_3(x_3,y_3)\nonumber \\&\quad +\,(R-1) \lambda _d^2 G_3(x_4,y_4) \big ) \Big ] \end{aligned}$$
(42)

with the definitions of \(G_i,x_i,y_i\) given in (D.21) through (D.31).

5 Saddle point analysis in the leading order

The integral expressions of the observables (18) in the leading order do not seem to admit closed-form expressions in terms of known functions. Therefore, one way to obtain their explicit values is to perform the integrations numerically. This is done in Sect. 6 for comparison with the Monte Carlo results. In this section, on the other hand, we apply the saddle point approximation to the integrals to obtain a qualitative global picture of the phase structure of the model.

To discuss the saddle point approximation of the partition function (3), let us consider the minus of the logarithm of the integrand, which is given byFootnote 8

$$\begin{aligned} \begin{aligned}&{\mathcal{F}}_{N,R} (\lambda , k,r) \!\equiv \! f_0\!-\!(NR-1)\log r \!-\! \log f_{N,R}(\lambda r^6) \!+\!k r^2. \end{aligned} \end{aligned}$$
(43)

with \(f_0=-\log V_{NR-1}\). A saddle point \(r=r_*\) of the integral is determined by

$$\begin{aligned} \left. \frac{\partial }{\partial r} {\mathcal{F}}_{N,R} (\lambda , k,r) \right| _{r=r_*}=0 \end{aligned}$$
(44)

with the second derivative being positive at that point. For \(f_{N,R}\), we take (5), which is the expression in the leading order of 1/R:

$$\begin{aligned} \begin{aligned}&{\mathcal{F}}^{1/R,leading}_{N,R}(\lambda ,k,r):= f_0 -(NR-1) \log r\\&\quad + \, A_0 \log (1+A_1 r^6 )+B_0 \log (1+ B_1 r^6)+k r^2 \end{aligned} \end{aligned}$$
(45)

with

$$\begin{aligned} \begin{aligned}&A_0=\frac{N(N-1)(N+4)}{12}, \quad A_1=\frac{12 \lambda }{N^3 R^2}, \\&B_0=\frac{N}{2}, \quad B_1=\frac{6(N+4) \lambda }{N^3 R^2}. \end{aligned} \end{aligned}$$
(46)

In the saddle point approximation, the free energy of the model is approximated by

$$\begin{aligned} F^{free}_{N,R}(\lambda ,k)\sim {\mathcal{F}}^{1/R,leading}_{N,R}(\lambda ,k,r_*)+\hbox {unimportant terms}, \end{aligned}$$
(47)

where the unimportant terms contain \(\log V_{NR-1}\), which does not depend on \(\lambda \) or k, and also some lower order terms in N, which will be discussed in the last paragraph of this section.

Let us first show that there exists a unique solution to the saddle point Eq. (44), for the leading order expression (45), in the integration region \(r\ge 0\). To see this, it is convenient to use a new parametrization of R in terms of \(\alpha \) as \(R=R_c (1+\alpha )\) with \(R_c=(N+1)(N+2)/2\) and \(-1<\alpha \). Then, by noting \(N R_c=6 (A_0+B_0)\), the saddle point equation (44) with (45) can be written as

$$\begin{aligned} N R_c \alpha -1 +\frac{6 A_0}{1+A_1 r_*^6}+\frac{6 B_0}{1+B_1 r_*^6}=2 k r_*^2. \end{aligned}$$
(48)

The left-hand side is obviously a decreasing function of \(r_*\) with a maximum at \(r_*=0\), while the right-hand side is an increasing function that runs from zero to infinity. Since the maximum of the left-hand side is \(NR_c\alpha -1+6A_0 + 6B_0=NR-1\), which is non-negative, there always exists a unique solution \(r_*\) for \(N,R\ge 1\). Moreover, the solution is smooth: \(r_*\) does not jump discontinuously when the parameters are continuously changed, because the \(r_*\)-dependence of each side changes continuously. This rules out the possibility that the model has a discontinuous phase transition in this treatment.
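Because one side of (48) decreases and the other increases, the root can be bracketed and found by simple bisection. The following Python sketch illustrates this under the assumptions \(k>0\) and \(NR>1\); the function names are hypothetical, the coefficients are those of (46), and the approximate formulas are those of (49). The parameter values at the end are illustrative choices, not a reproduction of any computation in the paper.

```python
def saddle_coefficients(n, r_rep, lam):
    """Coefficients A_0, A_1, B_0, B_1 of (46)."""
    a0 = n * (n - 1) * (n + 4) / 12.0
    a1 = 12.0 * lam / (n**3 * r_rep**2)
    b0 = n / 2.0
    b1 = 6.0 * (n + 4) * lam / (n**3 * r_rep**2)
    return a0, a1, b0, b1


def saddle_residual(r2, n, r_rep, lam, k):
    """Left-hand side minus right-hand side of (48) as a function of r_*^2."""
    a0, a1, b0, b1 = saddle_coefficients(n, r_rep, lam)
    r_c = (n + 1) * (n + 2) / 2.0
    alpha = r_rep / r_c - 1.0
    r6 = r2**3
    lhs = n * r_c * alpha - 1 + 6 * a0 / (1 + a1 * r6) + 6 * b0 / (1 + b1 * r6)
    return lhs - 2 * k * r2


def solve_saddle(n, r_rep, lam, k, iterations=200):
    """Bisection for r_*^2; the residual is positive at 0 and non-positive at hi."""
    lo, hi = 0.0, (n * r_rep - 1) / (2.0 * k)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if saddle_residual(mid, n, r_rep, lam, k) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def approx_saddle(n, r_rep, lam, k):
    """Two-term approximations (49); the alpha < 0 branch needs a positive radicand."""
    a0, a1, _, _ = saddle_coefficients(n, r_rep, lam)
    r_c = (n + 1) * (n + 2) / 2.0
    alpha = r_rep / r_c - 1.0
    if alpha < 0:
        return ((-6 * a0 / (n * r_c * alpha) - 1) / a1) ** (1.0 / 3.0)
    return n * r_c * alpha / (2.0 * k)


# Illustrative parameters: N = 10, lambda = 1, k = 0.1, one R on each side of R_c = 66.
r2_small = solve_saddle(10, 30, 1.0, 0.1)
r2_large = solve_saddle(10, 130, 1.0, 0.1)
approx_small = approx_saddle(10, 30, 1.0, 0.1)
approx_large = approx_saddle(10, 130, 1.0, 0.1)
```

For these values, which lie away from \(\alpha \sim 0\), the bisection result and the two-term approximation are expected to agree to within a few percent.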

To discuss the solution more quantitatively with approximations, let us restrict ourselves to the parameter range of our interest: \(\lambda \sim O(1)\), \(N\gtrsim O(10)\), and \(k\lesssim O(1)\). In addition, for simplicity of the following discussion, let us avoid the region around \(\alpha \sim 0\). By noting that \(NR_c\) and \(A_0\) are of order \(O(10^3)\) or larger, one finds that, in each of the cases \(\alpha <0\) and \(\alpha >0\), only two of the terms in (48) are relevant. For \(\alpha <0\), the first and third terms on the left-hand side are relevant, and for \(\alpha >0\), the first term on the left-hand side and the term on the right-hand side are relevant. Solving the equation with only these relevant terms kept, the solutions are respectively given by

$$\begin{aligned} r_*^2\sim \left\{ \begin{array}{ll} \left( \frac{1}{A_1} \left( - \frac{6 A_0}{N R_c \alpha }-1 \right) \right) ^{\frac{1}{3}},&{} \alpha <0,\\ \frac{N R_c \alpha }{2 k}, &{} \alpha >0. \end{array} \right. \end{aligned}$$
(49)

The first case diverges for \(\alpha \rightarrow -0\). However, this should not be taken at face value, since the transition cannot have such abrupt behavior, as discussed above. In fact, the simplification made above breaks down in the vicinity of \(\alpha \sim 0\), and the actual behavior is that \(r_*^2\) smoothly interpolates between the two parameter regions there.

Fig. 2
figure 2

The behavior of the observables with respect to \(\alpha =(R-R_c)/R_c\) in the large N limit, based on the saddle point analysis with the use of the leading order result of \(f_{N,R}\). The parameters are assumed to be \(\lambda \sim O(1)\) and \(k\lesssim O(1)\)

The \(r_*^2\) in (49) has different large-N behavior in the two regions: \(N^\frac{7}{3}\) for \(\alpha <0\) and \(N^3\) for \(\alpha >0\). By normalizing \(r_*^2\) with the common factor \((NR_c)^{-1}\) for both the regions, we obtain

$$\begin{aligned} \tilde{r}_*^2\sim \left\{ \begin{array}{cl} 0,&{} \alpha \le 0,\\ \frac{\alpha }{2k}, &{} \alpha >0, \end{array} \right. \end{aligned}$$
(50)

in the \(N\rightarrow \infty \) limit, where \(\tilde{r}_*^2:=(NR_c)^{-1} r_*^2\). See the leftmost figure in Fig. 2. This characterizes the transition at \(\alpha =0\) as a continuous phase transition with \(\langle \hbox {Tr}(\phi ^t \phi )\rangle /(NR_c) = 0\) for \(\alpha \le 0\) and \(\langle \hbox {Tr}(\phi ^t \phi )\rangle /(NR_c) =\frac{\alpha }{2k}\) for \(\alpha >0\) in the thermodynamic limit.

The other observables can be treated in a similar manner. By taking (18), putting \(r=r_*\), and taking the leading order in large N, one obtains

$$\begin{aligned} \begin{aligned} \langle U(\phi ) \rangle _{leading}&\sim \left\{ \begin{array}{cl} \frac{N^3}{12 \lambda }(1+\alpha ), &{}\quad \hbox {for } \alpha \le 0,\\ \frac{N^3}{12 \lambda }, &{}\quad \hbox {for } \alpha>0,\\ \end{array} \right. \\ \langle U_d(\phi ) \rangle _{leading}&\sim \left\{ \begin{array}{cl} \frac{N^3(1+\alpha )}{12 \lambda (-\alpha )}, &{}\quad \hbox {for } \alpha \le 0, \\ \frac{N^5}{16 k^3} \frac{\alpha ^3}{(1+\alpha )^2}, &{}\quad \hbox {for } \alpha >0. \end{array} \right. \end{aligned} \end{aligned}$$
(51)

The divergence of \(\langle U_d(\phi ) \rangle _{leading}\) for \(\alpha \rightarrow -0\) should not be taken at face value, for the same reason as mentioned above for \(r_*^2\). By normalizing \(\tilde{U}(\phi ):=(N^3/12 \lambda )^{-1} U(\phi )\) and \(\tilde{U}_d(\phi ):=(N^5/16 k^3)^{-1} U_d(\phi )\), one obtains

$$\begin{aligned} \begin{aligned} \langle \tilde{U}(\phi ) \rangle _{leading}&\sim \left\{ \begin{array}{c@{\qquad }l} 1+\alpha , &{}\hbox {for } \alpha \le 0,\\ 1, &{}\hbox {for } \alpha>0,\\ \end{array} \right. \\ \langle \tilde{U}_d(\phi ) \rangle _{leading}&\sim \left\{ \begin{array}{c@{\qquad }l} 0, &{}\hbox {for } \alpha \le 0, \\ \frac{\alpha ^3}{(1+\alpha )^2}, &{}\hbox {for } \alpha >0. \end{array} \right. \end{aligned} \end{aligned}$$
(52)

See the middle and rightmost figures in Fig. 2. The results support the same conclusion that there is a continuous phase transition at \(\alpha =0\).

Let us finally comment on the consistency of the above saddle point approximation in the large N case. From (45) and (49), it is straightforward by some explicit computations to find the expansion of \({\mathcal{F}}^{1/R,leading}_{N,R}\) around \(r\sim r_*\) as

$$\begin{aligned} {\mathcal{F}}^{1/R,leading}_{N,R} \sim a_0+a_2 (r-r_*)^2 +a_3 (r-r_*)^3+\cdots , \end{aligned}$$
(53)

where \(a_n:=\frac{1}{n!}\frac{d^n}{dr^n}{\mathcal{F}}^{1/R,leading}_{N,R}(\lambda ,k,r)|_{r=r_*}\) can easily be estimated as \(a_0\sim O(N^3)\), and

$$\begin{aligned} \begin{aligned}&a_2\sim O(N^{2/3}),\ a_3\sim O(N^{-1/2}) , \hbox { for }\alpha <0,\\&a_2\sim O(1),\ a_3\sim O(N^{-3/2}) , \hbox { for }\alpha >0. \end{aligned} \end{aligned}$$
(54)

Therefore the integral is dominated by the saddle point value \(a_0\), and the Gaussian integration around the saddle pointFootnote 9 and the higher orders in \(r-r_*\) are subdominant. In addition, the insertion of the observables \({\mathcal{O}}\) as in (18) for the computation of their expectation values does not change the location of the saddle point in the leading order, because the additional contribution \(\log {\mathcal{O}}\) to (53) is \(O(\log N)\), and cannot become comparable to the leading \(O(N^3)\) terms.

Fig. 3
figure 3

The numerical results of e.v. of the observables discussed in Sect. 3 for \(\lambda =1\), \(N=5\) and \(k=0.1\) (top three) and \(k=1.0\) (bottom three) against R. Plotted points represent the Monte Carlo results and ‘leading’ and ‘next-leading’ mean the evaluations based on Eq. (12) with perturbatively evaluated \(f_{N,R,\lambda ,\lambda _d}(t)\) in the leading and next-leading orders, respectively. There are some small windows within the figures, where one can see which of the leading and next-leading lines exists above/below the other, and can also see more clearly the results for \(\langle U_d \rangle \) in the transition region around \(R=R_c\). For the clarity of the small R regions, which are unclear in some of the figures, we provide Fig. 4

Fig. 4
figure 4

Magnification of the small R regions that are unclear in Fig. 3

6 Comparison with Monte Carlo simulations

Fig. 5
figure 5

\(\lambda =1\), \(N=10\) and \(k=0.05\) (top three), \(k=0.10\) (middle three) and \(k=1.00\) (bottom three) against R. The same notations are used, and some small windows are put for the same purposes as in Fig. 3. For the clarity of the small R regions that are unclear in some of the figures, we provide Fig. 6

Fig. 6
figure 6

Magnification of the small R regions that are unclear in Fig. 5

In Sect. 4, we computed the leading-order approximation of \(f_{N,R,\lambda ,\lambda _d}(t)\) defined in (11) by a perturbative method, and obtained the result (14) together with (15) and related expressions. We also derived the next-leading order correction (41) with (42), the details of which are given in “Appendix D”. With those results, we can numerically calculate the expectation values (e.v.) of the observables given in (12) from the expressions on the right-hand sides. Note, however, that the above approximations of \(f_{N,R,\lambda ,\lambda _d}(t)\) are based on the perturbative expansion of \(S_{eff}\) given in (27) up to the second order in t. They therefore seem to implicitly assume small values of t, and may not generally be trusted for the computation of (12), because t is eventually set to \(r^6\), and r is integrated from zero to infinity.

In view of the question above, it would be interesting to compute our model without relying on any approximation. More specifically, in this section we compute the e.v. of the observables, (6) with \({\mathcal{O}}(\phi )=\hbox {Tr}(\phi ^t \phi ),U(\phi ),U_d(\phi )\), by Monte Carlo simulations, and compare them with the analytical results obtained by numerically integrating the right-hand sides of (12), into which we put our perturbative results for \(f_{N,R,\lambda ,\lambda _d}(t)\). Note that, in our approximation strategy, R is not an expansion parameter, only t is, and therefore it is meaningful to compare the results over the full range of R.

The results of the comparison are summarized in Figs. 3, 4, 5 and 6. The points, each with an error bar (though they are very small), represent the Monte Carlo results. For the parameters taken in the Monte Carlo simulations, refer to the captions. In particular, we take \(\lambda =1\) in all the computations, because \(\lambda \) in the model (1) can be scaled out by a scale transformationFootnote 10 \(\phi \rightarrow \lambda ^{-1/6}\phi \) so that it is absorbed into k as \(k/\lambda ^{1/3}\). The dotted and chained lines represent the values of the e.v. of the observables in the leading and the next-leading orders, obtained by inserting \(f_{N,R,\lambda ,\lambda _d}(t)^{leading/next{-}leading}\) into (12), respectively. In some figures, these two lines almost overlap and are difficult to distinguish. Therefore, we put some small windows within the figures, where the two lines can be distinguished and one can clearly see which of them lies above the other. In fact, the relative locations (namely, above/below) of the two lines remain unchanged for each observable throughout the values of N, R and k in the figures. In the small windows within the figures of \(\langle U_d \rangle \), one can more clearly see the results in the vicinity of the transition region around \(R=R_c\). In addition, for clarity in the small R regions of some of the figures, we provide Figs. 4 and 6, where one can in particular find that the analytical and Monte Carlo results approach each other as R becomes smaller.

An important thing that can be observed in the figures is that, for each N, there exists a region of R that separates the smaller and larger regions of R with different qualitative behaviors of the observables. This is more clearly seen for larger N and smaller k. The transition region we observe indeed exists around the value \(R_c=(N+1)(N+2)/2\), which was obtained as the critical point from the saddle point analysis in Sect. 5 (see Fig. 2).

It is an important physical question whether this change of behavior is a phase transition or just a crossover in the thermodynamic limit \(N\rightarrow \infty \). However, we cannot answer this question with certainty from the Monte Carlo results presently available; this would require larger-scale Monte Carlo simulations. It also seems difficult to answer this question by our perturbative analytical methods, for the following reason. In the figures, we find good agreement between the perturbative computations and the Monte Carlo results in the regions away from the transition region. This supports the validity of our perturbative calculations in those outer regions. On the other hand, there are some deviations between the perturbative computations and the Monte Carlo results in the transition region. The deviations appear in such a way that the Monte Carlo results smooth out the transition and make it more like a crossover. Therefore, the analytical expressions we have obtained as approximations do not seem to be reliable in the transition region.

We can further discuss this complication from another viewpoint as follows. Looking at the figures more closely, we find that the following relations among \(\langle \text {Tr} \phi ^t\phi \rangle \), \(\langle U \rangle \), and \(\langle U_d \rangle \) in the leading and the next-leading orders hold:

$$\begin{aligned} \begin{aligned}&\langle \text {Tr}\,\phi ^t\phi \rangle : \text { including next{-}leading} > \text {leading},\\&\langle U \rangle : \text { including next{-}leading }<\text { leading},\\&\langle U_d \rangle : \text { including next{-}leading }<\text { leading}, \end{aligned} \end{aligned}$$
(55)

for all R. Therefore, the next-leading order corrections indeed improve the approximations in the outside regions, bringing them closer to the Monte Carlo results, and the same holds for \(\langle \text {Tr}\,\phi ^t\phi \rangle \) and \(\langle U\rangle \) in the transition region. The last inequality, for \(\langle U_d\rangle \), however, points in the opposite direction there. This suggests that our perturbative treatment has some difficulty in correctly taking into account certain configurations that mainly contribute to \(U_d\) in the transition region. It would be an interesting future problem to identify these configurations.

Let us briefly explain our actual Monte Carlo simulations. We performed Monte Carlo simulations with the standard Metropolis update method for the model (1) by using KEKCC, the cluster system of KEK. For each calculation shown in the figures, we performed 2 billion sweeps, which typically took about 7 and 23 hours for \(R=10\) and \(R=130\), respectively. We stored the data of the observables once per 400 sweeps, and computed their mean values and \(1\sigma \)-errors by the Jackknife resampling method. In each calculation we always set the acceptance rate to be around 60\(\%\). However, to realize this 60\(\%\), we had to tune the step sizes in our Metropolis method to quite small values, especially in the region \(R\gtrsim N^2/2\).
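As an illustration of the error estimate mentioned above, the following is a minimal Python sketch of a jackknife estimate of a mean and its \(1\sigma \) error. The grouping of the stored measurements into bins, the number of bins, and the function name `jackknife_mean` are assumptions made for this example; the text only states that the jackknife method was applied to data stored once per 400 sweeps. The input array is synthetic.

```python
import numpy as np


def jackknife_mean(samples, n_bins):
    """Jackknife estimate of the mean and its 1-sigma error from binned samples."""
    samples = np.asarray(samples, dtype=float)
    usable = (len(samples) // n_bins) * n_bins       # drop the remainder, if any
    bin_means = samples[:usable].reshape(n_bins, -1).mean(axis=1)
    # Leave-one-bin-out estimates of the mean.
    leave_one_out = (bin_means.sum() - bin_means) / (n_bins - 1)
    estimate = leave_one_out.mean()
    variance = (n_bins - 1) / n_bins * np.sum((leave_one_out - estimate) ** 2)
    return estimate, np.sqrt(variance)


# Synthetic stand-in for a stored time series of an observable.
rng = np.random.default_rng(1)
series = rng.normal(loc=2.0, scale=0.5, size=10_000)
mean_value, error = jackknife_mean(series, n_bins=50)
bin_means_check = series.reshape(50, -1).mean(axis=1)
```

For a simple mean, the jackknife error coincides with the standard error of the bin means; the method becomes more useful for nonlinear functions of averages, where the same leave-one-out construction applies.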

Let us further comment on the last peculiarity we encountered in the simulations. We performed the simulations for \(k=1,0.1\) and 0.05 with \(N=10\), and for \(k=1.0\) and 0.1 with \(N=5\), as shown in the figures. As the results in the figures suggest, the transition could be sharpened if we performed simulations with smaller values of k than those in the figures. However, when we tried to do so, we encountered a serious difficulty, in particular in the region \(R\gtrsim N^2/2\): the Metropolis step sizes had to be tuned to very small values to keep a reasonable acceptance rate such as 60%. The simulations then became so slow that we could not determine when the system had reached thermodynamic equilibrium: the system always appeared to be moving very slowly toward equilibrium, at least over about one week of continuous running. Therefore we took relatively large values of k, as in the figures, to avoid this difficulty, which would make the simulations unreliable.

Finally, let us qualitatively explain why the analytical results computed by the perturbative method and the Monte Carlo results agree with each other outside the transition region of R. Let us start with a qualitative estimation of the effective action (27) with respect to the orders in R and t. One can obtain

$$\begin{aligned} S_{eff}(P)\sim (1+b_0\, t) P^2 + b_1\,t^2 P^4+\cdots , \end{aligned}$$
(56)

where we have ignored all the index structures of \(P^i_{abc}\) for notational simplicity, and the dominant R-dependencies of the coefficients are given by

$$\begin{aligned} \begin{aligned} b_0\sim \frac{c_0}{R^2}, \ b_1 \sim \frac{c_1}{R^5}. \end{aligned} \end{aligned}$$
(57)

Here the dominant R-dependence of \(b_0\) is obtained by putting the eigenvalue \(\lambda _{ev}\sim R\) into (36) and recalling \(\gamma _3\sim 1/R^3\), as defined in (16); the estimation of \(b_1\) is also straightforward and is given in “Appendix E”. Then, normalizing the quadratic part by rescaling \(P\rightarrow P/\sqrt{1+b_0 t}\) in (56), one finds that the actual quartic coupling of \(S_{eff}\) is estimated as

$$\begin{aligned} \frac{b_1\,t^2}{(1+b_0 t)^2}\sim \frac{c_1 t^2}{R(R^2+c_0 t)^2}, \end{aligned}$$
(58)

which is different from the naive value \(b_1\, t^2\). When R is small, \(r^2=\mathrm{Tr}(\phi ^t \phi )\) is dominated by small values, as shown in the Monte Carlo computations above. This means that, since \(t\sim r^6\) in the usage (3), (58) is also dominated by small values, and therefore the quadratic term \(S_{eff}^{(2)}\) will give good analytic estimates. On the other hand, when R is large, \(t\sim r^6\) is dominated by large values, as shown in the Monte Carlo results. Nonetheless, what is remarkable is that (58) is always suppressed by 1/R irrespective of the value of t. Therefore the leading order term \(S_{eff}^{(2)}\) will again give good estimates. In the middle region, however, (58) could generally become large, and higher order and/or non-perturbative corrections may contribute substantially, as suggested by the deviations between the Monte Carlo and analytical results.
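The two regimes described above can be made explicit by taking the limits of (58). As a sketch, assuming \(c_0>0\) and treating \(c_0\) and \(c_1\) as independent of R and t (they may still depend on N), one has

$$\begin{aligned} \frac{c_1 t^2}{R(R^2+c_0 t)^2}\simeq \left\{ \begin{array}{ll} \frac{c_1 t^2}{R^5}, &{} c_0 t\ll R^2,\\ \frac{c_1}{c_0^2 R}, &{} c_0 t\gg R^2, \end{array} \right. \qquad \left| \frac{c_1 t^2}{R(R^2+c_0 t)^2}\right| \le \frac{|c_1|}{c_0^2 R} \ \hbox { for all } t\ge 0. \end{aligned}$$

The inequality follows from \(c_0 t\le R^2+c_0 t\). It shows the 1/R suppression for any t, while the actual size of the coupling at intermediate R depends on the values of \(c_0\) and \(c_1\), which are not estimated here.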

7 Topological structure of configurations

In this section, we explain our observations on the topological structure of the configurations generated by the Monte Carlo simulations. The topology of a value of the matrix \(\phi _a^i\) can be analyzed by persistent homology [30], which is a modern technique of topological data analysis (see “Appendix F” for a brief introduction to persistent homology). More specifically, we performed the Monte Carlo simulations for \(N=4\) and \(R=10,15,20,25\) with \(\lambda =1,\ k=0.01\), and, for each case, uniformly took 100 samples of the values of \(\phi _a^i\) during a large number of sequential updates of order \(10^8\) after thermodynamic equilibrium was seen to be reached. The samples were then analyzed in terms of persistent homology. The analysis shows that the favored topology of the configurations is \(S^1\) for \(R=10,15\), but gradually changes to higher dimensional cycles as R is increased. We first explain the background motivation for this analysis, and then show the results.

One of the present authors and his collaborators have been studying a tensor model in the canonical formalism, which we call the canonical tensor model [16, 17], as a model of quantum gravity. In Ref. [18], it was shown that the exact wave function of the tensor model has peaks at the configurations that are invariant under Lie groups. This phenomenon, which we call the symmetry highlighting phenomenon, potentially has an important physical significance, since it would imply the dominance of spacetimes symmetric under Lie groups through the correspondence between tensors and spaces developed in Ref. [19]. The symmetry highlighting phenomenon was first shown for a toy wave function [20], which slightly simplifies the wave function of the canonical tensor model. The toy wave function is given byFootnote 11

$$\begin{aligned} \tilde{\varphi }(P)=\int _{\mathbb {R}^N} \prod _{a=1}^N d\phi _a \exp \left( I P_{abc}\phi _a \phi _b \phi _c +\left( I \kappa -\epsilon \right) \phi _a \phi _a\right) , \end{aligned}$$
(59)

where I denotes the imaginary unit, and \(\epsilon \) is a small positive regularization parameter to assure the convergence of the integral. The symmetry highlighting phenomenon is that the wave function has large peaks at \(P_{abc}\) that are invariant under Lie groups: \(P_{abc}=h_{a}^{a'}h_{b}^{b'}h_{c}^{c'}P_{a'b'c'}\) under \({}^\forall h \in H\), where H is a representation of a Lie group. The phenomenon can qualitatively be understood by the following rough argument: If \(P_{abc}\) is invariant under a Lie group, the integration over \(\phi _a\) in (59) will contribute coherently along the gauge orbit \(h_{a}^{a'} \phi _{a'} \ (^\forall h\in H)\), but, if this is not so, the contributions tend to cancel among themselves due to the phase oscillations of the integrand and the wave function takes relatively small values. In Ref. [20], some tractable simple cases have explicitly been studied, and the presence of the phenomenon has indeed been shown.

Other than the simple case studies, the peak structure of the toy wave function and that of the tensor model are largely unknown. One reason is that the number of independent components of \(P_{abc}\), which is \(\sim N^3/6\), is so large that it is practically not possible to scan the whole configuration space of \(P_{abc}\). Rather, we will be able to obtain rough knowledge by integrating over \(P_{abc}\):

$$\begin{aligned}&\int _{-\infty }^\infty \prod _{a\le b\le c=1}^N dP_{abc} \ \tilde{\varphi }(P)^R \exp \left( -\alpha P_{abc}P_{abc} \right) \nonumber \\&\quad =\mathrm{const.}\, Z_{N,R}\left( \frac{1}{4\alpha },k \right) , \end{aligned}$$
(60)

where we have denoted \(k=-I \kappa +\epsilon \), and we have considered an arbitrary power R of the wave function, because the actual wave function of the tensor model has the corresponding powerFootnote 12 which specifically takes \(R=(N+2)(N+3)/2\) [15, 18, 29] (see “Appendix A”). In (60), we see that the integration of a power of the wave function over \(P_{abc}\) with a Gaussian weight produces the matrix model (1). If the integration is dominated by such peaks associated with Lie groups, we can expect that the N-dimensional vectors, \(\phi _a^i\ (i=1,2,\ldots ,R)\), in the matrix model tend to lie along some gauge orbits in the vector space.
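The step leading to (60) is a Gaussian integration over \(P_{abc}\). As a sketch, assuming that the order of the \(\phi \) and P integrations can be exchanged and writing \(\phi _a^i\ (i=1,2,\ldots ,R)\) for the integration variables of the R copies of (59), completing the square in the totally symmetric tensor \(P_{abc}\) gives

$$\begin{aligned}&\int \prod _{a\le b\le c} dP_{abc} \ \exp \left( -\alpha P_{abc}P_{abc}+I P_{abc}\sum _{i=1}^R \phi _a^i\phi _b^i\phi _c^i \right) \nonumber \\&\quad =\mathrm{const.}\, \exp \left( -\frac{1}{4\alpha }\sum _{i,j=1}^R \left( \phi _a^i\phi _a^j\right) ^3 \right) , \end{aligned}$$

where we have used \(\sum _{i,j}\phi _a^i\phi _b^i\phi _c^i\phi _a^j\phi _b^j\phi _c^j=\sum _{i,j}(\phi _a^i\phi _a^j)^3\). The remaining factor from (59) is \(\exp (-k\, \phi _a^i\phi _a^i)\) with \(k=-I\kappa +\epsilon \), so that the coupling of the sixth-order term is identified as \(1/(4\alpha )\), consistently with the first argument of \(Z_{N,R}\) in (60).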

In general, there exist a number of peaks (or rather ridges) associated with various Lie group symmetries and gauge orbits for the wave function (59). As argued and shown explicitly in Refs. [18, 20], peaks associated with lower dimensional Lie group symmetries generally exist more abundantly than those with higher dimensional symmetries, because the number of symmetry conditions that must be satisfied by \(P_{abc}\) is smaller for the former than for the latter. On the other hand, the peaks of the latter are generally higher than those of the former, because the gauge orbits of the latter have larger dimensions and provide more coherent contributions than the former. Therefore, there is a competition between height and abundance, and it is generally a subtle question whether lower or higher dimensional Lie group symmetries are probabilistically favored in a given case.

Let us discuss this question in view of the correspondence to the matrix model (60), especially by considering changing the value of R. As R is the power of the wave function as in (60), larger R will enhance higher peaks compared to the lower ones. Therefore we expect that, for larger R, the contributions from peaks with higher dimensional symmetries will be more dominant. Otherwise, peaks with lower dimensional Lie group symmetries will dominate because of their abundant existence. In addition, when R is very small, non-symmetric configurations will dominate, since they exist most abundantly.

In the Monte Carlo simulation of our model, if a symmetric peak dominates as explained above, the N-dimensional vectors \(\phi _a^i\ (i=1,2,\ldots ,R)\) will be randomly distributed along the associated gauge orbit. A caution here is that the gauge orbit can take any O(N)-transformed location in the N-dimensional vector space due to the O(N) symmetry of the model. Therefore we should not simply plot all the samples of \(\phi _a^i\) generated by the Monte Carlo simulations in the N-dimensional vector space, since this will merely provide a trivial O(N)-invariant spherical distribution of points without any characteristics of the Lie-group symmetry associated with the dominant peak. Rather, we have to analyze each sample of \(\phi _a^i\) generated by the Monte Carlo simulations to extract its characteristics, and then accumulate the results over all the samples to find overall characteristic properties.

Fig. 7
figure 7

Persistent diagrams obtained from the Monte Carlo data of \(N=4,\ k=0.01,\ \lambda =1\) with \(R=10,15,20,25\) (from the top to the bottom). To avoid dependence on the initial values, 10 independent Monte Carlo sequences were run, and the sampling was performed uniformly from the sequence of updates of \(\sim 10^9\) after thermodynamic equilibrium was seen. 100 configurations of \(\phi _a^i\) were uniformly sampled and the persistent homologies were analyzed (one- to three-dimensional homologies from the left to the right). The results of 100 samples are plotted on the same persistent diagrams. Blue dots represent the longest-life elements in each dimensional persistent homology group of each data set, the yellow ones the second, and the green ones all after the second. The dots away from the diagonal line represent long-life persistent homology group elements, which are considered to be characteristics of a data set, while those near the diagonal line are regarded as “noises”. The highest blue dots, namely those with the largest \(u_\mathrm{end}\) that represent the largest structure, move from \(H_1\) to \(H_2\) and then \(H_3\) with the increase of R

The topological structure of each sample of \(\phi _a^i\) can be analyzed by using the technique of persistent homology [30]. This is a modern applied mathematical technique of topological data analysis, and can extract the homology groups of a data set (see “Appendix F”). Here the input data should be a set of points with relative distances. We used an open-source C++ program called RipserFootnote 13 for the analysis and plotted the output with Mathematica. For a configuration \(\phi _a^i\), we consider the replica number \(i=1,2,\ldots ,R\) to represent the label of “points” of a data set, and define the distance between two points i and j as

$$\begin{aligned} d(i,j):=\arccos \left( \frac{\phi _a^i\phi _a^j}{\sqrt{ \phi _a^i \phi _a^i \phi _b^j \phi _b^j} }\right) . \end{aligned}$$
(61)

The gist of this definition is that the N-dimensional vectors \(\phi _a^i\ (i=1,2,\ldots ,R)\) are projected onto the unit sphere \(S^{N-1}\), and the geodesic distances along the sphere are regarded as the distances. In particular, this definition is suited for detecting a gauge orbit \(h_{a}^b \phi _b\ (h\in H)\), since it is projected on the sphere irrespective of the size of the vector \(\phi _a\).
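The distance (61) is straightforward to evaluate for a whole configuration at once. The following Python sketch, which is not the code used for the analysis, computes the \(R\times R\) matrix of distances from an array holding one configuration; the array layout (rows labeled by i, columns by a) and the function name `angular_distance_matrix` are assumptions of this example.

```python
import numpy as np


def angular_distance_matrix(phi):
    """Geodesic distances (61) on the unit sphere; phi has shape (R, N)."""
    phi = np.asarray(phi, dtype=float)
    unit = phi / np.linalg.norm(phi, axis=1, keepdims=True)  # project onto S^{N-1}
    cosines = np.clip(unit @ unit.T, -1.0, 1.0)              # guard against rounding
    dist = np.arccos(cosines)
    np.fill_diagonal(dist, 0.0)                              # d(i,i) = 0 exactly
    return dist


# Small hand-made configuration with R = 4 points in N = 4 dimensions.
phi_example = np.array(
    [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 2.0, 0.0, 0.0],
        [-3.0, 0.0, 0.0, 0.0],
        [5.0, 0.0, 0.0, 0.0],
    ]
)
d_example = angular_distance_matrix(phi_example)
```

In this example, vectors 0 and 3 point in the same direction and have distance zero despite their different lengths, which reflects the insensitivity of (61) to the size of the vectors. A distance matrix of this kind is the type of input on which a persistent-homology computation operates.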

We want to see the phenomenon explained above in the actual Monte Carlo simulations. For this initial study, choosing small N is preferred for simpler analysis, because then there exist only a small number of possible gauge orbits, all of small dimension, and thermodynamic equilibrium can easily be reached due to the small number of degrees of freedom. Note however that this comes at the cost of the clarity of the homology structure detected by persistent homology. This is because, for small N, the values of R at which the phenomenon appears are also rather small, of the order of \(R\sim N^2/2\), as we will see in the analysis below. The homology cycles formed by a small number (namely R) of points necessarily become obscure, and higher dimensional cycles in particular are difficult to detect clearly.

For the actual simulation, we considered \(N=4\). In \(N=4\), as explicitly solved in Ref. [20], there exist only two possibilities of Lie group symmetries, SO(2) and SO(3), and the gauge orbits are \(S^1\) and \(S^2\), respectively. In fact, the ridges of the \(P_{abc}\) with these symmetries reach the origin \(P_{abc}=0\), and therefore we should also add the trivial possibility of \(S^3\) with the SO(4) symmetry, which is the symmetry of \(P_{abc}=0\) and is maximal. Figure 7 shows the persistent diagrams obtained from the Monte Carlo simulations with \(R=10,15,20,25\). Statistically speaking, one can observe that, starting from \(S^1\) at \(R=10,15\), higher dimensional cycles gradually appear and become the largest structure when R is increased, while lower dimensional cycles gradually take smaller values of u.

8 Summary and future prospects

In this paper, we studied a matrix model containing non-pairwise index contractions [21], which is motivated by a tensor model of quantum gravity [16, 17]. More specifically, its degrees of freedom are \(\phi _a^i\ (a=1,2,\ldots , N, \ i=1,2,\ldots ,R)\), where the lower indices are always contracted pairwise, but the upper indices are not necessarily. This matrix model has the same form as the one that appears in the replica trick of the spherical p-spin model for spin glasses [27, 28], though the variable and parameter ranges of our interest are different. We performed Monte Carlo simulations with the Metropolis update method, and compared the results with some analytical computations in the leading order, mostly based on the previous treatment in Ref. [21]. They are in good agreement outside the transition region, which is located around \(R\sim N^2/2\). In the transition region, however, there are deviations between the simulations and the analytical results, and these deviations are not well corrected even if the next-to-leading-order contributions are included. It has not been determined whether the transition is a phase transition or a crossover in the thermodynamic limit \(N\rightarrow \infty \), because of the limited parameter range, \(N\lesssim 10\), available in our Monte Carlo simulations. Our Monte Carlo simulations tended to slow down, especially at \(R\gtrsim N^2/2\) with large \(\lambda /k^3\), which suggests that the system becomes glassy in this region, but no conclusive argument has been made on this point. We also studied the topological characteristics of the configurations \(\phi _a^i\) generated in the Monte Carlo simulations by using persistent homology [30], a modern technique in topological data analysis. This technique extracts the homology structure of a data set, which is a configuration of \(\phi _a^i\) in our case. We observed that, in the vicinity of the transition region, the homology structure of the configurations gradually changes from \(S^1\) to higher-dimensional cycles as R increases.

A particularly interesting result of this paper is that there seems to exist a transition region around \(R\sim N^2/2\). Intriguingly, this value of R coincides with what is required by the consistency of the tensor model, namely, the hermiticity of the Hamiltonian constraint (see “Appendix A”) [15, 18, 29]. Moreover, our model seems to have its most interesting properties in this region, but they are not well understood: there are some deviations between the simulations and the analytical results in this region, but the reason is not clear; the transition of the homological structure of the dominant configurations in this vicinity is peculiar but not well understood; and whether the transition is a phase transition or a crossover in the thermodynamic limit \(N\rightarrow \infty \) has not been determined. For a better understanding in the future, it seems necessary to treat larger-N cases with large \(\lambda /k^3\) by employing more efficient Monte Carlo methods and by finding more powerful analytical methods.