Hydrodynamics of core-collapse supernovae and their progenitors

Abstract

Multi-dimensional fluid flow plays a paramount role in the explosions of massive stars as core-collapse supernovae. In recent years, three-dimensional (3D) simulations of these phenomena have matured significantly. Considerable progress has been made towards identifying the ingredients for shock revival by the neutrino-driven mechanism, and successful explosions have already been obtained in a number of self-consistent 3D models. These advances also bring new challenges, however. Prompted by a need for increased physical realism and meaningful model validation, supernova theory is now moving towards a more integrated view that connects multi-dimensional phenomena in the late convective burning stages prior to collapse, the explosion engine, and mixing instabilities in the supernova envelope. Here we review our current understanding of multi-D fluid flow in core-collapse supernovae and their progenitors. We start by outlining specific challenges faced by hydrodynamic simulations of core-collapse supernovae and of the late convective burning stages. We then discuss recent advances and open questions in theory and simulations.

Introduction

The death of massive stars is invariably spectacular. In the cores of these stars, nuclear fusion proceeds all the way to the iron group through a sequence of burning stages. At the end of the star’s life, nuclear energy generation has ceased in the degenerate Fe core (i.e., the core has become “inert”), while nuclear burning continues in shells composed of progressively light nuclei further out. Once the core approaches its effective Chandrasekhar mass and becomes sufficiently compact, electron captures on heavy nuclei and partial nuclear disintegration lead to collapse on a free-fall time scale, leaving behind a neutron star or black hole. In most cases, a small fraction of the potential energy liberated during collapse is transferred to the stellar envelope, which is expelled in a powerful explosion known as a core-collapse supernova, as first recognized by Baade and Zwicky (1934).

How precisely the envelope is ejected has remained one of the foremost questions in computational astrophysics ever since the first modeling attempts in the 1960s (Colgate et al. 1961; Colgate and White 1966). In this review we focus on the critical role of multi-dimensional (multi-D) fluid flow during the supernova explosion itself and the final pre-collapse stages of their progenitors.

For pedagogical reasons, it is preferable to commence our brief exposition of multi-D hydrodynamic effects with the supernova explosion mechanism rather than to follow the sequence of events in nature, or historical chronology.

The multi-dimensional nature of the explosion mechanism

In principle, the collapse of an iron core to a neutron star opens a reservoir of several \(10^{53}\, \mathrm {erg}\) of potential energy, which appears more than sufficient to account for the typical inferred kinetic energies of observed core-collapse supernovae of order \(10^{51}\, \mathrm {erg}\) (see, e.g., Kasen and Woosley 2009; Pejcha and Prieto 2015).

Transferring the requisite amount of energy from the young “proto-neutron star” (PNS) is not trivial, however. The simplest idea is that the energy is delivered when the collapsing core overshoots nuclear density and “bounces” due to the high incompressibility of matter above nuclear saturation density, which launches a shock wave into the surrounding shells (Colgate et al. 1961). However, the shock wave stalls within milliseconds as nuclear dissociation of the shocked material and neutrino losses drains its initial kinetic energy (e.g., Mazurek 1982; Burrows and Lattimer 1985; Bethe 1990). The shock then turns into an accretion shock, whose radius is essentially determined by the pre-shock ram pressure and the condition of hydrostatic equilibrium between the shock and the PNS surface. It typically reaches a radius between 100 and 200 km a few tens of milliseconds after bounce and then recedes again.

Among the various ideas to “revive” the shock (for a more exhaustive overview see Mezzacappa 2005; Kotake et al. 2006; Janka 2012; Burrows 2013) the neutrino-driven mechanism is the most promising scenario and has been explored most comprehensively since it was originally conceived—in a form rather different from the modern paradigm—by Colgate and White (1966). The modern version of this mechanism is illustrated in Fig. 1b: a fraction of the neutrinos emitted from the PNS and the cooling layer at its surface are reabsorbed further out in the “gain region”. If the neutrino heating is sufficiently strong, the increased thermal pressure drives the shock out, and the heating powers an outflow of matter in its wake.

Fig. 1
figure1

Overview of the multi-D effects operating prior to and during a core-collapse supernova as discussed in this review (below the dashed line) in the broader context of the evolution of a massive star. a After millions of years of H and He burning, the star enters the neutrino-cooled burning stages (C, Ne, O, Si burning). These advanced core and shell burning stages tpyically proceed convectively because burning and neutrino cooling at the top and bottom of an active shell/core establish an unstable negative entropy gradient. The interaction of the flow with convective boundaries can lead to mixing and transfer energy and angular momentum by wave excitation. Rotation may modify the flow dynamics. b After the star has formed a sufficiently massive iron core, the core undergoes gravitational collapse, and a young proto-neutron star is formed. The shock wave launched by the “bounce” of the core quickly stalls, and is likely revived by neutrino heating in most cases. In the phase leading up to shock revival, neutrino heating drives convection in the heating or “gain” region, and the shock may execute large-scale oscillations due the standing accretion shock instability (SASI). Rotation and the asymmetric infall of convective burning shells can modify the dynamics. There is also a convective region below the cooling layer at the proto-neutron star (PNS) surface. c After the shock has been revived and sufficient energy has been pumped into the explosion by neutrino heating or some other mechanism, the shock propagates through the outer shells on a time scale of hours to days. During this phase, the interaction of the (deformed) shock with shell interfaces as well as reverse shock formation trigger mixing by the Rayleigh–Taylor (RT), the Richtmyer–Meshkov (RM) instability, and (as a secondary process) the Kelvin–Helmholtz (KH) instability. Once the shock breaks out through the stellar surface, the explosion becomes visible as an electromagnetic transient. Mixing instabilities continue to operate on longer time scales throughout the evolution of the supernova remnant

However, according to the most sophisticated spherically symmetric (1D) models using Boltzmann neutrino transport to accurately model neutrino heating and cooling, this mechanism does not work in 1D (Liebendörfer et al. 2001; Rampp and Janka 2000) except for the least massiveFootnote 1 supernova progenitors. For all other progenitors, it is crucial that multi-D effects support the neutrino heating. Convection occurs in the gain region because neutrino heating establishes a negative entropy gradient (Bethe 1990), and was shown to be highly beneficial for obtaining neutrino-driven explosions by the first generation of multi-D models from the 1990s (Herant et al. 1992, 1994; Burrows et al. 1995; Janka and Müller 1995, 1996). Another instability, the standing-accretion shock instability (SASI; Blondin et al. 2003; Blondin and Mezzacappa 2006; Foglizzo et al. 2006, 2007; Laming 2007), arises due to an advective-acoustic amplification cycle and gives rise to large-scale (\(\ell =1,2\)) shock oscillations; it plays a similarly beneficial role in the neutrino-driven mechanism as convection (Scheck et al. 2008; Müller et al. 2012a). Rapid rotation could also modify the dynamics in the supernova core and support the development of neutrino-driven explosions (Janka et al. 2016; Summa et al. 2018; Takiwaki et al. 2016).

Evidently, multi-D effects are also at the heart of the most serious alternative to neutrino-driven explosions, the magnetohydrodynamic (MHD) mechanism (e.g., Akiyama et al. 2003; Burrows et al. 2007a; Winteler et al. 2012; Mösta et al. 2014b), which likely explains unusually energetic “hypernovae”. But whether core-collapse supernovae are driven by neutrinos or magnetic fields, it is pertinent to ask: How important are the initial conditions for the multi-D flow dynamics that leads to shock revival?

The multi-D structure of supernova progenitors

For pragmatic reasons, supernova models have long relied on 1D stellar evolution models as input, or at best on “1.5D” rotating models using the shellular approximation (Zahn 1992). For non-rotating progenitors, spherical symmetry was either broken by introducing perturbations in supernova simulations by hand, or due to grid perturbations. For rotating and magnetized progenitors, spherical symmetry is broken naturally, but on the other hand the stellar evolution models do not provide the detailed multi-D angular momentum distribution and magnetic field geometry, which must be specified by hand.

In reality, even non-rotating progenitors are not spherically symmetric at the onset of collapse. Outside the iron core, there are typically several active convective burning shells (Fig. 1a) that will collapse in the wake of the iron core within hundreds of milliseconds. It was realized in recent years that the infall of asymmetric shells can be important for shock revival (Couch and Ott 2013; Müller 2015; Couch et al. 2015; Müller et al. 2017a).

The multi-D structure of supernova progenitors is thus directly relevant for the neutrino-driven mechanism, but the potential ramifications of multi-D effects during the pre-collapse phase are in fact much broader: How do they affect mixing at convective boundaries, and hence the evolution of the shell structure on secular time scales? How do they affect the angular momentum distribution and magnetic fields in supernova progenitors?

Observational evidence for multi-D effects in core-collapse supernovae

Observations contain abundant clues about the multi-D nature of core-collapse supernova explosion. Large birth kicks of neutron stars (Hobbs et al. 2005; Faucher-Giguère and Kaspi 2006; Ng and Romani 2007) and even black holes (Repetto et al. 2012) cannot be explained by stellar dynamics alone and require asymmetries in the supernova engine. There is also evidence for mixing processes during the explosion and large-scale asymmetries in the ejecta from the spectra and polarization signatures of many observed transients (e.g., Wang and Wheeler 2008; Patat 2017), and from young supernova remnants like Cas A (Grefenstette et al. 2014).

The relation between the asymmetries in the progenitor and the supernova core, and the asymmetries in observed transients and gaseous remnants is not straightforward, however. The observable symmetries are rather shaped by mixing processes that operate as the shock propagates through the stellar envelope (Fig. 1c). Rayleigh–Taylor instability occurs behind the shock as it scoops up material and decelerates (Chevalier 1976; Bandiera 1984), and the interaction of a non-spherical shock with shell interfaces can give rise to the Richtmyer–Meshkov instability (Kifonidis et al. 2006). The asymmetries imprinted during the first seconds of an explosion provide the seed for these late-time mixing instabilities, and 3D supernova modellling is now moving towards an integrated approach from the early to the late stages of the to better link the observations to the physics of the explosion mechanism (e.g., Hammer et al. 2010; Wongwathanarat et al. 2013; Müller et al. 2018; Chan et al. 2018, 2020), and, in future, even to the multi-D progenitor structure.

Scope and structure of this review

In this review, we are primarily concerned with the numerical techniques for modeling multi-D fluid flow in core-collapse supernovae and their progenitors, and with our current understanding of the theory and phenomenology of the multi-D fluid instabilities. Although multi-D effects are relevant to virtually all aspects of core-collapse supernova theory, we can only afford cursory attention to many of them in order to keep this overview focused.

There are many other reviews to fill the gap, or provide a different perspective. Janka (2012) provides a very broad, but less technical, overview of the core-collapse supernova problem. For the explosion mechanism and a different take on the role of multi-D effects, the reader may also consult Mezzacappa (2005), Kotake et al. (2006), Burrows (2013), Müller (2016), Janka et al. (2016) and Couch (2017). The important problem of neutrino transport is treated in considerable depth by Mezzacappa (2005, 2020) and is therefore not addressed in this review. A number of reviews address the potential of neutrinos (e.g., Kotake et al. 2006; Müller 2019b) and gravitational waves (Ott 2009; Kotake 2013; Kalogera et al. 2019) as diagnostics of the multi-dimensional dynamics in the supernova core.

We shall start by discussing the governing equations for reactive, self-gravitating flow and their numerical solution in the context of core-collapse supernovae and convective burning in Sect. 2. We do not treat numerical methods for MHD in supernova simulations, although we occasionally comment on the role of MHD effects in the later sections. In the subsequent sections, we then review recent simulation results and progress in the theoretical understanding of convection during the late burning stages (Sect. 3), of supernova shock revival (Sect. 4), and the hydrodynamics of the explosion phase (Sect. 5).

Numerical methods

Modeling the late stages of nuclear burning and the subsequent supernova explosion involves solving the familiar equations for mass, momentum, and energy conservation with source terms that account for nuclear burning and the exchange of energy and momentum with neutrinos. Viscosity and thermal heat conduction mediated by photons, electron/positrons, and ions can be neglected, and so we have (in the Newtonian limit and neglecting magnetic fields),

$$\begin{aligned} \frac{\partial \rho }{\partial t} +\nabla \cdot (\rho \mathbf {v})&= 0, \end{aligned}$$
(1)
$$\begin{aligned} \frac{\partial \rho \mathbf {v}}{\partial t} +\nabla \cdot (\rho \mathbf {v} \otimes \mathbf {v}) + \nabla P&= -\rho \nabla \varPhi +\mathbf {Q}_\mathrm {m}, \end{aligned}$$
(2)
$$\begin{aligned} \frac{\partial \rho (\varepsilon +v^2/ 2)}{\partial t} +\nabla \cdot \left[ \rho (\varepsilon +v^2/2)\mathbf {v} + P\mathbf {v}\right]&= -\rho \mathbf {v} \cdot \nabla \varPhi + Q_\mathrm {e} + \mathbf {Q}_\mathrm {m} \cdot \mathbf {v}, \end{aligned}$$
(3)

in terms of the density \(\rho \), the fluid velocity \(\mathbf {v}\), the pressure P, internal energy density \(\varepsilon \), the gravitational potential \(\varPhi \), and the neutrino energy and momentum source terms \(Q_\mathrm {e}\) and \(\mathbf {Q}_\mathrm {m}\). If we take \(\varepsilon \) to include nuclear rest-mass contributions, there is no source term for the nuclear energy generation rate; otherwise an additional source term \(\dot{Q}_\mathrm {nuc}\) appears on the right-hand side (RHS) of Eq. (3). These equations are supplemented by conservation equations for the mass fractions \(X_i\) of different nuclear species and the electron fraction \(Y_\mathrm {e}\) (net number of electrons per baryon),

$$\begin{aligned} \frac{\partial \rho X_i}{\partial t} +\nabla \cdot (\rho X_i \mathbf {v})&= \dot{X}_{i,\mathrm {burn}}, \end{aligned}$$
(4)
$$\begin{aligned} \frac{\partial \rho Y_\mathrm {e}}{\partial t} +\nabla \cdot (\rho Y_\mathrm {e})&= \dot{Q}_{Y_\mathrm {e}}, \end{aligned}$$
(5)

where the source terms \(\dot{X}_{i,\mathrm {burn}}\) and \(\dot{Q}_{Y_\mathrm {e}}\) account for nuclear reactions and the change of the electron fraction by \(\beta \) processes.

In the regime of sufficiently high optical depth, the effect of the neutrino source terms could alternatively be expressed by non-ideal terms for heat conduction, viscosity, and diffusion of lepton number (e.g., Imshennik and Nadezhin 1972; Bludman and van Riper 1978; Goodwin and Pethick 1982; van den Horn and van Weert 1983, 1984; Yudin and Nadyozhin 2008), but this approach would break down at low optical depth. The customary approach to Eqs. (15) is, therefore, to apply an operator-split approach and combine a solver for ideal hydrodynamics for the left-hand side (LHS) and the gravitational source terms with separate solvers for the source terms due to neutrino interactions and nuclear reactions. Simulations of the Kelvin–Helmholtz cooling phase of the PNS over time scales of seconds form an exception; here only the PNS interior is of interest so that it is possible and useful to formulate the neutrino source terms in the equilibrium diffusion approximation (Keil et al. 1996; Pons et al. 1999).

Hydrodynamics

A variety of computational methods are employed to solve the equations of ideal hydrodynamics in the context of supernova explosions or the late stellar burning stages. Nowadays, the vast majority of codes use Godunov-based high-resolution shock capturing (HRSC) schemes with higher-order reconstruction (see, e.g., LeVeque 1998b; Martí and Müller 2015; Balsara 2017 for a thorough introduction). Examples include implementations of the piecewise parabolic method of Colella and Glaz (1985) or extensions thereof in the Newtonian hydroydnamics codes Prometheus (Fryxell et al. 1989, 1991; Müller et al. 1991), which has been integrated into various neutrino transport solvers by the Garching group (Rampp and Janka 2002; Buras et al. 2006b; Scheck et al. 2006), its offshoot Prompi (Meakin and Arnett 2007b), PPMstar (Woodward et al. 2019), VH1 (Blondin and Lufkin 1993; Hawley et al. 2012) as used within the Chimera transport code (Bruenn et al. 2018), Flash (Fryxell et al. 2000), Castro (Almgren et al. 2006), Alcar (Just et al. 2015), and Fornax (Skinner et al. 2019). This approach is also used in most general relativistic (GR) hydrodynamics codes for core-collapse supernovae like CoCoNuT (Dimmelmeier et al. 2002; Müller et al. 2010), Zelmani (Reisswig et al. 2013; Roberts et al. 2016), and GRHydro (Mösta et al. 2014a). Godunov-types scheme with piecewise-linear total-variation diminishing (TVD) reconstruction are still used in the Fish code (Käppeli et al. 2011) and in the relativistic Fugra code of (Kuroda et al. 2012, 2016b). The 3DnSNe code of the Fukuoka group (e.g., Takiwaki et al. 2012), which is based on the ZEUS code of Stone and Norman (1992), has also switched from an artificial viscosity scheme to a Godunov-based finite-volume approach with TVD reconstruction (Yoshida et al. 2019).

Alternative strategies are less frequently employed. The Vulcan code (Livne 1993; Livne et al. 2004) uses a staggered grid and von Neumann–Richtmyer artificial viscosity (Von Neumann and Richtmyer 1950). The SNSPH code (Fryer et al. 2006) uses smoothed-particle hydrodynamics (Gingold and Monaghan 1977; Lucy 1977; for modern reviews see Price 2012; Rosswog 2015). Although less widely used in supernova modeling today, the SPH approach has been utilized for some of the early studies of Rayleigh–Taylor mixing (Herant and Benz 1991) and convectively-driven explosions (Herant et al. 1992) in 2D and later for the first 3D supernova simulations with gray neutrino transport (Fryer and Warren 2002). Multi-dimensional moving mesh schemes have been occasionally employed to simulate magnetorotational supernovae (Ardeljan et al. 2005) and reactive-convective flow in stellar interiors (Dearborn et al. 2006; Stancliffe et al. 2011). More recently “second-generation” moving-mesh codes based on Voronoi tessellation, such as Tess (Duffell and MacFadyen 2011) and Arepo (Springel 2010), have been developed and employed for simulations of jet outflows (Duffell and MacFadyen 2013, 2015), and fallback supernovae (Chan et al. 2018). Spectral solvers for the equations of hydrodynamics, while popular for solar convection, have so far been applied only once for simulating oxygen burning (Kuhlen et al. 2003), but never for the core-collapse supernova problem.

Since Godunov-based finite-volume solvers are now most commonly employed for simulating core-collapse supernovae and the late burning stages, we shall focus on the problem-specific challenges for this approach in this section.

Problem geometry and choice of grids

The physical problem geometry in global simulations of core-collapse supernovae and the late convective burning stages is characterized by approximate spherical symmetry, and one frequently needs to deal with strong radial stratification and a large range of radial scales. For example, during the pre-explosion and early explosion phase, the PNS develops a “density cliff” at its surface that is approximately in radiative equilibrium and can be approximated as an exponential isothermal atmosphere with a scale height H of

$$\begin{aligned} H=\frac{k T R^2}{G M m_\mathrm {b} }, \end{aligned}$$
(6)

in terms of the PNS mass M, radius R, and surface temperature T, and the baryon mass \(m_\mathrm {b}\). With typical values of \(M\sim 1.5\,M_\odot \), R shrinking down to a final value of \(\mathord {\sim }12\, \mathrm {km}\), and a temperature of a few MeV, the scale height soon shrinks to a few \(100 \, \mathrm {m}\). Later on during the explosion, the scales of interest shift to the radius of the entire star, which is of order \(\mathord {\sim } 10^8 \, \mathrm {km}\) for red supergiants.

The spherical problem geometry and the multi-scale nature of the flow is a critical element in the choice of the numerical grid for “star-in-a-box” simulations.Footnote 2 Cartesian grids, various spherical grids, and, on occasion, unstructured grids have been used in the field for global simulations and face different challenges.

Grid-induced perturbations Cartesian grids have the virtue of algorithmic simplicity and do not suffer from coordinate singularities, but also come with disadvantages as they are not adapted to the approximate symmetry of the physical problem. The unavoidable non-spherical perturbations from the grid make it impossible to reproduce the spherically symmetric limit in multi-D even for perfectly spherical initial conditions, or to study the growth of non-spherical perturbations in a fully controlled manner. The former deficiency is arguably an acceptable sacrifice, though it can limit the possibilities for code testing and verification, but the latter can introduce visible artifacts in simulations. For example, Cartesian codes sometimes produce non-vanishing gravitational wave signals from the bounce of non-rotating cores (Scheidegger et al. 2010), and often show dominant \(\ell =4\) modes during the growth phase of non-radial instabilities (Ott et al. 2012).

Handling the multi-scale problem Furthermore, a single Cartesian grid cannot easily handle the multiple scales encountered in the supernova problem. Even with \(\mathcal {O}(1000^3)\) zones, such a grid can at best cover the region inside \(1000 \, \mathrm {km}\) with acceptable resolution, but following the infall of matter for several \(100 \, \mathrm {ms}\) without boundary artifacts and the development of an explosion requires covering a region of at least \(10,000 \, \mathrm {km}\). This problem is often dealt with by using adaptive mesh refinement (AMR; see, e.g., Berger and Colella 1989; Fryxell et al. 2000), which is usually implemented as “fixed mesh refinment” for pre-defined nested cubic patches (e.g., Schnetter et al. 2004). Other codes have opted to combine a single central Cartesian patch or nested patches with a spherically symmetric region (Scheidegger et al. 2010) or multiple spherical polar patches (Ott et al. 2012) outside. For long-time simulations of Rayleigh–Taylor mixing in the envelope, standard adaptive or pre-defined mesh refinement may not be sufficiently efficient for covering the range of changing scales and necessitate manual remapping to a coarser grids (“homographic expansion”) as the simulation proceeds (Chen et al. 2013).

In spherical coordinates, the multi-scale nature of the problem can be accommodated to a large degree by employing a non-uniform radial grid that transitions to roughly equal spacing in \(\log r\) at large radii. Radial resolution can be added selectively in strongly stratified regions like the PNS surface (Buras et al. 2006b), or one can use an adaptive moving radial grid (Liebendörfer et al. 2004; Bruenn et al. 2018). However, some care must be exercised in using non-uniform radial grids. Rapid variations in the radial grid resolution \(\varDelta r/r\) can produce artifacts such as artificial waves and disturbances of hydrostatic equilibrium.

It is also straightforward to implement a moving radial grid to adapt to changing resolution requirements or the bulk contraction/expansion of the region of interest; see Winkler et al. (1984), Müller (1994) for an explanation of this technique. The MPA/Monash group routinely apply such a moving radial grid in quasi-Lagrangian mode during the collapse phase (Rampp and Janka 2000), and, with a prescribed grid function, in parameterized multi-D simulations of neutrino-driven explosions (Janka and Müller 1996; Scheck et al. 2006) and in simulations of convective burning (Müller et al. 2016b). The Oak Ridge group uses a truly adaptive radial grid in their supernova simulations with the Chimera code (Bruenn et al. 2018). A moving radial mesh might also appear useful for following the expansion of the ejecta and the formation of a strongly diluted central region in simulations of mixing instabilities in the envelope, but the definition of an appropriate grid function is non-trivial. Most simulations of mixing instabilities in spherical polar coordinates have therefore relied on simply removing zones continuously from the evacuated region of the blast wave to increase the time step (Hammer et al. 2010) rather than implementing a moving radial mesh (Müller et al. 2018).

Both fixed mesh refinement and spherical grids with non-uniform radial mesh spacing only provide limited adaptability to the structure of the flow. Truly adaptive mesh refinement can provide superior resolution in cases where very tenuous, non volume-filling flow structures emerge. Mixing instabilities in the envelope are a prime example for such a situation, and have often been studied using AMR in spherical polar coordinates (Kifonidis et al. 2000, 2003, 2006) in 2D and Cartesian coordinates (Chen et al. 2017).

The time step constraint in spherical polar coordinates While spherical polar coordinates are well-adapted to the problem geometry, they also suffer from drawbacks. One of these drawbacks—among others that we discuss further below—is that the converging cell geometry imposes stringent constraints on the time step near the grid axis and the origin. The Courant–Friedrichs–Lewy limit (Courant et al. 1928) for the time step \(\varDelta t\) requires that \(\varDelta t < r \,\varDelta \theta /(|\mathbf {v}|+c_\mathrm {s})\) in 2D and \(\varDelta t < r \sin \theta \,\varDelta \varphi /(|\mathbf {v}|+c_\mathrm {s})\) in 3D in terms of the grid spacing \(\varDelta \theta \) and \(\varDelta \varphi \) in latitude and longitude, and the fluid velocity \(\mathbf {v}\) and sound speed \(c_\mathrm {s}\). If \(\varDelta \theta =\varDelta \varphi \), this is worse than a Cartesian code with grid spacing comparable to \(\varDelta r\) by a factor \(\varDelta \theta \ll 1\) in 2D and \(\varDelta \theta ^2 \ll 1\) in 3D near the origin.

Various workarounds have been developed to tame this time-step constraint. Some core-collapse supernova codes (Prometheus-Vertex, Chimera, CoCoNuT) simulate the innermost region of the grid assuming spherical symmetry. The approximation of spherical symmetry is well justified in the core, since the innermost region of the PNS is convectively stable during the first seconds after collapse and explosion until the late Kelvin–Helmholtz cooling phase. Even more savings can be achieved by treating the PNS convection zone using mixing-length theory (Müller 2015), but this is a more severe approximation that significantly affects the predicted gravitational wave signals and certain features of the neutrino emission and nucleosynthesis. Concerns have also been voiced that the imposition of a spherical core region creates an immobile obstacle to the flow that leads to the violation of momentum conservation, which might have repercussions on neutron star kicks (Nordhaus et al. 2010a). While it is true that the PNS tends not to move in simulations with a spherical core region, Scheck et al. (2006) found (using a careful analysis based on hydro simulations in the accelerated frame comoving with the PNS) that the assumption of an immobile PNS does not gravely affect the dynamics in the supernova interior and the PNS kick in particular.

Even with a 1D treatment for the innermost grid zones, one is still left with a severe time-step constraint at the grid axis in 3D. A number of alternatives to spherical polar grids with uniform spacing in latitude \(\theta \) and longitude \(\varphi \) can help to remedy this. The simplest workaround is to adopt uniform spacing in \(\mu =\cos \theta \) instead of \(\theta \). In this case, one has \(\sin \theta = (2N_\theta -1)^{1/2}/N_\theta \approx \sqrt{2} N_\theta ^{-1/2}\) in the zones adjacent to the axis for \(N_\theta \) zones in latitude instead of \(\sin \theta \approx N_\theta ^{-1}/2\), so the time step limit scales as \(\varDelta t \propto \sqrt{2} N_\theta ^{-1/2} N_\phi ^{-1}\) instead of \(\varDelta t \propto N_\theta ^{-1} N_\phi ^{-1}/2\), where \(N_\varphi \) is the number of zones in longitude. Alternatively, one can selectively increase the \(\theta \)-grid spacing in the zones close to the axis. However, the time step constraint at the axis is still more restrictive than at the equator in this approach, and the aspect ratio of the grid cells becomes extreme near the pole, which can create problems with numerical stability and accuracy.

Fig. 2
figure2

Alternative spherical grids that avoid the tight time step constraint at the axis of standard spherical polar grids: a Grid with mesh coarsening in the \(\varphi \)-direction only. Only an octant of the entire grid is shown. b Dendritic grid with coarsening in the \(\theta \)- and \(\varphi \)-direction Image reproduced with permission from (Skinner et al. 2019), copyright by AAS. c Overset Yin–Yang grid (Kageyama and Sato 2004; Wongwathanarat et al. 2010a) with two overlapping spherical polar patches in yellow and cyan

One approach to fully cure the time step problem, which was first proposed for simulations of compact objects by Cerdá-Durán (2009), consists in abandoning the logically Cartesian grid in r, \(\theta \), and \(\varphi \) and selectively coarsening the grid spacing in \(\varphi \) (and possibly \(\theta \)) near the axis (and optionally at small r) as illustrated in Fig. 2. Such a mesh coarsening scheme has been included in the CoCoNuT-FMT code (Müller 2015) with coarsening in the \(\varphi \)-direction, and as a “dendritic grid” with coarsening in the \(\theta \)- and \(\varphi \)-direction in the Fornax code (Skinner et al. 2019) and the 3DnSNe code (Nakamura et al. 2019). Mesh coarsening can be implemented following standard AMR practice by prolongating from the coarser grids to the finer grids in the reconstruction step. Alternatively, one can continue using the hydro solver on a fine uniform grid in \(\theta \) and \(\varphi \), and average the solution over coarse “supercells” after each time step, followed by a conservative prolongation or “pre-reconstruction” step back onto the fine grid to ensure higher-order convergence. This has the advantage of retaining the data layout and algorithmic structure of a spherical polar code, but care is required to to ensure that the prolongation of the conserved variables does not introduce non-monotonicities in the primitive variables, which limits the pre-reconstruction step to second-order accuracy in practice (Müller et al. 2019). A possible concern with mesh coarsening on standard spherical polar grids is that it may favor the emergence of axis-aligned bipolar flow structures during the explosion phase in supernova simulations (Müller 2015; Nakamura et al. 2019). In practice, however, strong physical seed perturbations easily break any grid-induced alignment of the flow with the axis. As the more simulations with mesh coarsening become available (Müller et al. 2019; Burrows et al. 2020), it does not appear that axis alignment is a recurring problem.

Filtering in Fourier space, which has long been used in the meteorology community (Boyd 2001), provides another means of curing the restrictive time step constraint near the axis and has been implemented in the CoCoNuT-FMT code (Müller et al. 2019). This can also be implemented with minimal interventions in a solver for spherical polar grids, and is attractive because the amount of smoothing that is applied to the solution increases more gradually towards the axis than with mesh coarsening schemes. Müller et al. (2019) suggests that this eliminates the problem of axis-aligned flows. On the downside, simulations with Fourier filtering may occasionally encounter problems with the Gibbs phenomenon at the shock.

More radical solutions to the axis problem include overset grids and non-orthogonal spherical grids. An overset Yin–Yang grid (Kageyama and Sato 2004) has been implemented in the Prometheus code (Wongwathanarat et al. 2010a; Melson 2013) and used successfully for simulations of supernovae and convective burning. The Yin–Yang grid provides near-uniform resolution in all directions, solves the time step problem, and also eliminates the delicate problem of boundary conditions at the axis of a spherical polar grid. The added algorithmic complexity is limited to interpolation routines that provide boundary conditions; since each patch is part of a spherical polar grid, no modifications of the hydro solver for non-orthogonal grids are required. As a downside, it is more complicated—but possible (Peng et al. 2006)—to implement overset grids in a strictly conservative manner. In future, non-orthogonal grids spherical grids (Ronchi et al. 1996; Calhoun et al. 2008; Wongwathanarat et al. 2016) may provide another solution that avoids the axis problem and ensures conservation in a straightforward manner, but applications are so far limited to other astrophysical problems (Koldoba et al. 2002; Fragile et al. 2009; Shiota et al. 2010).

Boundary conditions The definition of the outer boundary conditions for Cartesian grids can be more delicate and less flexible than for spherical grids. Simulations of supernova shock revival and the late convective burning stages usually do not cover the entire star for efficiency reasons, and sometimes it can even be desirable to excise an inner core region, e.g., the PNS interior in supernova simulations (e.g., Janka and Müller 1996; Scheck et al. 2008; Ugliano et al. 2012; Ertl et al. 2016; Sukhbold et al. 2016) or the Fe/Si core in the O shell burning models of Müller et al. (2017a). To minimize artifacts near the outer and (for annular domains) the inner boundary, the best strategy is often to impose hydrostatic boundary conditions assuming constant entropy, so that the pressure P, density \(\rho \), and radial velocity \(v_r\) in the ghost cells are obtained as

$$\begin{aligned} \mathrm {d}P&= \int \rho g \,\mathrm {d}r, \end{aligned}$$
(7)
$$\begin{aligned} \mathrm {d}\rho&= \int c_\mathrm {s}^{-2} \,\mathrm {d}P,\end{aligned}$$
(8)
$$\begin{aligned} v_r&= 0. \end{aligned}$$
(9)

in terms of the radial gravitational acceleration g and the sound speed \(c_\mathrm {s}\) (cf. Zingale et al. 2002 for hydrostatic extrapolation in the plane-parallel case). This can be readily implemented for spherical grid, and the same is true for inflow, outflow, or wall boundary conditions for excised outer shells or an excised core.

In Cartesian coordinates, however, defining boundary conditions as a function of radius is at odds with the usual strategy of enforcing the boundary conditions by populating ghost zones along individual grid lines separately. For pragmatic reasons, one often enforces standard boundary conditions (reflecting/inflow/outflow) on the faces of the cubical domain instead (e.g., Couch et al. 2015), which is viable as long as the domain boundaries are sufficiently distant from the region of interest. Alternatively, one can impose fixed boundary conditions inside the cubical domain, but outside a smaller spherical region of interest (e.g., Woodward et al. 2018). Outflow conditions on an interior boundary, e.g., for fallback onto a compact remnant, can also be implemented relatively easily (Joggerst et al. 2009).

On the other hand, the boundary conditions at the axis and the origin require careful consideration in case of a spherical polar in order to minimize artifacts from the grid singularities. Conventionally, one uses reflecting boundary conditions to populate the ghost zones before performing the reconstruction in the r- and \(\theta \)-direction, i.e., one assumes odd parity for the velocity components \(v_r\) and \(v_\theta \) respectively, and even parity for scalar quantities and the transverse velocity components. This usually ensures that \(v_r\) and \(v_\theta \) do not blow up near the grid singularities, but in some cases stronger measures are required; e.g., one can enforce zero \(v_r\) or \(v_\theta \) in the cell next to the origin/grid axis, or switch to step function reconstruction in the first cell. One may also need to impose odd parity for \(v_\varphi \) or for better stability, or reconstruct the angular velocity component \(\omega _\varphi =v_\varphi /r\) instead of \(v_\varphi \).

No hard-and-fast rules for such fixes at the axis and the origin can be given, except perhaps that one should also consider treating the geometric source terms in spherical polar coordinates differently (see below) before applying fixes to the boundary conditions that reduce the order of reconstruction, or before manually damping or zeroing velocity components. In fact, the symmetry assumptions behind reflecting boundary conditions (i.e., \(v_r\rightarrow 0\) at the origin and \(v_\theta \rightarrow 0\)) are actually too strong. Strictly speaking, one should only impose the condition that the Cartesian velocity components \(v_x\), \(v_y\), and \(v_z\) are continuous across the singularity for smooth flow. In principle, this can be accommodated during the reconstruction by populating the ghost zones for \(r<0\), \(\theta <0\), and \(\theta >\pi \) with values from the corresponding grid lines across the origin or the axis, bearing in mind any flip in direction of the basis vectors \(\mathbf {e}_r\), \(\mathbf {e}_\theta \), and \(\mathbf {e}_\varphi \) across the coordinate singularity. For the reconstruction along the radial grid line with constant \(\theta \) and \(\varphi \), this comes down to defining

$$\begin{aligned} v_r (r)&=\, \left\{ \begin{array}{ll} v_r(r,\theta ,\varphi ),&{} r>0\\ -v_r(r,\pi -\theta ,\varphi +2\pi ),&{} r<0 \end{array} \right. \end{aligned}$$
(10)
$$\begin{aligned} v_\theta (r)&=\, \left\{ \begin{array}{ll} v_\theta (r,\theta ,\varphi ),&{} r>0\\ v_\theta (r,\pi -\theta ,\varphi +2\pi ),&{} r<0 \end{array} \right. \end{aligned}$$
(11)
$$\begin{aligned} v_\varphi (r)&=\, \left\{ \begin{array}{ll} v_\varphi (r,\theta ,\varphi ),&{} r>0\\ -v_\varphi (r,\pi -\theta ,\varphi +2\pi ),&{} r<0 \end{array} \right. , \end{aligned}$$
(12)

and, similarly, for reconstruction in the \(\theta \)-direction along a grid line with constant r and \(\varphi \):

$$\begin{aligned} v_r (\theta )&=\, \left\{ \begin{array}{ll} v_r(r,-\theta ,\varphi +2\pi ),&{} \theta<0\\ v_r(r,\theta ,\varphi ),&{} 0<\theta<\pi \\ v_r(r,2\pi -\theta ,\varphi +2\pi ),&{} \pi <\theta \end{array} \right. \end{aligned}$$
(13)
$$\begin{aligned} v_\theta (\theta )&= \,\left\{ \begin{array}{ll} -v_\theta (r,-\theta ,\varphi +2\pi ),&{} \theta<0\\ v_\theta (r,\theta ,\varphi ),&{} 0<\theta<\pi \\ -v_\theta (r,2\pi -\theta ,\varphi +2\pi ),&{} \pi <\theta \end{array} \right. \end{aligned}$$
(14)
$$\begin{aligned} v_\varphi (\theta )&=\, \left\{ \begin{array}{ll} -v_\varphi (r,-\theta ,\varphi +2\pi ),&{} \theta<0\\ v_\varphi (r,\theta ,\varphi ),&{} 0<\theta<\pi \\ -v_\varphi (r,2\pi -\theta ,\varphi +2\pi ),&{} \pi <\theta \end{array} \right. . \end{aligned}$$
(15)

This allows for non-zero values of \(v_r\) and \(v_\theta \) at the origin to reflect that matter can flow across the origin and the axis. Such special polar boundary conditions have been implemented for 3D light-bulbFootnote 3 simulations of SASI and convection with Flash (Fernández 2015), and are also used in the Fornax code (Skinner et al. 2019). In practice, however, reflecting boundary conditions do not appear to pose a major obstacle for flows across the axis or the origin if the diverging fictitious force terms are treated appropriately (see below). The reason is that reflecting boundaries merely slightly degrade the accuracy of the first cell interfaces away from the origin and the axis; the fact that velocity components at the coordinate singularity are (incorrectly) forced to zero on the cell interfaces at \(r=0\), \(\theta =0\), and \(\theta =\pi \) does not matter much because these interfaces have a vanishing surface area, so that the interface fluxes must vanish anyway.

Geometric source terms Another obstacle in spherical polar coordinates is the occurrence of fictitious force terms in the momentum equation. In terms of the density \(\rho \) and the orthonormal components \(v_i\) and \(g_i\) of the velocity and gravitational acceleration, the equations read,

$$\begin{aligned}&\frac{\partial \rho v_r}{\partial t} +\frac{1}{r^2}\frac{\partial r^2 \rho v_r^2}{\partial r} +\frac{1}{r^2}\frac{\partial r^2 \rho v_r v_\theta }{\partial r} +\frac{1}{r^2}\frac{\partial r^2 \rho v_r v_\varphi }{\partial r} +\frac{\partial P}{\partial r} = \rho g_r + \rho \frac{v_\theta ^2+v_\varphi ^2}{r},\end{aligned}$$
(16)
$$\begin{aligned}&\frac{\partial \rho v_\theta }{\partial t} +\frac{1}{r \sin \theta }\frac{\partial \sin \theta \rho v_r v_\theta }{\partial \theta } +\frac{1}{r \sin \theta }\frac{\partial \sin \theta \rho v_\theta ^2}{\partial \theta } +\frac{1}{r \sin \theta }\frac{\partial \sin \theta \rho v_\theta v_\varphi }{\partial \theta } \nonumber \\&\qquad +\frac{1}{r}\frac{\partial P}{\partial \theta } = \rho g_\theta + \rho \frac{\cot \theta v_\varphi ^2-v_r v_\theta }{r}, \end{aligned}$$
(17)
$$\begin{aligned}&\frac{\partial \rho v_\varphi }{\partial t} +\frac{1}{r \sin \theta }\frac{\partial \rho v_r v_\varphi }{\partial \varphi } +\frac{1}{r \sin \theta }\frac{\partial \rho v_\theta v_\varphi }{\partial \varphi } +\frac{1}{r \sin \theta }\frac{\partial \rho v_\varphi ^2}{\partial \varphi } +\frac{1}{r \sin \theta }\frac{\partial P}{\partial \varphi } \nonumber \\&\quad = \rho g_\varphi -\rho \frac{v_r v_\varphi +v_\theta v_\varphi \cot \theta }{r}, \end{aligned}$$
(18)

where the fictitious force terms are singular at the origin and at the axis. Often straightforward time-explicit discretization is sufficient for these source terms, especially in unsplit codes with Runge–Kutta time integration. In a dimensionally split implementation it can be advantageous to include a characteristic state correction in the Riemann problem due to (some of the) fictitious force terms (Colella and Woodward 1984). Time-centering of the geometric source terms can also lead to minor differences (Fernández 2015).

When stability problems or pronounced axis artifacts are encountered, one can adopt a more radical solution and transport the Cartesian momentum density \(\rho {\mathbf {v}}=(\rho v_x,\rho v_y,\rho v_z)\) while still using the components \(v_\alpha \) in the spherical polar basis as advection velocities, so that the fictitious force terms disappear entirely,

$$\begin{aligned} \frac{\partial \rho \mathbf {v}}{\partial t} + \frac{1}{\sqrt{\gamma }}\frac{\partial \sqrt{\gamma } \rho \mathbf {v} v^\alpha }{\partial x^\alpha } + \nabla P =\rho \mathbf {g}, \end{aligned}$$
(19)

where \(\alpha \in \{r,\theta ,\varphi \}\) and \(\gamma =r^4 \sin ^2 \theta \) is the determinant of the metric. This has been implemented in the CoCoNuT-FMT code as one of several options for the solution of the momentum equation. Transforming back and forth between the spherical polar basis for the reconstruction and solution of the Riemann problem and the Cartesian components for the update of the conserved quantities might appear cumbersome, but in fact one need not explicitly transform to Cartesian components at all. Instead one only needs to rotate vectorial quantities from the interface to the cell center when updating the momentum components in the spherical polar basis. For example, for uniform grid spacing \(\varDelta \theta \) in the \(\theta \)-direction, the flux difference terms from the \(\theta \)-interfaces j and \(j+1\) for updating \(\rho v_r\) and \(\rho v_\theta \) in zone \(j+1/2\) become

$$\begin{aligned} \left( \frac{\partial \rho v_{r,j+1/2}}{\partial t}\right) _\theta&= \cos \varDelta \theta /2 [ \rho v_r v_\theta \varDelta A]_{j} +\sin \varDelta \theta /2 [ (\rho v_\theta ^2+P) \varDelta A]_{j} \nonumber \\&-\cos \varDelta \theta /2 [ \rho v_r v_\theta \varDelta A]_{j+1} +\sin \varDelta \theta /2 [ (\rho v_\theta ^2+P) \varDelta A]_{j+1}, \end{aligned}$$
(20)
$$\begin{aligned} \left( \frac{\partial \rho v_{\theta ,j+1/2}}{\partial t}\right) _\theta&= \cos \varDelta \theta /2 [ (\rho v_\theta ^2+P) \varDelta A]_{j} -\sin \varDelta \theta /2 [\rho v_r v_\theta \varDelta A]_{j}\nonumber \\&-\cos \varDelta \theta /2 [(\rho v_\theta ^2+P)\varDelta A]_{j+1} -\sin \varDelta \theta /2 [ \rho v_r v_\theta \varDelta A]_{j+1}, \end{aligned}$$
(21)

where \(\varDelta A\) is the interface area and \(\varDelta \theta \) is the grid spacing in the \(\theta \)-direction, which is assumed to be uniform here. The term for \(\rho v_\varphi \) is not modified at all. Apart from eliminating the fictitious force terms in favor of flux flux terms, this alternative discretization of the momentum advection and pressure terms also complies with the conservation of total momentum (although discretization of the gravitational source term may still violate momentum conservation).

The singularities at the origin and the pole constitute a more severe problem for relativistic codes using free evolution schemes for the metric, where they can jeopardize the stability of the metric solver. We refer to Baumgarte et al. (2013, 2015) for a robust solution to this problem that has been implemented in their nada code; they employ a reference metric formulation both for the field equations and the fluid equations that factors out metric terms that become singular and use a partially implicit Runge–Kutta scheme to evolve the problematic terms.

Angular momentum conservation A somewhat related issue concerns the violation of angular momentum conservation in standard finite-volume codes (both with Eulerian grids and moving meshes). This is a concern especially for problems such as convection in rotating stars and magnetorotational explosions where rotation plays a major dynamical role and the evolution of the flow needs to be followed over long time scales. It is also an issue for question such as pulsar spin-up by asymmetric accretion, although a post-processing of the numerical angular momentum flux can help to obtain meaningful results even when there is a substantial violation of angular momentum conservation (Wongwathanarat et al. 2013).

Fig. 3
figure3

Profiles of the mass-weighted, spherically-averaged specific angular momentum \(\langle j\rangle \) in 3D simulations of convective oxygen shell burning in a rapidly rotating gamma-ray burst progenitor with an initial helium core mass of \(16\,M_\odot \) from Woosley and Heger (2006). The red and blue curves show the angular momentum distribution after a simulation time of \(522 \, \mathrm {s}\) (about 15 convective turnovers) for the standard formulation (red) of the fictitious force terms in Eqs. (17,18) and for the alternative formulation (blue) in Eqs. (23,24). The initial angular momentum profile is shown in black. The alternative formulation reduces the violation of global angular momentum conservation from 20% to 7%. However, for the standard formulation the effects of angular momentum non-conservation are much bigger locally than suggested by the global conservation error. The fictitious force terms proportional to \(v_r\) lead to a considerable loss of angular momentum near the reflecting boundaries, even though the angular momentum flux through the boundaries is exactly zero, and there is a spurious increase of angular momentum at the bottom of the convective oxygen burning shell outside the mass coordinate \(m=2.7\,M_\odot \)

The problem of angular momentum non-conservation can be solved, or at least mitigated, using Discontinuous Galerkin methods (Després and Labourasse 2015; Mocz et al. 2014; Schaal et al. 2015), which are not currently used in this field however, and can be avoided entirely in SPH (Price 2012). For a given numerical scheme, increasing the resolution is usually the only solution to minimize the conservation error, but in 3D spherical polar coordinates (and in 2D cylindrical coordinates), one can still ensure exact conservation of the angular momentum component \(L_z\) along the grid axis by conservatively discretizing the conservation equation for \(\rho v_\varphi r \sin \theta \),

$$\begin{aligned}&\frac{\partial \rho v_\varphi r \sin \theta }{\partial t} +\frac{1}{r \sin \theta }\frac{\partial \rho v_r v_\varphi r \sin \theta }{\partial \varphi } +\frac{1}{r \sin \theta }\frac{\partial \rho v_\theta v_\varphi r \sin \theta }{\partial \varphi } \nonumber \\&\quad +\frac{1}{r \sin \theta }\frac{\partial \rho v_\varphi ^2 r \sin \theta }{\partial \varphi } +\frac{\partial P}{\partial \varphi } = \rho g_\varphi r \sin \theta , \end{aligned}$$
(22)

instead of Eq. (18). Incidentally, this angular-momentum conserving formulation emerges automatically in GR hydrodynamics in spherical polar coordinates if one solves for the covariant momentum density components as in the CoCoNuT code (Dimmelmeier et al. 2002). However, as a price for exact conservation of \(L_z\), one occasionally encounters very rapid rotational flow around the axis, and enforcing conservation of only one angular component may add to artificial flow anisotropies due to the spherical polar grid geometry. Moreover, this recipe cannot be used for Yin–Yang-type overset spherical grids or for non-orthogonal spherical grids. If angular momentum conservation is a concern, one can, however, resort to a compromise by conservatively discretizing the equations for \(\rho v_\theta r\) and \(\rho v_\varphi r\),

$$\begin{aligned}&\frac{\partial \rho v_\theta r}{\partial t} +\frac{1}{r \sin \theta }\frac{\partial \sin \theta \rho v_r v_\theta r}{\partial \theta } +\frac{1}{r \sin \theta }\frac{\partial \sin \theta \rho v_\theta ^2 r}{\partial \theta } +\frac{1}{r \sin \theta }\frac{\partial \sin \theta \rho v_\theta v_\varphi r}{\partial \theta } \nonumber \\&\qquad +\frac{\partial P}{\partial \theta } = \rho g_\theta r + \rho \frac{\cot \theta v_\varphi ^2}{r} \end{aligned}$$
(23)
$$\begin{aligned}&\frac{\partial \rho v_\varphi r}{\partial t} +\frac{1}{r \sin \theta }\frac{\partial \rho v_r v_\varphi r}{\partial \varphi } +\frac{1}{r \sin \theta }\frac{\partial \rho v_\theta v_\varphi r}{\partial \varphi } +\frac{1}{r \sin \theta }\frac{\partial \rho v_\varphi ^2 r}{\partial \varphi } +\frac{1}{ \sin \theta }\frac{\partial P}{\partial \varphi } \nonumber \\&\quad = \rho g_\varphi r -\rho \frac{v_\theta v_\varphi \cot \theta }{r}, \end{aligned}$$
(24)

which eliminates some of the fictitious force terms. This sometimes considerably improves angular momentum conservation for all angular momentum components and works for any spherical grid. Figure 3 illustrates the difference between the standard form of the fictitious force term and the alternative form in Eqs. (23,24) for a simulation of oxygen shell convection in a rapidly rotating gamma-ray burst progenitor.

Challenges of subsonic turbulent flow

In the final stages of massive stars one encounters a broad range of flow regimes. The convective Mach number, i.e., the ratio of the typical convective velocity to the sound speed, during the advanced convective burning stages is fairly low, ranging from \(\mathord {\sim } 10^{-3}\) or less (Cristini et al. 2017) to \(\mathord {\sim }0.1\)–0.2 in the innermost shells at the onset of collapse (Collins et al. 2018). The flow is highly turbulent with Reynolds numbers of order \(10^{13}\)\(10^{16}\). During the supernova, one finds convective Mach numbers from a few \(10^{-2}\) in PNS convection to \(\mathord {\sim } 0.4\) (Müller and Janka 2015; Summa et al. 2016) in the gain region around shock revival, and as the shock propagates through the envelope, the flow becomes extremely supersonic with Mach numbers of up to several \(10^2\). The flow in unstable regions is typically highly turbulent. In the gain region, one obtains a nominal Reynolds number of order \(10^{17}\) based on the neutron viscosity (Abdikamalov et al. 2015), but non-ideal effects of neutrino viscosity and drag play a role in the environment of the PNS (Burrows and Lattimer 1988; Guilet et al. 2015; Melson et al. 2020). In the outer regions of the PNS convection zone neutrino viscosity keeps the Reynolds number as low as \(\mathord {\sim }100\) during some phases (Burrows and Lattimer 1988), and in the gain region drag effects are still so large that the flow cannot be assumed to behave like ordinary high-Reynolds number flow (Melson et al. 2020). Later on, as the shock propagates through the envelope and mixing instabilities develop, neutrino drag becomes unimportant, and the flow is again in the regime of very high Reynolds numbers.

Both the vast range in Mach number and the turbulent nature of the flow present challenges for the accuracy and robustness of numerical simulations. While even relatively simple HRSC schemes can deal with Mach numbers of \(\mathrm {Ma}\gtrsim 1\) with aplomb at reasonably high resolution, they can become excessively diffusive at low Mach number because of spurious acoustic waves arising from the discontinuities in the reconstruction (Guillard and Murrone 2004; Miczek et al. 2015). Moreover, the acoustic time step constraint in explicit finite-volume codes becomes excessively restrictive at low Mach number compared to the physical time scales of interest. These problems can be dealt with by using various low-Mach number approximations, e.g., the anelastic approximation as in Glatzmaier (1984), or more general formulations as in the Maestro code of Nonaka et al. 2010, or by a time-implicit discretization of the full compressible Euler equations (Viallet et al. 2011; Miczek et al. 2015). However, few studies (Kuhlen et al. 2003; Michel 2019) have used such methods to deal with low-Mach number flow in the late stages of convective burning massive stars as yet; they have been employed more widely to study, e.g., the progenitors of thermonuclear supernovae (Zingale et al. 2011; Nonaka et al. 2012)

Part of the reason is that advanced HRSC schemes remain accurate and competitive down to Mach numbers of \(10^{-2}\) and below depending on the reconstruction method and the Riemann solver. Although it is impossible to decide a priori whether a particular choice of methods is adequate for a given physical problem, or what its resolution requirements are, it is useful to be aware of strengths and weaknesses of different schemes in the context of subsonic turbulent flow. Unfortunately, our discussion of these strengths and weaknesses must remain rather qualitative because very few studies in the field have compared the performance of different Riemann solvers and reconstruction schemes in full-scale simulations and not only for idealized test problems.

Fig. 4
figure4

Snapshots of the entropy from 2D simulations of the \(20\,M_\odot \) progenitor of Woosley and Heger (2007) using the CoCoNuT-FMT code at post-bounce time of \(137 \, \mathrm {ms}\) (top row), \(163 \, \mathrm {ms}\) (middle row), and \(226 \, \mathrm {ms}\) (bottom row). The left and rights halves of the panels in the left column show the results for the HLLE and HLLC solver with standard PPM reconstruction. The left and rights halves of the panels in the right column show the results for second-order reconstruction using the MC limiter and 6th-order extremum-preserving PPM reconstruction using the HLLC solver

Riemann solvers For supernova simulations with Godunov-based codes, a variety of Riemann solvers are currently used. Newtonian codes typically opt either for a (nearly) exact solution of the Riemann problem following Colella and Glaz (1985)Footnote 4 or for approximate solvers that at least take the full wave structure of the Riemann problem into account such as the HLLC solver (Toro et al. 1994). On the other hand, the majority of relativistic simulations still resort to the HLLE solver (Einfeldt 1988) because of the added complexity of full-wave approximate Riemann solvers for GR hydrodynamics; exceptions include the CoCoNuT code which routinely uses the relativistic HLLC solver of Mignone and Bodo (2005), the CoCoNuT simulations of Cerdá-Durán et al. (2005) using the Marquina solver (Donat and Marquina 1996), and the convection simulations with the WhiskeyTHC code (Radice et al. 2016), which uses a Roe-based flux-split scheme.

The use of simpler solvers in GR simulations is a concern because one-wave schemes behave significantly worse than full-wave solvers in the subsonic regime. The problem of excessive acoustic noise from the discontinuities introduced by the reconstruction is exacerbated because solvers like HLLE essentially have these discontinuities decay only into acoustic waves. This results in stronger numerical diffusion, but can also create artificial numerical noise because the diffusive terms in the HLLE or Rusanov flux can generate spurious pressure perturbations from isobaric conditions. While higher-order reconstruction can beat down the numerical diffusion for smooth flows, a strong degradation in accuracy is unavoidable in turbulent flow with structure on all scales.

Few attempts have as yet been made to quantify the impact of the more diffusive one-wave solvers in supernova simulations. In an idealized setup, the problem was addressed by Radice et al. (2015), who conducted simulations of stirred isotropic turbulence with solenoidal forcing with a turbulent Mach number of \(\mathrm {Ma}\sim 0.35\), two different Riemann solvers (HLLE vs. HLLC), and different reconstruction methods and grid resolutions. Even when using PPM reconstruction, they still found substantial differences between the HLLE and HLLC solver in the spectral properties of the turbulence with the HLLE runs requiring about \(50\%\) more zones to achieve an equivalent resolution of the turbulent cascade to HLLC. Although it is not easy to extrapolate from the idealized setup of Radice et al. (2015) to core-collapse supernova simulations, one must clearly expect resolution requirements to depend sensitively on the Riemann solver. This is also illustrated by 2D supernova simulations comparing the HLLE and HLLC solver using the CoCoNuT-FMT code as shown in Fig. 4: starting from the initial seed perturbations, the HLLC model shows a faster growth of large-scale SASI shock oscillations during its linear phase and earlier emergence of parasitic instabilities (see Sect. 4.4 for the physics behind the SASI) due to the smaller amount of numerical dissipation. The evolution of the shock differs significantly during the first \(100 \, \mathrm {ms}\) of SASI activity, although the models become similar in terms of shock radius and shock asphericity later on. However, even then HLLC run consistently shows a higher entropy contrast and higher non-radial velocities within the gain region.

Reconstruction methods Similar concerns (not restricted to the low-Mach number regime) as for the simpler Riemann solvers can be raised about the order of the reconstruction scheme. There is certainly a clear divide between second-order piecewise linear reconstruction and higher-order methods like the PPM, WENO (weighted essentially non-oscillatory Shu 1997), and higher-order monotonicity-preserving (MP Suresh and Huynh 1997) schemes. In their simulations of forced subsonic turbulence, Radice et al. (2015) found similar differences between second-order reconstruction using the monotonized central (MC; van Leer 1977) limiter—one of the shaper second-order limiters—and runs using PPM or WENO as between the HLLC and HLLE solver. Again, the lower accuracy of second-order schemes is often clearly visible in full supernova simulations, which is again illustrated in Fig. 4 for the same setup as above. Similar to the HLLE run, the simulations using the MC limiter shows a delayed growth of the SASI and less small-scale structure.

Comparing the more modern higher-order reconstruction methods is much more difficult. For smooth problems like single-mode linear waves solutions, going beyond the original 4th-order PPM method of Colella and Woodward (1984) to methods of 5th order or higher can substantially reduce numerical dissipation; in the optimal case, the dissipation decreases with the grid spacing \(\varDelta x\) as \(\varDelta x^{-q}\) for a qth order method (Rembiasz et al. 2017). However, the higher-order scaling of the numerical dissipation cannot be generalised to turbulent flow, because the dissipation of the shortest realizable modes at the grid scale does not increase as a higher power of q. Based on similarity arguments, one can work out that the effective Reynolds number of turbulent flow increases only as \(\mathrm {Re}\sim \varDelta x ^{-4/3}\) (Müller 2016) and not as \(\varDelta x^{q}\) as one might hope. The reason behind this limitation is that increasing the order of reconstruction does not increase the maximum wavenumber \(k_{\max }\) of modes that can be represented on the grid, it merely limits numerical dissipation to a narrow band of wavenumbers below \(k_{\max }\).

For the moderately subsonic turbulent flow in core-collapse supernovae during the accretion phase, higher-order reconstruction often does not bring any tangible improvements for this reason. Figure 4 again shows this by comparing runs using standard 4th-order PPM and the 6th-order extremum-preserving PPM method of Colella and Sekora (2008). In both cases, the evolution of the shock is very similar, even though the phases of the SASI oscillations eventually falls out of sync. It is not obvious by visual inspection that the higher-order method allows smaller structures to develop. Only upon deeper analysis can small differences between the two methods be found, for example the model with extremum-preserving reconstruction maintains a measurably higher entropy contrast in the gain region and a slightly higher turbulent kinetic energy in the gain region.

There are nonetheless situations where it is useful to adopt extremum-preserving methods of very high order in global simulations of turbulent flow. First, such methods open up the regime of low Mach numbers to explicit Godunv-based codes. Using their Apsara code, Wongwathanarat et al. (2016) were able to solve the Gresho vortex problem (Gresho and Chan 1990) with little dissipation down to a Mach number of \(10^{-4}\) with the extremum-preserving PPM method of Colella and Sekora (2008), which is about two orders of magnitude better than for the MC limiter (Miczek et al. 2015), and about one order of magnitude better than for standard PPM.

Modern higher-order methods can also be crucial in certain simulations of mixing at convective boundaries and nucleosynthesis. In the case of convective boundary mixing, this has been stressed and investigated by Woodward et al. (2010, 2014), who achieve higher accuracy for the advection of mass fractions in their PPMstar code by evolving moments of the concentration variables within each cell (which is somewhat reminiscent of the Discontinuous Galerkin method). They found that this Piecewise-Parabolic Boltzmann method only requires half the resolution of standard PPM to achieve the same accuracy (Woodward et al. 2010). Higher-order extremum-preserving methods may also prove particularly useful for minimizing the numerical diffusion of mass fractions in models of Rayleigh–Taylor mixing during the supernova explosion phase, but this is yet to be investigated.

High-Mach number flow

Some of the considerations for subsonic flow carry over to the supersonic and transsonic flow encountered during the supernova explosion phase where mixing instabilities also lead to turbulence, but there are also problems specific to the supersonic regime.

Sonic points It is well known that the original Roe solver produces spurious expansion shocks in transsonic rarefaction fans, which needs to be remedied by some form of entropy fix (Laney 1998; Toro 2009). While other full-wave solvers—like the exact solver and HLLC—never fail as spectacularly as Roe’s, they are still prone to mild instabilities at sonic points. Under adverse conditions, these instabilities can be amplified and turn into a serious numerical problem. In this case, it is advisable to switch to a more dissipative solver such as HLLE in the vicinity of the sonic point. In supernova simulations, this problem is sometimes encountered in the neutrino-driven wind that develops once accretion onto the PNS has ceased. It can also occur prior to shock revival in the infall region and severely affect the infall downstream of the instability, especially when nuclear burning is included. In this case, the problem can be easily overlooked or misidentified because it usually manifests itself as an unusually strong stationary burning front, which may seem perfectly physical at first glance.

Odd-even decoupling and the carbuncle phenomenon Full-wave solvers like the exact solver and HLLC are subject to an instability at shock fronts (Quirk 1994; Liou 2000): for grid-aligned shocks, insufficient dissipation in the direction parallel to the shock can cause odd-even decoupling in the solution, which manifests itself in artificial stripe-like patterns downstream of the shock (Fig. 5). When the shock is only locally tangential to a grid line, this instability can give rise to protrusions, which is known as the carbuncle phenomenon. In supernova simulations, odd-even decoupling was first recognized as a problem by Kifonidis et al. (2000), and since then the majority of supernova codes (e.g., Prometheus, Flash, CoCoNuT, Fornax) have opted to handle this problem by adaptively switching to the more dissipative HLLE solver at strong shocks following the suggestion of Quirk (1994). The Chimera code (Bruenn et al. 2018) adopts the alternative approach of a local oscillation filter (Sutherland et al. 2003), which has the advantage of not degrading the resolution in the direction perpendicular to the shock, but has the drawback of allowing the instability to grow to a minute level (which may be undetectable in practice) before smoothing is applied to the solution. The carbuncle phenomenon can also occur in Richtmyer-type artificial viscosity schemes and be cured by modifying the artificial viscosity (Iwakami et al. 2008). The carbuncle instability remains a subject of active research in computational fluid dynamics, and a number of papers (e.g., Nishikawa and Kitamura 2008; Huang et al. 2011; Rodionov 2017; Simon and Mandal 2019) have attempted to construct Riemann solvers or artificial viscosity schemes that avoid the instability without sacrificing accuracy away from shocks, and may eventually prove useful for supernova simulations.

Fig. 5
figure5

Odd-even decoupling in a 2D core-collapse supernova simulation of a \(20\,M_\odot \) star with the CoCoNuT that uses the HLLC solver everywhere instead of switching to HLLE at shocks. The left and right panels show the radial velocity in units of the speed of light and the entropy s in units of \(k_\mathrm {b}/\mathrm {nucleon}\) about \(10 \, \mathrm {ms}\) after bounce. The characteristic radial streaks from odd-even decoupling are clearly visible behind the shock

Kinetically-dominated flow In HRSC codes that solve the total energy equation, one obtains the mass-specific internal energy \(\varepsilon \) by subtracting the kinetic energy \(v^2/2\) from the total energy e. In high Mach-number flow, one has \(e\gg \varepsilon \) and \(v^2/2 \gg \varepsilon \), and hence subtracting these two large terms can introduce large errors in the internal energy density and the pressure and sometimes leads to severe stability problems. A similar problem can occur in magnetically-dominated regions in MHD. Sometimes the resulting stability problems can be remedied by evolving the internal energy equation

$$\begin{aligned} \frac{\partial \rho \varepsilon }{\partial t} +\nabla \cdot \left( \rho \varepsilon \mathbf {v} + P\mathbf {v}\right) -\mathbf {v}\cdot \nabla P = -\rho \mathbf {v} \cdot \nabla \varPhi + Q_\mathrm {e} + \mathbf {Q}_\mathrm {m} \cdot \mathbf {v}, \end{aligned}$$
(25)

instead of Eq. (3) in regions of high Mach number or low plasma-\(\beta \). However, in doing so one sacrifices strict energy conservation, and hence one should apply this recipe as parsimoniously as possible.

Treatment of gravity

Convective burning and core-collapse supernovae introduce specific challenges in the treatment of gravity. In the subsonic flow regimes, one needs to be wary of introducing undue artifical perturbations from hydrostatic equilibrium and take care to avoid secular conservation errors. Moreover, in the core-collapse supernova problem, general relativistic effects become important in the vicinity of the PNS.

Hydrostatic balance and conservation properties

For nearly hydrostatic flow, one has \(\nabla P \approx - \rho \nabla \varPhi \), but this near cancellation is not automatically reflected in the numerical solution when using a Godunov-based scheme. Instead, the stationary numerical solution may be one with non-zero advection terms that are exactly (but incorrectly) balanced by the gravitational source term (Greenberg and Leroux 1996; LeVeque 1998a). Schemes that avoid this pathology are called well-balanced. The proper cancellation between the pressure gradient and the gravitational source term is particularly delicate if those two terms are treated in operator-split steps. Different methods have been proposed to incoroprate well-balancing into Godunov-based schemes. One approach is to use piecewise hydrostatic reconstruction (e.g., Kastaun 2006; Käppeli and Mishra 2016). A related technique suggested by LeVeque (1998a) introduces discrete jumps in the middle of cells to obtain modified interface states for the Riemann problem and absorb the source terms altogether.

In practice, these special techniques are not used widely in the field for two reasons. First, it is not trivial to general these schemes to achieve higher-order accuracy. Second, one already obtains a very well-balanced scheme by combing higher-order reconstruction, an accurate Riemann solver, and unsplit time integration. For split schemes, one can ensure a quite accurate cancellation of the pressure gradient and the source term by including a characteristic state correction as described by Colella and Woodward (1984) for the original PPM method.

Nevertheless, the cancellation of the pressure gradient and the gravitational source term in hydrostatic equilibrium is usually not perfect and typically leads to minute odd-even noise in the velocity field that is almost undetectable by eye. Computing the gravitational source term \(\rho \mathbf {v}\cdot \nabla \varPhi \) in the energy equation using such a noisy velocity field \(\mathbf {v}\) can lead to an appreciable secular drift of the total energy. For example, spurious energy generation can stop proto-neutron star cooling on simulation time scales longer than a second (Müller 2009). This problem can be circumvented by discretizing the energy equation starting from the form (Müller et al. 2010)

$$\begin{aligned} \frac{\partial \rho (\varepsilon + v^2/2+\varPhi )}{\partial t} \nabla \cdot \left[ \rho \mathbf {v} (\varepsilon + v^2/2+\varPhi ) +P\mathbf {v}\right] = \rho \frac{\partial \varPhi }{\partial t}. \end{aligned}$$
(26)

This guarantees exact total energy conservation if the time derivative of the gravitational potential is zero. Under certain conditions, exact total energy conservation can be achieved for a time-dependent self-gravitating configuration as well, and the method can also be generalized to the relativistic case.

In principle, one can also implement the gravitational source term (in the Newtonian approximation) in the momentum equation in a conservative form by writing \(\rho \mathbf {g}\) as the divergence of a gravitational stress tensor (Shu 1992). Such a scheme has been implemented by Livne et al. (2004) in the Vulcan code. However, this procedure involves a more delicate modification of the equations than in case of the energy source term, because it essentially amounts to replacing \(\rho \) by the finite-difference representation of the Laplacian \((4\pi G)^{-1} \varDelta \varPhi \) in the momentum source term. Unless the solution for the gravitational potential is extremely accurate, large acceleration errors may thus arise. Moreover, this approach does not work for effective relativistic potentials (see Sect. 2.2.2). For these reasons, the conservative form of the gravitational source term has not been used in practice in other codes. Even though the issue of momentum conservation is of relevance in the context of neutron star kicks, conservation errors do not seem to affect supernova simulation results qualitatively in practice, and post-processing techniques can be used to infer neutron star velocities from simulations with good accuracy (Scheck et al. 2006).

Treatment of general relativity

In core-collapse supernova simulations, the relativistic compaction of the proto-neutron star reaches \(GM/Rc^2=0.1\)–0.2 even for a normal PNS mass M and a somewhat extended radius R of the warm PNS. Infall velocities of 0.15–0.3c are encountered. Hence general relativistic (GR) and special relativistic effects are no longer negligible, though the latter is more critical for the treatment of the neutrino transport than for the hydrodynamics. For very massive neutron stars, cases of black hole formation, or jet-driven explosions, relativistic effects can be more pronounced.

A variety of approaches is used in supernova modelling to deal with relativistic effects. Purely Newtonian models have now largely been superseded. Using Newtonian gravity results in unphysically large PNS radii, and, as a consequence, lower neutrino luminosities and mean energies and worse heating heating conditions than in the relativistic case, even though the stalled accretion shock radius is larger than in GR before explosion (Müller et al. 2012b; Kuroda et al. 2012; Lentz et al. 2012; O’Connor and Couch 2018b). As an economical alternative, one can retain the framework of Newtonian hydrodynamics but incoroprate relativistic corrections in the gravitational potential based on the TOV equation (Rampp and Janka 2002). This approach was subsequently refined by Marek et al. (2005), Müller et al. (2008) to account for some inconsistencies between the use of Newtonian hydrodynamics and a potential based on a relativstic stellar structure equation, but full consistency can never be achieved in the pseudo-Newtonian approach. In the multi-D case, the relativistic potential replaces the monopole of the Newtonian potential, while higher multipoles are left unchanged. From a purist point of view, this pseudo-Newtonian approach is delicate because one sacrifices global conservation laws for energy and momentum (which would still hold in a more complicated form in an asymptotically flat space in full GR). In practice, this is less critical; in PNS cooling simulations by Hüdepohl et al. (2010) the total emitted neutrino energy was found to agree with the neutron star binding energy (computed from the correct TOV solution) to within \(1\%\) for the modified TOV potential (Case A) of Marek et al. (2005).

If the framework of Newtonian hydrodyanmics is abandoned, one may still opt for an approximate method to solve for the space-time metric as in the CoCoNuT code (Dimmelmeier et al. 2005; Müller et al. 2010; Müller and Janka 2015). Elliptic formulations such as CFC (conformal flatness conditions, Isenberg 1978) and xCFC (a modification of CFC for improved numerical stability; Cordero-Carrión et al. 2009) can be cheaper and more stable than free-evolution schemes based on the 3 + 1 decomposition (for reviews of these techniques, see Baumgarte and Shapiro 2010; Lehner and Pretorius 2014) and maximally constrained schemes (Bonazzola et al. 2004; Cordero-Carrión et al. 2012). However, full GR supernova simulations without the CFC approximation and with multi-group transport have also become possible recently (Roberts et al. 2016; Ott et al. 2018; Kuroda et al. 2016b, 2018). Although CFC remains an approximation, it is exact in spherical symmetry, and comparisons with free-evolution schemes have shown excellent agreement in the context of rotational collapse have shown excellent agreement even for rapidly spinning progenitors (Ott et al. 2007).

Comparisons of pseudo-Newtonian and GR simulations have demonstrated that using an effective potential is at least sufficient to reproduce the PNS contraction, the shock evolution, and the neutrino emission in GR very well (Liebendörfer et al. 2005; Müller et al. 2010, 2012b). While Müller et al. (2012b) still found better heating conditions in the GR case than with an effective potential in their 2D models, this comparison was not fully controlled in the sense that two different hydro solvers were used, and the effect was related to subtle differences in the PNS convection zone, which may well be related to factors other than the GR treatment (cf. Sect. 4.7). Further code comparisons are desirable to resolve this. The pseudo-Newtonian approach, does, however, systematically distort the eigenfrequencies of neutron star oscillation modes (Müller et al. 2008). In particular, the frequency of the dominant f-/g-mode seen in the gravitational wave spectrum is shifted up by 15–20% compared to the correct relativistic value (Müller et al. 2013).

Poisson solvers

In the Newtonian approximation, the gravitational field is obtained by solving the Poisson equation

$$\begin{aligned} \varDelta \varPhi =4\pi G\rho . \end{aligned}$$
(27)

In constrained formulations of the Einstein equations like (x)CFC, one encounters non-linear Poisson equations.

In simulations of supernovae and the late convective burning stages, the density field usually only deviates modestly from spherical symmetry and is not exceedingly clumpy (except in the case of mixing instabilities in the envelope during the explosion phase when self-gravity is less important to begin with). For this reason, the usual method of choice for solving the Poisson equation (even in Cartesian geometry) is to use the multipole expansion of the Green’s function (Müller and Steinmetz 1995). Typically, no more than 10–20 multipoles are needed for good accuracy, and very often only the monopole component is retained. Other methods have been used occasionally, though, such as pseudospectral methods (Dimmelmeier et al. 2005) and finite-difference solvers (e.g., Burrows et al. 2007b), and the FFT (Hockney 1965; Eastwood and Brownrigg 1979) is a viable option for Cartesian simulations.

Although it yields accurate results at fairly cheap cost, some subtle issues can arise with the multipole expansion. When projecting the source density onto spherical harmonics \(Y_{\ell m}\) to obtain multipole components \(\hat{\rho }_{\ell m}=\int Y_{\ell m}^* \rho \,\mathrm {d}\varOmega \), a naive step-function integration can lead to a self-potential error (Couch et al. 2013) and destroy convergence with increasing mulitpole number \(N_\ell \). This can be avoided either by performing the integrals over spherical harmonics \(Y_{\ell m}^*\) analytically (Müller 1994), or by using a staggered grid for the potential (Couch et al. 2013). The accuracy of the solution can also be degraded if the central mass concentration moves away from the center of the grid, which can be cured by off-centering the multipole expansion (Couch et al. 2013). Problems with off-centred or clumpy mass distributions can be cured completely if an exact solver is used. In Cartesian geometry, this can be accomplished econmically using the FFT, and an exact solver for spherical polar grid using a discrete eigenfunction expansion has recently been developed as well (Müller and Chan 2019). On spherical multi-patch grids, the efficient parallelization and computation of integration weights requires some thought and has been addressed by Almanstötter et al. (2018) and Wongwathanarat (2019).

Reactive flow

Nuclear burning is the principal driver of the flow for core and shell convection in the late, neutrino-cooled evolutionary stages of supernova progenitors. In core-collapse supernovae nuclear dissociation and recombination play a critical role for the dynamics and energetics, and one of the key observables, the mass of \({}^{56}\mathrm {Ni}\), is determined by nuclear burning.

Approaches to nuclear transmutations differ widely between simulation codes, and range from the assumption of nuclear statistical equilbrium (NSE) everywhere in some core-collapse supernova models to rather large reaction networks. Naturally,the appropriate level of sophistication depends on the regime and the observables of interest. The theory of nuclear reaction networks is too vast to cover in detail here, and we can only touch a few salient points related to their integration into hydrodynamics codes. For a more extensive coverage, we refer to textbooks and reviews on the subject (Clayton 1968; Arnett 1996; Müller 1998; Timmes 1999; Hix and Meyer 2006; Iliadis 2007).

Burning regimes As stellar evolution proceeds towards collapse, the ratio of the nuclear time scale to both the sound crossing time scale and convective time scale decreases, and the nuclear reaction flow involves an increasing number of reactions. The burning of C, Ne, and to some extent of O is dominated by an overseeable number of main reaction channels, and the relevant reaction rates are slow compared to the relevant hydrodynamical time scales. During oxygen burning, quasi-equilibrium clusters begin to appear and eventually merge into one or two big clusters during Si burning (Bodansky et al. 1968; Woosley et al. 1973) that are linked by slow “bottleneck” reactions. For sufficiently high temperatures, NSE is established and the composition only depends on density \(\rho \), temperature T, and the electron fraction \(Y_\mathrm {e}\) and is given the Saha equation. At higher densities during core-collapse, the assumption of non-interacting nuclei break down, and a high-density equation of state is required (see Lattimer 2012; Oertel et al. 2017; Fischer et al. 2017, for recent reviews); this regime is not of concern here because the flow can be treated as non-reactive.

Simple approaches In core-collapse supernova simulations, one sometimes simply assumes NSE everywhere, which amounts to an implicit release of energy at the start of a simulation. Although the Si and O shell will still collapse in the wake of the Fe core, this is somewhat problematic, especially for long-time simulations where the effect on the infall is bound to be more pronounced. For mitigating potential artifacts from the inconsistency of the composition and equation of state with the underlying stellar evolution model, it can be useful to initialise supernova simulations using the pressure rather than the temperature of progenitor model.

A considerably better and very cheap approach, known as “flashing”, is to use a few key \(\alpha \)-elements and non-symmetric iron group nuclei in addition to protons, neutrons, and \(\alpha \)-particles and burn them instantly into their reaction products and eventually into NSE upon reaching certain threshold temperatures (Rampp and Janka 2002). Such an approach can capture the energetics of explosive burning in the shock and the freeze-out from NSE in neutrino-driven outflows reasonably well, but only gives indicative results on the composition of the ejected matter. The choice of the proper NSE threshold temperature \(T_\mathrm {NSE}\gtrsim 5\mathrm {GK}\) can be particularly delicate, since this depends on the entropy and expansion time scale of the outflow and can critically affect the degree of recombination in the neutrino-heated ejecta.

For simulations of convective burning, a smooth behavior of the nuclear source terms is required, but for C, Ne, or O burning, one can still resort to simple fit formulae and only track the composition of the main fuel and ash if the goal is merely to understand the dynamics of the flow (e.g., Kuhlen et al. 2003; Jones et al. 2017; Cristini et al. 2016, 2017; Andrassy et al. 2020). Often these source terms (or also rates in calculations with veritable network) are rescaled if low convective Mach numbers make simulations in the physical regime unfeasible. This can be useful for exploring the parameter dependence of flow phenomena, but caution is required because safe extrapolation to the physical regime may also require a rescaling of other terms (e.g., thermal diffusivity, neutrino losses).

Reaction networks In 2D, Bazán and Arnett (1997) already conducted simulations of convective burning with 123 species, but the use of large networks in 3D simulations is still prohibitively expensive. Modern 3D simulations of convective burning with the Prompi (e.g., Meakin and Arnett 2006; Mocák et al. 2018), Prometheus (Müller et al. 2016b; Yadav et al. 2020), Flash (Couch et al. 2015) and 3DnSNe (Yoshida et al. 2019) codes have therefore only use networks of 19–25 species consisting of \(\alpha \)-elements, light particles, and at most a few extra iron-group elements. In multi-D supernova simulations with neutrino transport the use of such networks is feasible (von Groote 2014; Bruenn et al. 2013, 2016; Wongwathanarat et al. 2017), though they have not been used widely yet. It is critical that such reduced reaction networks appropriately account for side chains and the effective reaction flow between light particles (Weaver et al. 1978; Timmes et al. 2000). Their use is problematic for Si burning which requires networks of more than a hundred species to accurately capture the quasi-equilibrium clusters and the effects of deleptonization (Weaver et al. 1978) and for freeze-out from NSE with considerable neutron excess. Larger networks or special methods for quasi-equilibrium (Weaver et al. 1978; Hix et al. 2007; Guidry et al. 2013) will be required for reliable multi-D simulations of convective Si burning.

Coupling to the hydrodynamics Some numerical issues arise when a nuclear network is coupled to a Eulerian hydrodynamics solver, or even if the composition is just tracked as a passive tracer. One such problem concerns the conservation of partial masses, which is guaranteed analytically by a conservation Eq. (4) for each species i,

$$\begin{aligned} \frac{\partial \rho X_i}{\partial t} +\nabla \cdot (\rho X_i \mathbf {v})= \dot{X}_{i,\mathrm {burn}}. \end{aligned}$$
(28)

This equation can be solved using standard, higher-order finite-volume techniques. However, the solution also has to obey the constraint

$$\begin{aligned} \sum X_i=1, \end{aligned}$$
(29)

which is not fulfilled automatically by the numerical solution, unless flat reconstruction for the mass fractions is employed. One could enforce this constraint by rescaling the mass fractions to sum up to unity, but this would violate the conservation of partial masses. Plewa and Müller (1999) developed the Consistent Multi-fluid Advection (CMA) method as the standard treatment to ensure both minimal numerical diffusion of mass fractions and enforce conservation of partial masses. This method involves a rescaling and coupling of the interpolated interface values of the various mass fractions. Plewa and Müller (1999) demonstrated that simple methods for the advection of mass fractions can easily result in wrong yields by a factor of a few for some isotopes in supernova explosions.

Another class of problems is related to advection errors and numerical diffusion, especially at contact discontinuities and shocks, which can lead to artificial detonations or an incorrect propagation of physical detonations (Colella et al. 1986; Fryxell et al. 1989; Müller 1998). To ensure that detonations propagate at the correct physical velocity, nuclear burning should be switched off in shocked zones (Müller 1998).Footnote 5 Due to the extreme temperature dependence of nuclear reaction rates, similar problems can arise away from discontinuities due to advection errors that produce a small level of noise in the temperature. Artificial detonations can easily develop in highly degenerate regions and around sonic points. Eliminating such artifacts may require appropriate switches for pathological zones or very high spatial resolution (e.g., Kitaura et al. 2006).

Late-stage convective burning in supernova progenitors

In the Introduction, we already outlined the motivation for multi-D simulations of supernova progenitors in broad terms. On the most basic level, multi-D models are needed to properly intialize supernova simulations and provide physically correct seed perturbations for the instabilities that develop after collapse and in the explosion phase. This does not, in fact, presuppose that 1D stellar evolution models incorrectly predict the overall spherical structure of pre-supernova progenitors; in principle such an initialization might involve nothing but adding some degrees of freedom to 1D stellar evolution models without any noticeable change of the spherically averaged stratification. Historically, however, simulations of late-stage convection have focused on deviations of the multi-D flow from the predictions of traditional mixing-length theory (MLT; Biermann 1932; Böhm-Vitense 1958; Weiss et al. 2004) and not evolved progenitor models up to core collapse, whereas the initialization problem has only been tackled recently by Couch et al. (2015), Müller et al. (2016b, 2019), Yadav et al. (2020). In this section, we therefore address the interior flow in convective regions and boundary effects first before specifically discussing multi-D pre-supernova models.

Interior flow

Let us first consider the flow within convectively unstable regions. In MLT as implemented in modern stellar evolution codes such as Kepler (Weaver et al. 1978; Heger and Woosley 2010) and Mesa (Paxton et al. 2011), the convective velocity \(v_\mathrm {conv}\) in such regions is tied to the superadiabaticity of the density gradient as encoded by the Brunt–Väisälä \(\omega _\mathrm {BV}\) frequency and the local pressure scale height \(\varLambda \),

$$\begin{aligned} v_\mathrm {conv}=\alpha \sqrt{\varLambda \delta \rho /\rho \, g}= \alpha \varLambda \omega _\mathrm {BV}, \end{aligned}$$
(30)

where \(\alpha \) is a tuneable parameter of order unity, and the MLT density contrast \(\delta \rho \) is obtained from the the difference between the actual and density gradient \(\partial \rho /\partial r\) and the adiabatic density gradient \((\partial \rho /\partial P)_s (\partial P/\partial r)\),

$$\begin{aligned} \delta \rho = \frac{\varLambda \rho \omega _\mathrm {BV}^2}{g} = \varLambda \left[ \frac{\partial \rho }{\partial r}- \left( \frac{\partial \rho }{\partial P}\right) _s \frac{\partial P}{\partial r}\right] = \varLambda \left( \frac{\partial \rho }{\partial r}-\frac{1}{c_\mathrm {s}^2} \frac{\partial P}{\partial r}\right) . \end{aligned}$$
(31)

Note that stellar evolution textbooks usually express the convective velocity and density contrast in terms of the difference between the actual and adiabatic temperature gradient (Clayton 1968; Weiss et al. 2004; Kippenhahn et al. 2012), but Eqs. (30) and (31) are fully equivalent formulations that often prove less cumbersome.

Using Eq. (30) for the convective velocity, Eq. (31) for the MLT density contrast, and the temperature contrast \(\delta T=(\partial T/\partial \ln \rho )_P (\delta \rho /\rho )\), we then obtain the convective energy flux \(F_\mathrm {conv}\) (Kippenhahn et al. 2012; Weiss et al. 2004),

$$\begin{aligned} F_\mathrm {conv}&= \,\alpha _e \rho c_P \, \delta T \, v_\mathrm {conv} \nonumber \\&= -\alpha \alpha _e \rho c_P \left( \frac{\partial T}{ \partial \ln \rho }\right) _P \frac{\varLambda ^2 \omega _\mathrm {BV}^3}{g}, \end{aligned}$$
(32)

where \(c_P\) is the specific heat at constant pressure, and \(\alpha _e\) is another tunable non-dimensional parameter. Similarly, by estimating the composition contrast \(\delta X_i\) using the local gradient as \(\delta X_i=\alpha _X \varLambda \, \partial X_i/\partial r\), we obtain the partial mass flux for species i

$$\begin{aligned} F_{X_i}= \rho v_\mathrm {conv} \delta X_i = \alpha _X \varLambda \rho v_\mathrm {conv} \frac{\partial X_i}{\partial r}, \end{aligned}$$
(33)

where \(\alpha _X\) is again a dimensionless parameter. When comparing 1D stellar evolution models to each other or to multi-D simulations, one must bear in mind that slightly different normalization conventions for Eqs. (32) and (33) are in use. Regardless of these ambiguities, these coefficients are of order unity, for example the Kepler code uses \(\alpha \alpha _e = 1/2\) and \(\alpha \alpha _X = 1/6\) (Weaver et al. 1978), which can be conveniently interpreted as \(\alpha = 1\), \(\alpha _e=1/2\) and \(\alpha _X = 1/6\) (Müller et al. 2016b),

In order to connect more easily to multi-D simulations, it is useful restate the assumptions and consequences of MLT (without radiative diffusion) in a slightly different language. Equation (30) can also be written as

$$\begin{aligned} \frac{v_\mathrm {conv}^3}{\varLambda }= \alpha ^2 \, \delta \rho /\rho \,g v_\mathrm {conv}, \end{aligned}$$
(34)

which we can interpret as a balance between the rate of buoyant energy generation (\(\alpha ^2\, \delta \rho /\rho \, v_\mathrm {conv}\)) and turbulent dissipation (\(\epsilon \sim v_\mathrm {conv}^3/\varLambda \)). Furthermore the work done by bouyancy must ultimately be supplied by nuclear burning. Using thermodynamic relations, we find that the potential energy \(\varLambda \, \delta \rho /\rho \, g\) liberated by bubbles rising or sinking by one mixing length is of the order of the enthalpy contrast \(\delta h\) of the bubbles, which roughly equals the integral of the nuclear energy generation rate \(\dot{q}_\mathrm {nuc}\) over one turnover time \(\tau =\varLambda /v_\mathrm {conv}\),

$$\begin{aligned} \varLambda \, \delta \rho /\rho \, g \sim \delta h \sim \dot{q}_\mathrm {nuc} \varLambda /v_\mathrm {conv}. \end{aligned}$$
(35)

Together, Eqs. (34) and (35) lead to a scaling law \(v_\mathrm {conv} \sim (\dot{q}_\mathrm {nuc} \varLambda )^{1/3}\) for the typical value of \(v_\mathrm {conv}\) in a convective shell.

Fig. 6
figure6

Energy generation rate \(\dot{q}_\mathrm {nuc}\) (black), neutrino cooling rate \(|\dot{q}_\nu |\) (red), and convective velocity \(v_\mathrm {conv}\) (blue) in an \(18.88\,M_\odot \) progenitor (1D model, discussed in Yadav et al. (2020) in the innermost shells about \(1\, \mathrm {hr}\) before collapse. At this stage, balanced power still obtains. Nuclear energy generation dominates at the bottom of the shells, while neutrino cooling dominates in the outer layers. The integrated energy generation and cooling rate for the entire shell, which are given by the areas under the black and red curve, nearly balance each other

In nature, balance between nuclear energy generation, buoyant energy generation, and turbulent dissipation is usually established over a few turnover times. On longer time scales, active burning shells also adjust by expansion or contraction until the total nuclear energy generation rate and neutrino cooling rate balance each other (Woosley et al. 1972), with the nuclear burning dominating in the inner region and neutrino cooling dominating in the outer region of the shell (Fig. 6).

Because of the extremely strong temperature sensitivity of the burning rates, this state of balanced power is difficult to maintain when setting up multi-D simulations and will only be reestablished over a long, thermal time scale.Footnote 6 In fact, the problem of thermal adjustment has not yet been rigorously analyzed for any multi-D model yet, and insufficient simulation time for thermal adjustment is a concern that needs to be addressed in future. However, the problem of thermal adjustment is mitigated during the latest phases of shell convection prior to collapse: As the core and the surrounding shells contract, the nuclear burning rates accelerate to a point where neutrino cooling and shell expansion by \(P\, \mathrm {d}V\) work can no longer re-establish thermal balance on the contraction time scale of the core, and the state of balanced power is physically broken.

Two-dimensional simulations of convective burning The first attempts to simulate late-stage convection in massive stars by Arnett (1994), Bazan and Arnett (1994, 1998) targeted oxygen burning in a \(20\,M_\odot \) star in a 2D shellular domain with the Prometheus code using a small, 12-species reaction network and neutrino cooling by the pair process. Starting from a simulation on a small wedge of \(18^\circ \) in Arnett (1994), Bazan and Arnett (1994, 1998) subsequently considered broader wedges in cylindrical symmetry of up to \(135^\circ \) with a resolution of up to \(460 \times 128\) grid cells, as well as cases with meridional symmetry on a 2D grid \((r,\varphi )\) in radius and longitude. The simulations were invariably limited to short periods (up to \(400 \, \mathrm {s}\) in Bazan and Arnett 1998) and only a few convective turnover times. One simulation (Bazán and Arnett 1997) also tackled Si burning in 2D with a large network of 123 nuclei. These first-generation 2D models invariably found violent convective motions with Mach numbers of 0.1–0.2 and velocities about an order of magnitude above the MLT predictions, which cannot be accounted for by the aforementioned ambiguities in the definition of the dimensionless coefficients. The convective structures invariably tended to grow to the largest angular scale allowed by the chosen wedge geometry, and large density perturbations were found at the convective boundaries. Bazan and Arnett (1998) also stressed the high temporal variability of the convective flow, going so far as to question whether a steady state is ever established before collapse. Longer simulations of the same \(20\,M_\odot \) model over \(1200 \, \mathrm {ms}\) by the same group using the Vulcan code of (Livne 1993) showed the emergence of a steady state, albeit quite different from the 1D stellar evolution model due to convective boundary mixing (Asida and Arnett 2000).

To a large extent, the pronounced differences between these first-generation simulations and MLT predictions stem from the assumption of 2D flow. In 2D turbulence, the energy cascade is artificially inverted and goes from small to large scales (Kraichnan 1967). As a result, the flow tends to organise itself into large vortices, and dissipation occurs primarily in boundary layers (Falkovich et al. 2017; Clercx and van Heijst 2017).

Three-dimensional simulations of convective burning Consequently, 3D simulations of convective burning obtained considerably smaller convective velocities. The first 3D, full \(4\pi \) solid angle models of O shell convection (along with models of core hydrogen burning) were presented by Kuhlen et al. (2003) for a \(25\,M_\odot \) star with and without rotation. Their simulations used the anelastic pseudospectral code of Glatzmaier (1984) to follow convection for about 90 turnovers in the non-rotating case, and approximated the burning and neutrino cooling rates by power-law fits. Different from the earlier 2D models, they found convective velocities in good agreement with the 1D MLT prediction in the underlying stellar evolution model, but the still observed the emergence of large-scale flow patterns.

The use of simplified burning and neutrino loss rates, the anelastic approximation, and an explicit turbulent diffusivity in Kuhlen et al. (2003) still posed a concern, which was subsequently addressed by a series of 2D and 3D simulations of O and C burning (Meakin and Arnett 2006, 2007a, b; Arnett et al. 2009) in wedge-shaped domains using the compressible Prompi code and a larger reaction network (25 species) than in the first generation of 2D models. These simulations confirmed the significantly less violent nature of 3D convection compared to 2D (Meakin and Arnett 2006, 2007b), and established good agreement between elastic and anelastic simulations on the convective velocities and fluctuations of thermodynamic quantities in the interior of convective zones, though anelastic codes cannot model fluctuations at convective boundaries very well (Meakin and Arnett 2007a). They also found balance between buoyant driving and turbulent dissipation (which is essentially a restatement of the basic assumption of MLT) and observed rough equipartiton between the radial and non-radial contributions to the turbulent kinetic energy (Arnett et al. 2009). Their models still revealed differences from MLT in detail, such as different correlation lengths for velocity and temperature and a non-vanishing kinetic energy flux (Meakin and Arnett 2007b). Moreover, Meakin and Arnett (2010) suggested that the implicit identification of the pressure scale height with the dissipation length in MLT might lead to an underestimation of the convective velocities. More recent work by the same group has stressed the time variability of the convective flow (Arnett and Meakin 2011a, b) and criticized the MLT assumption of quasi-stationary convective velocities. Specifically Arnett and Meakin (2011b) pointed to strong fluctuations in the turbulent kinetic energy in the 3D oxygen shell burning simulation in a \(23\,M_\odot \) star by Meakin and Arnett (2007b), which they attempted to motivate by recourse to the Lorenz model for convection in the Boussinesq approximation. The connection between the simulations of convective burning and the Lorenz model for a viscous-conductive convection problem remains rather opaque, however.

More recent work on 3D convection by other groups has vindicated rather than undermined MLT as an approximation for the interior of convective zones. Müller et al. (2016b) conducted a \(4\pi \)-simulation of O burning in an \(18\,M_\odot \) star up to the point of collapse and found that convection reaches a quasi-stationary state after a few turnovers with only small fluctuations in the turbulent kinetic energy. In line with MLT and as in Arnett et al. (2009), the average convective velocity is well described by a balance of turbulent dissipation and buoyant driving in their model, and is in turn related to the average nuclear energy generation rate \(\dot{q}_\mathrm {nuc}\) as

$$\begin{aligned} \frac{v_\mathrm {conv}^3}{\varLambda } \approx 0.7 \dot{q}_\mathrm {nuc}, \end{aligned}$$
(36)

and even the profiles of the radial component of the turbulent velocity perturbation are in good agreement with the corresponding 1D stellar evolution model. A similar scaling was reported by Jones et al. (2017) based on idealized high-resolution simulations of O burning with a simple EoS and parameterized nuclear source terms and by Cristini et al. (2017) based on simulations of C burning in planar 3D geometry, also with a parameterized (and artificially boosted) nuclear source term. Jones et al. (2017) verified this scaling over a wider range of convective luminosities and Mach numbers by applying different boost factors to the nuclear generation rate.

Regarding the dominant scales of the convective flow, the recent global 3D shell burning simulations (Chatzopoulos et al. 2014; Couch et al. 2015; Müller et al. 2016b; Jones et al. 2016; Yadav et al. 2020) confirm the emergence of large eddies with low angular wavenumber \(\ell \) that stretch across the entire convective zone. Müller et al. (2016b) verified quantitatively that the peak of the turbulent energy spectrum in \(\ell \) agrees well with the wavenumber of the first unstable convective mode at the critical Rayleigh number (Chandrasekhar 1961; Foglizzo et al. 2006),

$$\begin{aligned} \ell = \frac{\pi (R_+ + R_-)}{2 (R_+ - R_-)}. \end{aligned}$$
(37)

Further simulations that also explored thinner shells (Müller et al. 2019) show a shift towards higher \(\ell \) and corroborate this scaling as illustrated in Fig. 7. Beyond this dominant wavenumber, the turbulence exhibits a Kolmogorov spectrum (Chatzopoulos et al. 2014; Müller et al. 2016b).

Fig. 7
figure7

Dependence of the dominant eddy scale on the shell geometry illustrated by slices through 3D supernova progenitor models with convective burning and their turbulent energy spectra. The 2D slices show the radial velocity at the onset of collapse in progenitors of \(12\,M_\odot \) with metallicity \(Z=0\) (top right), \(14.07\,M_\odot \) with \(Z=Z_\odot \) (top left) , and \(12.5\,M_\odot \) with \(Z=Z_\odot \) (bottom left) with active convective O shells. The bottom right panel shows turbulent energy spectra \(E(\ell )\) computed from the radial velocity around the center of the convective zone. The dominant wavenumber expected from Eq. (37) is indicated at the top; note that there is an uncertainty because the outer boundaries of the convective zones are fuzzy. The dotted lines show the slope of a Kolmogorov spectrum. The plots for the \(12\,M_\odot \) and \(12.5M_\odot \) models have been adapted from Müller et al. (2019). Image reproduced with permission, copyright by the authors

Naturally, the modern 3D models still exhibit differences to MLT in detail even within convective zones. For example, \(\omega _\mathrm {BV}^2\) often changes sign in the outer parts of a convective layer in 3D, indicating that the spherically-averaged stratification is nominally stable (Mocák et al. 2009; Müller et al. 2016b). Müller et al. (2016b) also remark that the spherically-averaged mass fraction profiles tend to be flatter in 3D than in 1D, due to the usual asymmetric choice \(\alpha _X = \alpha _e/3\) for the MLT parameters for material diffusion and energy transport, which probably ought to be replaced by \(\alpha _X = \alpha _e\). A rigorous approach to quantify the structure of the convective flow and the differences between 3D and 1D models is available in the form of spherical Reynolds decomposition, which has been pursued systematically by Viallet et al. (2013), Mocák et al. (2014) and Arnett et al. (2015). The mere form of the Reynolds-averaged equations for bulk (i.e., spherically-averaged) and fluctuating quantities dictates that such an analysis invariably finds dozens of terms that are implicitly set to zero in MLT.

Assessment How are we to evaluate these commonalities and differences between 3D simulations and 1D stellar evolution flow? For most purposes, the question is not whether effects are missing in MLT-based 1D models (since the very purpose of an approximation like MLT is to retain only the leading effects), but whether those missing effects matter over secular time scales or have an impact during the supernova explosion. As we shall discuss in detail in Sect. 4.5, the presence of asymmetries in convective shells indeed matters during the supernova, but the fact is also that MLT and linear perturbation theory appear to predict the relevant parameters—the velocities and dominant scales of convective eddies—quite well. As far as the secular evolution of convective burning shells is concerned, there is little evidence that MLT does not adequately describe the flow within convective shells. There is typically good agreement in critical parameters for the shell evolution like the total nuclear burning rate. Many effects that MLT captures inaccurately and matter critically in models of convective envelopes and stellar atmospheres—such as the precise deviation of the stratification from superadiabticity—are of minor importance for the bulk evolution of massive stars during the late burning stages. For more tangible consequence of multi-D effects on secular time scales, we need to consider convective boundaries in Sect. 3.3.

Supernova progenitor models

Simulations to the presupernova stage Only a few models of convective burning have yet been carried up to the point of core collapse (Couch et al. 2015; Müller 2016; Müller et al. 2016b, 2019; Yadav et al. 2020; Yoshida et al. 2019) because of several obstacles. In order to accurately follow the composition changes and the deleptonization in the Fe core and Si shell (i.e., in the NSE and QSE regime) that drive the evolution towards collapse, reaction networks with well over a hundred nuclei are required (Weaver et al. 1978). This is feasible in principle, but yet impractical for well-resolved 3D simulations up to collapse. Furthermore, the initial transient phase and imperfect hydrostatic equilibrium after the mapping from 1D to multi-D may artificially delay the collapse.

Two different strategies have been employed to circumvent these problems. In their simulation of Si burning in a \(15\,M_\odot \) star for \(160 \, \mathrm {s}\), Couch et al. (2015) used an extended 21-species \(\alpha \)-network with some iron group nuclei added to model core deleptonization (Paxton et al. 2011). In order to force the core to collapse, they increased the electron capture rate on \({}^{56}\mathrm {Fe}\) by a factor of 50, and their 3D model in fact reaches collapse more than six times faster than the corresponding 1D stellar evolution model. This approach is problematic because any modification of the contraction time scale of the core also affects the burning in the outer shells (Müller et al. 2016b). Using the same 21-species network, Yoshida et al. (2019) managed to evolve a 3D simulation of a \(25\,M_\odot \) star and several 2D simulations of different progenitors for the last \(\mathord {\sim } 100 \, \mathrm {s}\) without modifying the deleptonization rate. This suggests that multi-D models can be evolved somewhat self-consistently to collapse even though the short simulation time is a concern in this particular case, since it remains unclear to what extent the results are affected by the initial transient.

The 3D studies of O shell burning in various progenitors by Müller (2016), Müller et al. (2016b), Müller et al. (2019) and Yadav et al. (2020) have followed a different approach and circumvented the problems of QSE and deleptonization by excising the major part of the Fe core and Si shell and replacing them with an inner boundary condition that is contracted according to a mass shell trajectory from the corresponding 1D stellar evolution model. This approach can be justified for many progenitors, which have no active convective Si shell, or only weak convection in the Si shell.

Evolution towards collapse The convective flow in the contracting burning shells shortly before collapse exhibits few noteworthy differences to the burning in quasi-hydrostatic shells described in Sect. 3.1. The 3D simulations of the different groups (Couch et al. 2015; Müller et al. 2016b; Yoshida et al. 2019) all show the emergence of modes with a dominant wavelength of the order of the shell width according to Eq. (37), and as far as comparisons have been performed, the convective velocities remain in good agreement with MLT until shortly before collapse. It is noteworthy, however, that the convective velocities and Mach numbers tend to increase significantly during the last minutes before collapse because the temperature at the base of the inner shells, and hence the burning rate, increase as they contract in the wake of the core. The convective velocities then freeze out shortly before collapse once the burning rate changes on a time scale shorter than the turnover time scale. This freeze-out seems to be captured adequately by time-dependent MLT so that 1D stellar evolution models provide good estimates for the convective velocities at the onset of collapse. Bigger differences between 1D and 3D progenitor models can occur in case of small buoyancy barriers between the O, Ne, and C shell, in which case 3D models are more likely to undergo a shell merger (Yadav et al. 2020).

The evolution of the convective shells during collapse will be discussed in Sect. 4.5.

Fig. 8
figure8

Convective Mach number (left) and dominant angular wave number (right) in the Si shell (black) and O shell (red) predicted from 1D single-star evolution models from the study of Collins et al. (2018). Image reproduced with permission, copyright by the authors

Progenitor dependence Since 3D simulations indicate that convective velocities and eddy scales can be estimated fairly well from 1D stellar evolution models, one can already roughly outline the progenitor dependence of convective shell properties as shown by Collins et al. (2018). Considering the active Si and O shell burning shells at the onset of collapse in over 2000 progenitor models, they find a number of systematic trends (Fig. 8): the O shell typically has a higher convective Mach number (0.1–0.3) than the Si shell, where usually \(\mathrm {Ma}<0.1\), but there are islands around \(16\,M_\odot \) and \(19\,M_\odot \) in ZAMS mass where the convective Mach number in the Si shell reaches about 0.15 and is higher than in the O shell. The highest convective Mach numbers of up to 0.3 are reached in the O shell of low-mass progenitors with small cores as O burns deeper in the gravitational potential at higher temperatures. The general trend towards lower convective velocities in the O shell with higher progenitor and core mass is modified by variations in shell entropy and the residual O mass fraction at the onset of collapse. Deviations from this general trend also come about because the various C, Ne, O, and Si shell burning episodes do not always occur in the same order, and because of shell mergers.

The O shell is usually thicker and therefore allows large-scale modes with wave numbers \(\ell <10\) to dominate. Large-scale modes are more prevalent in progenitors above \(16\,M_\odot \) with their more massive O shells. The first, thick Si shell is no longer active at collapse in most cases, and there is typically only a thin convective Si shell (if any) between the Fe core and O shell at collapse, which will dominated by small-scale motions.

Collins et al. (2018) also find a high prevalence of late shell O–Ne shell mergers among high-mass progenitors. In about 40% of their models between \(16\,M_\odot \) and \(26\,M_\odot \) such a merger was initiated within the last minutes of collapse.

Although some of these trends follow from robust structural features and trends in the progenitor evolution, these findings will need to be examined with different stellar evolution codes and may be modified in detail, especially when better prescriptions for convective boundary mixing on secular time scales become available.

Convective boundaries

Mixing by entrainment As one of the most conspicuous features in their first 2D models of O shell burning, Bazan and Arnett (1994) and Bazán and Arnett (1997) noted the mixing of considerable amounts of C from the overlying layer into the active burning region. Although mixing across convective boundaries (sometimes indistinctly called “overshooting”) had already been a long-standing topic in stellar evolution by then, these results were noteworthy because Bazan and Arnett (1994); Bazán and Arnett (1997) found much stronger convective boundary mixing (CBM) than compatible with overshoot prescriptions in 1D stellar evolution models of massive stars. Second, they observed that the mixed material can burn vigorously and thereby in turn dramatically affect the convective flow, i.e., there is the possibility of a feedback mechanism that cannot occur in the case of envelope convection or surface convection. Meakin and Arnett (2006) investigated this problem further in a situation with active and interacting O and C shells and observed strong excitation of p- and g-modes at convective-radiative boundaries, which, as they suggested, might also contribute to compositional mixing.

Critical steps beyond a mere descriptive analysis of CBM during the late burning stages were finally taken by Meakin and Arnett (2007a), who established i) the presence of CBM also in 3D (albeit weaker than in 2D), ii) identified the dominant process as entrainment driven by shear (Kelvin–Helmholtz and Holmböe) instabilities at the convective boundary, and iii) verified that the mass entrainment \(\dot{M}\) rate obeys a power law that can be motivated theoretically and has been verified in laboratory experiments of shear-driven entrainment (Fernando 1991; Strang and Fernando 2001):

$$\begin{aligned} \dot{M}=4\pi A \rho r^2 v_\mathrm {conv} \mathrm {Ri}_\mathrm {b}^{-n}, \end{aligned}$$
(38)

Here, A and n are dimensionless constants and \(\mathrm {Ri}_\mathrm {b}\) is the bulk Richardson number, which can be expressed in terms of the integral scale L of the turbulent flow and the buoyancy jump \(\varDelta b\) across the boundary,

$$\begin{aligned} \mathrm {Ri}_\mathrm {B}= \frac{\varDelta b\, L}{v_\mathrm {conv}^2}. \end{aligned}$$
(39)

The buoyancy jump can be obtained by integrating the square of the Brunt–Väisala over the extent of the boundary layer from \(r_1\) to \(r_2\),

$$\begin{aligned} \varDelta b = \int \limits _{r_1}^{r_2} -\omega _\mathrm {BV}^2 \,\mathrm {d}r. \end{aligned}$$
(40)

In the case of a thin boundary layer, this reduces to \(\varDelta b = g \, \delta \rho /\rho \), where \(\delta \rho /\rho \) is the density contrast across the convective interface. From their simulations, Meakin and Arnett (2007b) determined values of \(A=1.06\) and \(n = 1.05\) for the power-law coefficients. Since the work expended to entrain material against buoyancy the force of buoyancy must be supplied by a fraction of the convective energy flux (an argument which was independently redeveloped by Spruit 2015), one expects a value of \(n=1\) for sufficiently high \(\mathrm {Ri}_\mathrm {B}\).

Several subsequent 3D simulations (Müller et al. 2016b; Jones et al. 2017) have confirmed a value of \(n\approx 1\) for the scaling law (38). Müller et al. (2016b) found a significantly smaller value of \(A\approx 0.1\), however, but this may simply be due to ambiguities in the definition and measurement of the integral length scale L, which Müller et al. (2016b) identify with the pressure scale height \(\varLambda \), and of the convective velocity \(v_\mathrm {conv}\) that enters Eq. (39) for the bulk Richardson number. Jones et al. (2017) expressed the entrainment law slightly differently by a proportionality \(\dot{M}\propto \dot{Q}_\mathrm {nuc}\) to the total nuclear energy generation rate \(\dot{Q}_\mathrm {nuc}\), which is equivalent to Eq. (38) with \(n=1\). Their simulations are particularly noteworthy because they employed sufficiently high resolution to establish the entrainment law up to very high \(\mathrm {Ri}_\mathrm {b}\). Although they do not explicitly state values of \(\mathrm {Ri}_\mathrm {b}\), one can estimate that their models reach up to \(\mathrm {Ri}_\mathrm {b}=700\)–1000.

The simulations of Cristini et al. (2017, 2019) are a notable exception as they find a significantly shallower power law with \(n=0.74\). This different power-law slope has yet to be accounted for, but it is important to note that despite the shallower power law, Cristini et al. (2017, 2019) generally find lower entrainment rates than Meakin and Arnett (2007b) for the same value of \(\mathrm {Ri}_\mathrm {b}\) with a much smaller value of \(A=0.05\). At \(\mathrm {Ri}_\mathrm {b}\approx 20\), their entrainment rate is actually in very good agreement with Müller et al. (2016b), and in the region of \(\mathrm {Ri}_\mathrm {b}=40\)–300 their data are consistent with a steeper power law of \(n\approx 1\). Since Cristini et al. (2017, 2019) also explore a much broader range in bulk Richardson number than the aforementioned studies, one possible interpretation could be that i) the value of A was overestimated in Meakin and Arnett (2007b), and that ii) the low value of \(n=0.74\) may be due to a flattening of the entrainment law below \(\mathrm {Ri}_\mathrm {b}=20\)–30 for some physical reasons, and perhaps a slight flattening at \(\mathrm {Ri}_\mathrm {b}>200\) because of numerical resolution effects.

Shell mergers Convective boundary mixing can take on a dramatic form when the buoyancy jump between two shells is sufficiently small for the neighboring shells to merge entirely within a few convective turnover times. Such shell mergers have long been known to occur in 1D stellar evolution models, in particular between O, Ne, and C shells (e.g., Sukhbold and Woosley 2014; Collins et al. 2018). This is because balanced power leads to very similar entropies in the O, Ne, and C shells, and hence small buoyancy jumps between the shells. When nuclear energy generation and neutrino cooling finally fall out of balance due to shell contraction, the entropy of the inner (O or Ne) shell frequently increases and overtakes the outer shell(s), so that such mergers are particularly prevalent shortly before collapse as pointed out by Collins et al. (2018), who estimated that 40% of stars between \(16\,M_\odot \) and \(26\,M_\odot \) collapse during an ongoing shell merger. Although such mergers occur in 1D models, they may occur more readily in 3D, and 3D simulations are also necessary to capture the composition inhomogeneities and nucleosynthesis during the dynamical merger phase.

Shell mergers have indeed been seen in several recent 3D simulations. Müller (2016) pointed out the breakout of a thin O shell through an inert, non-convective O layer into the active Ne burning zone in a \(12.5\,M_\odot \) star in the last minute before collapse, which, however, did not lead to a complete shell merger. Mocák et al. (2018) found a merger between the O and Ne shell in a \(23\,M_\odot \) model, and noted that the runaway entrainment leads to a peculiar quasi-steady with two distinct burning zones for O (at the base) and Ne (further out) within the same convective shell. However, their simulation only covered five turnover times and showed the merger occurring during the initial transient phase. Yadav et al. (2020) simulated an O–Ne shell merger in an \(18.88\,M_\odot \) over 15 turnover times, and were able to follow the evolution from the pre-merger phase with a soft, but clearly defined shell boundary and slow steady-state entrainment through the dynamical merger phase to a partially mixed post-merger state at the onset of collapse.. They stressed the emergence of large-scale asymmetries in the velocity field (with extreme velocities of up to \(1700\, \mathrm {km}\, \mathrm {s}^{-1}\)) and the composition during the merger, although the compositional asymmetries are already washed out somewhat at the point of collapse.

Impact on nucleosynthesis With multi-D simulations of the late-burning stages firmly established, it is critical to identify observable fingerprints of additional convective boundary mixing. The nucleosynthesis yields may provide one such fingerprint, which has already been discussed by several studies, even though one can only draw conclusions based on qualitative arguments and on 1D models with artificially enhanced mixing so far.

Davis et al. (2019) pointed out that the assumptions for convective boundary mixing can significantly affect the yields of various \(\alpha \)-elements (C, O, Ne, Mg, Si), simply as a consequence of the change in shell structure. However, entrainment and shell mergers may leave more specific abundance patterns. In their investigation of O–C shell mergers in 1D Ritter et al. (2018) found significant overproduction of P, Cl, K, Sc, and possibly p-process isotopes, and argue that the occurrence of shell mergers may have important consequences for galactic chemical evolution (GCE). More recently, Côté et al. (2020) considered a Si–C shell merger, for which they find significant overproduction of \({}^{51}\mathrm {V}\) and \({}^{52}\mathrm {Cr}\), which allows them to strongly constrain the rate of such events based on observed Galactic stellar abundances. This is related to the long-standing realization that the ashes of hydrostatic silicon burning under neutron-rich conditions cannot be ejected in large quantities because of GCE constraints (Woosley et al. 1973; Arnett 1996).

Supernova spectroscopy may also help constrain additional convective boundary mixing and shell mergers may via their nucleosynthetic fingerprints. For example, the ejection of neutron-rich material from the silicon shell that is mixed out by entrainment before the explosion would lead to a supersolar Ni/Fe ratio as observed in some supernovae (Jerkstrand et al. 2015). Mixing of minimal amounts of Ca into O-rich zones can also have significant repercussions since only a mass fraction of a few \(10^{-3}\) in Ca is required for Ca to be the domnant coolant during the nebular phase and quench O line mission in a shell (Fransson and Chevalier 1989; Kozma and Fransson 1998). This diagnostic potential of supernova spectroscopy for convective boundary mixing needs to be explored further in the future, but further (macroscopic) mixing during the explosion presents a major complication as it is not straightforward to disentangle the effect of mixing processes prior and during the explosion.

Secular effect on stellar evolution Evaluating the observational consequences of the convective boundary mixing seen in 3D models is also difficult because there is still no rigorous method for treating these processes in 1D stellar evolution codes. A crude estimate for the shell growth by entrainment can be formulated by noting that the work required to entrain material with density contrast \(\delta \rho /\rho \) against buoyancy must be no larger than a fraction of the time-integrated convective energy flux (Spruit 2015; Müller 2016). During the late burning stage, the convective energy flux is set by the nuclear energy generation rate, and hence one can estimate that the entrained mass \(\varDelta M_\mathrm {entr}\) over the lifetime of a shell with mass \(M_\mathrm {shell}\) and radius r is roughly

$$\begin{aligned} \frac{G M}{r}\frac{\delta \rho }{\rho } \varDelta M_\mathrm {entr} \sim A M_\mathrm {shell} \varDelta Q, \end{aligned}$$
(41)

where A is the dimensionless coefficient in Eq. (38) and \(\varDelta Q\) is the average Q-value for a given burning stage (Müller 2016). Based on Eq. (41), Müller (2016) estimates that O shells could grow by tens of percent in mass by entrainment; for Si shells one expects a smaller effect, for C shells, the effect may be bigger.

How one can go beyond such simple estimates by using improved recipes for convective boundary mixing in stellar evolution codes is still an unresolved question. A common approach, based on the simulations of surface convection by Freytag et al. (1996), models entrainment as diffusive overshooting with an exponential decay of the MLT diffusion coefficient outside the convective boundary. The length scale \(\lambda _\mathrm {OV}\) for the exponential decay can be calibrated against 3D simulations. This approach has been followed by Côté et al. (2020), Davis et al. (2019) and by many works on additional convective boundary mixing in low-mass mass stars, but has several issues. Entrainment is a very different process than diffusive overshoot that operates in a distinct physical regime (high Péclet number), and hence one should not expect that it can be described by the same formalism (Viallet et al. 2015). It is also unclear why diffusive overshoot should be applied only for compositional mixing in the entrainment regime. The common approach of expressing \(\lambda _\mathrm {OV}\) by the pressure scale height is also open to criticism because the relevant length scale should be set by the convective velocities and the buoyancy jump, so that one would rather expect \(\lambda _\mathrm {OV}\propto v_\mathrm {conv}^2/\varDelta b\).

Staritsin (2013) proposed an alternative approach of extending convectively mixed regions with time following the entrainment law (38), which better reflects the physics of the entrainment process. However, this approach has not been applied to the late neutrino-cooled burning stages yet (i.e., precisely where entrainment should operate). It also has some conceptual issues, for example, the entrainment law (38) obviously breaks down if there is convection on both sides of a shell interface. Yet another approach for extra mixing in 1D models was followed by Young et al. (2005), who handle mixing based on the local gradient Richardson number \(\mathrm {Ri}\) for shear flows, which is estimated using an elliptic equation for the amplitudes of waves excited by convective motions (Young and Arnett 2005). This approach is physically motivated, but is still awaiting (and worthy of) a more quantitative comparison with 3D simulations beyond the qualitative discussion in Young and Arnett (2005).

Flame propagation in low-mass progenitors Around the minimum progenitor mass for supernova explosions, multi-D effects can have a more profound effect than merely changing shell masses on a modest scale, and may decide about the final fate of the star. This regime is best exemplified by the electron-capture supernova channel of super-asymptotic giant branch (SAGB) stars (see Jones et al. 2013; Doherty et al. 2017; Nomoto and Leung 2017; Leung and Nomoto 2019 for a broader overview and a discussion of uncertainties). In this evolutionary channel, the star builds up a degenerate core composed primarily of O and Ne. If this core grows to \(1.38\,M_\odot \), electron captures on Ne and Mg trigger an O deflagration. Depending on the interplay of deleptonization (which decreases the degeneracy pressure) and the nuclear energy release, the core either contracts, collapses to a neutron stars, and explodes as an electron capture supernova, or the core does not collapse and explodes as a weak thermonuclear supernova (Jones et al. 2016). Since the flame is turbulent, its propagation needs to be modelled in multi-D, similar to deflagrations in Type Ia supernovae. Simulations of this problem have been conducted by Jones et al. (2016), Kirsebom et al. (2019) in 3D and (Leung and Nomoto 2019; Leung et al. 2020) in 2D. Efforts to improve the nuclear input physics and explore the sensitivity to the ignition geometry, general relativity, and flame physics are ongoing (e.g., Kirsebom et al. 2019; Leung et al. 2020).

For slightly more massive stars, one encounters similar situations with convectively-bounded flames after off-center ignition of O or Si (Woosley and Heger 2015). Again, multi-D effects may significantly affect the final evolutionary phase before collapse in this regime, but multi-D simulations of such supernova progenitors are yet to be carried out (but see Lecoanet et al. 2016 for idealized 3D simulations relevant to this regime).

Current and future issues

Significant progress notwithstanding, multi-D simulations of the late stages of convective burning still face further challenges. For the evolution towards collapse, models will eventually need to include a more sophisticated treatment of burning and deleptonization in the QSE and NSE regime and forgo the current approach of either using small networks or excising the Fe/Si core. Perhaps an even greater concern about the conclusions of current multi-D simulations lies with the time-scale problem, however. No 3D simulations have yet been evolved sufficiently long to reach the state of balanced power (or to reveal why it would not be reached). This may have repercussions for turbulent entrainment, which ultimately taps the energy in turbulent motions and hence cannot be completely disconnected from the energy budget within a shell.

Moreover, although a few attempts have been made to simulate convection in rotating shells in 2D (Arnett and Meakin 2010; Chatzopoulos et al. 2016) and 3D (Kuhlen et al. 2003), multi-D simulations have yet to investigate the angular momentum distribution and angular momentum transport during the late pre-supernova stages in a satisfactory manner. Three-dimensional simulations are even more critical for this purpose than in the non-rotating case since many relevant phenomena such as (Rossby waves, Taylor-Proudman columns) cannot be modeled adequately in 2D. Simulations also need to explore a larger parameter space since the convective dynamics will depend on the Rossby number \(\mathrm {Ro}\sim v_\mathrm {conv}/(\varOmega R)\) (where \(\varOmega \) is the rotational velocity). Furthermore, the time scales become a bigger challenge because models need to be run for several rotational periods \(T=2\pi \varOmega ^{-1}\) and several convective turnover times \(\tau _\mathrm {conv}\) (whichever is longer), which is problematic since rotation in pre-supernova models is likely slow (e.g., Heger et al. 2005) so that \(\mathrm {Ro}\gg 1\) and \(T\gg \tau _\mathrm {conv}\). On the bright side, multi-D simulations may reveal much more interesting differences to 1D stellar evolution models: Both Kuhlen et al. (2003) and Arnett and Meakin (2010) found pronounced differential rotation developing from a rigidly rotating initial state, and Arnett and Meakin (2010) suggest that convective shells might adjust to a stratification with constant angular momentum as a function of radius rather than to uniform rotation as assumed in stellar evolution models. However, more simulations and more rigorous analysis is required to investigate these claims.

The problem of rotation can obviously not be solved without including magnetic fields in the long run. It is well known (e.g., Shu 1992) that the criterion for the instability of rotating flow becomes less restrictive in the MHD case, and effects such as the magnetorotational instability (MRI, Balbus and Hawley 1991) or a small-scale dynamo may enforce a more uniform rotation profile than hydrodynamic convection alone. But the importance of magnetic fields is not confined to the case rotating progenitors. Prompted by helioseismic measurements that indicate smaller convective velocities in the deeper layer of the solar convection zone (Gizon and Birch 2012; Hanasoge et al. 2012), some simulations of magnetoconvection in the Sun found a suppression of convective velocities by up to 50% compared to hydrodynamic simulations due to strong magnetic fields from a small-scale dynamo that reach equipartition with the turbulent kinetic energy (Hotta et al. 2015). Magnetic fields can also inhibit or enhance mixing in shear layers (Brüggen and Hillebrandt 2001) and may hence affect convective boundary mixing. Thus, there is still plenty of ground left to explore for simulations of the late burning stages.

Core collapse and shock revival

In the Introduction, we already outlined a variety of multi-D effects than can play a role in reviving the stalled supernova shock as a subsidiary agent to neutrino heating (i.e., neutrino-driven convection in the gain region, the SASI, and progenitor asphericities), or also as the main drivers of the explosion (rotation and magnetic fields). Historically, a number of works have also considered convection in the PNS interior as a means for precipitating explosions by enhancing the neutrino emission from the PNS (Epstein 1979; Wilson and Mayle 1988; Burrows and Lattimer 1988; Wilson and Mayle 1993), but these hopes were not substantiated in subsequent decades. Nonetheless, convection inside the PNS remains important for various aspects of the supernova problem such as the neutrino and gravitational wave signals and the nucleosynthesis conditions in the innermost ejecta.

Since each of these phenomena has proved rich and complex over the years, it is no longer possible to treat them adequately within a chronological narrative of the quest for the supernova explosion mechanism. Nevertheless, ascertaining the explosion mechanism by means of first-principle simulations remains the overriding concern in supernova theory, and it is therefore still useful to recapitulate the progress in supernova explosion modelling from the advent of the first 2D models with a simplified treatment of neutrino heating and cooling in the 1990s (Herant et al. 1992; Shimizu et al. 1993; Yamada et al. 1993; Janka et al. 1993; Herant et al. 1994; Burrows et al. 1995; Janka and Müller 1995, 1996). A more detailed analysis of the individual hydrodynamic phenomena beyond this chronicle of simulations is then provided in Sects. 4.14.7.

Neutrino-driven explosions in 2D Although the 2D simulations of the early and mid-1990s had shown multi-D effects to be helpful for shock revival, these models did not utilize neutrino transport on par with the best available methods for 1D simulations at the time. In a first attempt to better model neutrino heating and cooling in 2D by using the pre-computed neutrino radiation field from a 1D simulation with multi-group flux limited diffusion, Mezzacappa et al. (1998) were unable to reproduce the successful explosions found in earlier 2D models. This led to a resurgence of interest in accurate methods for neutrino transport, culiminating in the development of Boltzmann solvers for relativistic (Yamada et al. 1999; Liebendörfer et al. 2001, 2004) and pseudo-Newtonian simulations (Rampp and Janka 2000, 2002). The explosion problem was then revisited in 2D using various types of multi-group neutrino transport from the mid-2000s onwards. Neutrino-driven explosions were obtained in many of these 2D simulations for a wide range of progenitors (Buras et al. 2006a; Marek and Janka 2009; Müller et al. 2012b, a, 2013; Janka 2012; Janka et al. 2012; Suwa et al. 2010, 2013; Bruenn et al. 2013, 2016; Nakamura et al. 2015; Burrows et al. 2018; Pan et al. 2018; O’Connor and Couch 2018b), though still with significant differences between the various simulation codes.

Challenges and successes in 3D Following isolated earlier attempts at 3D modelling using the “light-bulb” style models of the 1990s (Shimizu et al. 1993; Janka and Müller 1996), or gray flux-limited diffusion (Fryer and Warren 2002), the role of 3D effects in the explosion mechanism was finally investigated vigorously in the last decade, starting again with simple light-bulb models (Nordhaus et al. 2010b; Hanke et al. 2012; Couch 2013; Dolence et al. 2013). Except for spurious results in Nordhaus et al. (2010b), these light-bulb models indicated a similar “explodability” in 2D and 3D. However, subsequent 3D models with rigorous neutrino transport proved more reluctant to explode; indeed the first 3D models of \(11.2\,M_\odot \), \(20\,M_\odot \), and \(27\,M_\odot \) progenitors using multi-group, three-flavour neutrino transport did not explode at all (Hanke et al. 2013; Tamborra et al. 2014a).

Even though various groups have now obtained explosions in 3D simulations, shock revival usually occurs later than in 2D, and often requires additional (and sometimes hypothetical) ingredients to improve the heating conditions or a specific progenitor structure. For low-mass single-star (Melson et al. 2015b; Müller 2016; Burrows et al. 2019) and binary (Müller et al. 2018) progenitors just above the iron-core formation limit, 3D simulations readily yield explosions since the steep drop of the density outside the iron core implies a rapid drop of the accretion rate onto the shock after bounce. For more massive stars the record is mixed. For standard, non-rotating progenitors in the range between \(11.2\,M_\odot \) and \(27\,M_\odot \) and unmodified, state-of-the-art microphysics, no explosions were found in simulations using the Vertex code (Hanke et al. 2013; Tamborra et al. 2014a; Melson et al. 2015a; Summa et al. 2018) and the Flash-M1 code (O’Connor and Couch 2018a). On the other hand, the Oak Ridge group obtained an explosion for a \(15\,M_\odot \) star (Lentz et al. 2015) with their Chimera code, and the Princeton group observed shock revival in eleven out of fourteen models between \(9\,M_\odot \) and \(60\,M_\odot \) (Vartanyan et al. 2019b; Burrows et al. 2019; Radice et al. 2019; Burrows et al. 2020) with the Fornax code. In both cases the accuracy of the microphysics, the neutrino transport, and gravity treatment appears comparable to Vertex. Three-dimensional simulations using other codes (that are constantly evolving!) are more difficult to compare as they involve simplifications in the microphysics or transport compared to Vertex, Chimera, and Fornax, although some of them compensate for this by higher resolution in real space and energy space and a better treatment of gravity. At any rate, results obtained with other codes such as CoCoNuT-FMT, Fugra, Zelmani, and 3DnSNe add to the picture of simulations straddling the verge between successful shock revival (Takiwaki et al. 2012, 2014; Müller 2015; Roberts et al. 2016; Chan et al. 2018; Ott et al. 2018; Kuroda et al. 2018) and failure (Müller et al. 2017a; Kuroda et al. 2018) for standard, non-rotating progenitors and standard or simplified microphysics.

These different results may simply indicate that the neutrino-driven mechanism operates close to the critical threshold for explosion in nature. Observations of supernova progenitors indeed indicate that black hole formation occurs already at relatively low masses down to to \(\mathord {\sim } 15\,M_\odot \) for single stars (Smartt et al. 2009; Smartt 2015). Since the lack of robust explosions in 3D persists to even lower masses, and since strongly delayed explosions in 3D may turn out too weak to be compatible with observations, several groups have explored new avenues towards more robust explosions. Some of the proposed ideas invoke modifications or improvements to the microphysics that ultimately lead to improved neutrino heating conditions, such as strangeness corrections to the neutral-current scattering rate (Melson et al. 2015a) and muonization (Bollig et al. 2017). Other studies have explored purely hydrodynamic effects. Among these, Takiwaki et al. (2016), Janka et al. (2016), Summa et al. (2018) pointed out that rapid progenitor rotation could be conducive to shock revival even without invoking MHD effects. Another idea posits that including seed perturbations from the late convective burning stages can facilitate shock revival. First studied by Couch and Ott (2013) and Müller and Janka (2015) using parametric initial conditions, this “perturbation-aided” mechanism has subsequently been explored further using pre-collapse perturbations from 3D models of the late burning stages, initially with ambiguous results in the leakage simulations of Couch et al. (2015) and then in a number of 3D simulations using multi-group neutrino transport (Müller 2016; Müller et al. 2017a, 2019), where it led to robust explosions over a wider mass range from \(11.8\,M_\odot \) to \(18\,M_\odot \) for single stars.

Neutrino-driven explosion models have thus matured considerably in recent years, but it would be premature to declare the problem of shock revival solved. The discrepancies between the results of different groups have yet to be sorted out, and there is still no “gold standard” among the simulations that combines the best neutrino transport, the best microphysics, 3D progenitor models, and general relativity. Moreover, phenomenological models of neutrino-driven explosions (Ugliano et al. 2012; Pejcha and Thompson 2015; Sukhbold et al. 2016; Müller et al. 2016a) suggest that a different mechanism is still needed to explain hypernova explosions with energies above \(2 \times 10^{51}\, \mathrm {erg}\).

Magnetohydrodynamic simulations The mechanism(s) behind hypernovae likely rely on rapid rotation and strong magnetic fields (Akiyama et al. 2003; Woosley and Bloom 2006), but the importance of magnetic fields may not end there. There may be a continuous transition from neutrino-driven explosions to MHD-driven explosions (Burrows et al. 2007a), and strong magnetic fields may also a role in non-rotating progenitors as an important driving agent or as a subsidiary to neutrino heating (Obergaulinger et al. 2014).

Although the ideas of Akiyama et al. (2003) quickly triggered first 2D MHD core-collapse supernova simulations (e.g., Yamada and Sawai 2004; Sawai et al. 2005; Obergaulinger et al. 2006; Shibata et al. 2006), there is still only a small corpus of magnetorotational supernova explosion models, especially if we focus on models of the entire collapse, accretion, and early explosion phase using reasonably detailed microphysics and disregard parameterized models of relativistic and non-relativistic jets and of collapsar disks. Burrows et al. (2007a) presented 2D simulations of magnetorotational explosions of a \(15\,M_\odot \) progenitor (later followed by MHD simulations of accretion-induced by collapse in Dessart et al. 2007) with the Newtonian radiation-MHD code Vulcan and demonstrated the ready emergence of jets powered by strong hoop stresses for sufficiently strong initial fields. Burrows et al. (2007a) made the important point that these non-relativistic jets are a distinctly different phenomenon from the relativistic jets seen in long gamma-ray bursts (GRBs), which may be formed several seconds after shock revival.

Like most other subsequent simulations, these models relied on parameterized initial conditions with artificially strong magnetic fields to mimic the purported fast amplification of much weaker fields in the progenitor by the magnetorotational instability (Balbus and Hawley 1991; Akiyama et al. 2003). They also imposed the progenitor rotation profile by hand. The 2D studies of Obergaulinger and Aloy (2017, 2020), Bugli et al. (2020) have recently explored variations in the assumed initial field strength and topology and the assumed rotation profiles more thoroughly. While they find considerable variation in the outcome of their models, it is interesting to note that in some instances Obergaulinger and Aloy (2020) even find magnetorotational explosions for the unmodified rotation profile and magnetic field strength of two of the \(35\,M_\odot \) progenitor models from Woosley and Heger (2006), although it is not perfectly clear what the precise geometry of the field in the stellar evolution models ought to be.

The imposition of axisymmetry is an even bigger concern in the case of magnetorotational supernovae than for nuetrino-driven explosion models. Several 3D simulations of MHD-driven explosions have by now been performed, but among these only Obergaulinger and Aloy (2020) included multi-group neutrino transport, whereas the others (Winteler et al. 2012; Mösta et al. 2014b, 2018) employed a leakage scheme. The prospects for successful magnetorotational explosions in 3D are still somewhat unclear. Mösta et al. (2018) reported the destruction of the emerging jets by a kink instability, although the jet can apparently be stabilised if the poloidal field strength is comparable to the toroidal field strength (Mösta et al. 2018). Moreover, the explosion dynamics already depends sensitively on the assumed initial field geometry; strong dipole fields appear to be required for the most powerful explosions (Bugli et al. 2020). Given the vast uncertainties concerning the initial rotation rates, field strengths, and field geometries in the supernova progenitors, considerably more work is necessary before the magnetorotational mechanism can be considered robust even for a small sub-class of progenitors. We will therefore focus only on the hydrodynamics of neutrino-driven explosions in the subsequent discussion.

Fig. 9
figure9

Schematic 1D structure of the supernova core after the formation of the gain region, illustrated by profiles of the density \(\rho \), pressure P, temperature T (left), radial velocity \(v_r\), entropy s, and electron fraction \(Y_\mathrm {e}\) (right). The profiles are taken from a 1D radiation hydrodynamics simulation of the \(20\,M_\odot \) progenitor of Woosley and Heger (2007) at a post-bounce time of \(200\,\mathrm {ms}\). See text for details

Structure of the accretion flow and runaway conditions in spherical symmetry

Before analyzing the role of multi-D phenomena in core-collapse supernovae in greater depth, it is expedient to discuss the structure of the supernova core that emerges once the gain region has formed a few tens of milliseconds after collapse in an idealized, spherically-symmetric picture as shown in Fig. 9. Our discussion closely follows the works of Janka (2001), Müller and Janka (2015) and Müller et al. (2016a) which may be consulted for further details.

Structure of the accretion flow At this stage, the PNS consists of an inner core of about \(0.5\,M_\odot \) (depending on the equation of state) with low entropy, which is surrounded by an extended mantle of about \(1\,M_\odot \) that was heated to entropies of about \(6k_\mathrm {b}/\mathrm {nucleon}\) as the shock propagated through the outer part of the collapsed iron core and most of the Si shell. The mantle extends out to the neutrinosphere at high subnuclear densities, where neutrinos on average undergo their last interaction before escape (for details see Kotake et al. 2006; Janka 2017; Müller 2019b; Mezzacappa 2020). In the atmosphere immediately outside the neutrinosphere radius \(R_\nu \), the pressure P is dominated by non-relativistic baryons, and neutrino interactions are still frequent enough to act as “thermostat” and maintain a roughly isothermal stratification, resulting in an exponential density profile (Janka 2001):

$$\begin{aligned} \rho = \rho _\nu \exp \left[ -\frac{G M m_\mathrm {n}}{ r k_b T_\nu }\left( 1-\frac{R_\nu }{r}\right) \right] , \end{aligned}$$
(42)

where M is the PNS mass, \(R_\nu \), \(T_\nu \), and \(\rho _\nu \) are the neutrinosphere radius, temperature, and density, and \(m_\mathrm {n}\) is the neutron mass. To maintain rough isothermality with the neutrinosphere, the accreted matter must cool as it is advected through the atmosphere. Below a density of about \(10^{10}\, \mathrm {g}\, \mathrm {cm}^{-3}\), the pressure is dominated by relativistic electron-positron pairs and photons, and around this point neutrino heating starts do dominate over neutrino cooling at the gain radius \(R_\mathrm {g}\).Footnote 7 Since the cooling and heating rate scale with \(T^6\propto P^{3/2}\) and \(L_\nu E_\nu ^2/r^2\) in terms of the matter temperature T and the electron-flavor neutrino luminosity \(L_\nu \) and mean energy \(E_\nu \) (appropriately averaged over electron neutrinos and antineutrinos), balance between heating and cooling defines an effective thermal boundary condition for the radiation-dominated gain region further out,

$$\begin{aligned} P^{3/2}\propto \frac{L_\nu E_\nu ^2}{R_\mathrm {g}^2}. \end{aligned}$$
(43)

Before shock revival, the stratification between the gain radius is roughly adiabatic out to the shock,Footnote 8 resulting in power-law profiles \(\rho \propto r^{-3}\) and \(T\propto r^{-1}\) for the temperature and density. Ahead of the shock, the infalling material moves with a radial velocity of \(|v_r|\approx \sqrt{GM/r}\) (i.e., a large fraction of the free-fall velocity), and the density is given in terms of the mass acrretion rate \(\dot{M}\) as \(\rho = \dot{M}/(4\pi r^2 |v_r|)\). In a quasi-stationary situation, the stalled accretion shock will adjust to a radius \(R_\mathrm {sh}\) such that the jump conditions are fulfilled and the post-shock pressure \(P_\mathrm {sh}\) and the pre-shock ram pressure \(P_\mathrm {ram}=\rho v_r^2\) are related by

$$\begin{aligned} P_\mathrm {sh}= \frac{\beta }{\beta -1} P_\mathrm {ram} = \frac{\beta }{\beta -1} \frac{\dot{M}}{4\pi R_\mathrm {sh}^2}\sqrt{\frac{GM}{ R_\mathrm {sh}}} \propto \dot{M} R_\mathrm {sh}^{-5/2}. \end{aligned}$$
(44)

Here \(\beta \) is the compression ratio in the shock, which varies from \(\beta \approx 10\) early on, which is slightly larger than the value of \(\beta =7\) for an ideal gas with adiabatic index \(\gamma =4/3\) because of the dissociation of nuclei in the shock, to \(\beta \approx 4\) during the explosion phase when there is a net release of energy by burning in the shock. Equation (44) immediately implies that the quasi-stationary accretion shock radius increases with the post-shock pressure roughly as \(R_\mathrm {sh}\propto P_\mathrm {sh}^{2/5}\). Recognizing that the heating rate \(\dot{q}_\mathrm {heat}\propto L_\nu T_\nu ^2 r^{-2}\) and the cooling rate \(\dot{q}_\mathrm {cool}\propto T^6 \propto P^{3/2}\) balance at the gain radius, and using the adiabatic stratification in the gain region and the jump conditions, one can go further and derive that the shock radius scales as (Janka 2012; Müller and Janka 2015)

$$\begin{aligned} R_\mathrm {sh} \propto \frac{(L_\nu T_\nu ^2)^{4/9} R_\mathrm {g}^{16/9}}{\dot{M}^{2/3} M^{1/3}}, \end{aligned}$$
(45)

in spherical symmetry in terms of \(L_\nu \), \(T_\nu \), \(R_\mathrm {g}\), M and the mass accretion rate \(\dot{M}\), which is related to the density profile of the progenitor (Woosley and Heger 2012; Müller et al. 2016a).

Conditions for shock revival So far, we have assumed a stationary accretion flow in this picture. The problem of shock revival is, however, related to the breakdown of stationary accretion solutions (Burrows and Goshy 1993; Janka 2001), or more strictly speaking, to the development of non-linearly unstable flow perturbations (Fernández 2012). The transition to runaway shock expansion can be understood in terms of a competition of time scales, namely the advection or residence time \(\tau _\mathrm {adv}\) that the accreted material spends in the gain region, and the time scale \(\tau _\mathrm {heat}\) for unbinding the material in the gain region by neutrino heating (Thompson 2000; Janka 2001; Thompson et al. 2005; Buras et al. 2006a; Murphy and Burrows 2008). These can be computed in terms of the binding energy \(E_\mathrm {g}\) and mass \(M_\mathrm {g}\) of the gain region, the mass accretion rate \(\dot{M}\), and the volume-integrated heating rate \(\dot{Q}_\nu \) as

$$\begin{aligned} \tau _\mathrm {adv}&=\frac{M_\mathrm {g}}{\dot{M}}, \end{aligned}$$
(46)
$$\begin{aligned} \tau _\mathrm {heat}&=\frac{|E_\mathrm {g}|}{\dot{Q}_\nu }. \end{aligned}$$
(47)

Transition to runaway expansion is expected if \(\tau _\mathrm {adv}/ \tau _\mathrm {heat}\gtrsim 1\), which is borne out by 1D light-bulb simulations with a pre-defined neutrino luminosity (Fernández 2012). Alternatively, the runaway condition can be expressed in terms of a critical luminosity \(L_\mathrm {crit}\) above which there are no stationary 1D accretion solutions (Burrows and Goshy 1993). Janka (2012) and Müller and Janka (2015) have pointed out that these two descriptions are essentially equivalent by converting the time-scale criterion into a power law for the critical value of the “heating functional” \(\mathcal {L}=L_\nu E_\nu ^2\),

$$\begin{aligned} \mathcal {L}_\mathrm {crit}= (L_\nu E_\nu ^2)_\mathrm {crit} \propto (M \dot{M})^{3/5} R_\mathrm {g}^{-2/5}. \end{aligned}$$
(48)

Other largely equivalent ways to characterize the onset of a runaway instability (at least in spherical symmetry) are the notion that the Bernoulli parameter reaches zero somehwere in the gain region around shock revival (Burrows et al. 1995; Fernández 2012), and the antesonic condition \(c_\mathrm {s}^2>3/8 GM/r\) (Pejcha and Thompson 2012), which is effectively a condition for the flow enthalpy just like the Bernoulli parameter (Müller 2016).

Impact of multi-dimensional effects on shock revival

Qualitative description This simple picture is useful for qualitatively understanding how multi-D effects modify the spherically-averaged bulk structure of the post-shock flow and hence affect the conditions for shock revival. It is intuitive from Eq. (44) that increasing the post-shock pressure (e.g., by turbulent heat transfer), or adding turbulent or magnetic stresses will increase the shock radius and modify Eq. (45) for the spherically symmetric case. This will then affect the time scales \(\tau _\mathrm {adv}\) and \(\tau _\mathrm {heat}\) and thereby modify the conditions for runaway shock expansion driven by neutrinos. Moreover, certain multi-D phenomena may also facilitate runaway shock expansion more directly by dumping extra energy into the gain region, which may take the form of thermal energy, turbulent kinetic energy, or magnetic energy. This is, of course, only a coarse-grain interpretation of the effect of multi-D effects, which needs to be based on a more careful analysis of the underlying hydrodynamic phenomena.

The studies from the 1990s also outlined qualitative explanations for the beneficial role of multi-D effects. Herant et al. (1994) interpreted convection as of an open-cycle heat engine that continuously pumps transfers energy from the gain radius (where neutrino heating is strongest) further out into the gain region, and Janka and Müller (1996) similarly stress the importance of more effective heat transfer from the gain region to the shock. Herant et al. (1994) argued that large-scale mixing motions are also advantageous because they continue to channel fresh matter to the cooling region during the explosion phase so that the neutrino heating is not quenched when the shock is revived. Finally, Burrows et al. (1995) pointed out what we now subsume under the notion of turbulent stresses: As the convective bubbles collide with the shock surface with significant velocities (or in modern parlance provide “turbulent stresses”) and thereby deform and expand it.

Modified critical luminosity Since then, the impact of multi-D effects has been analyzed more quantitatively. Several studies (Buras et al. 2006a; Murphy and Burrows 2008; Hanke et al. 2012) showed that the advection time scale \(\tau _\mathrm {adv}\) is systematically larger in multi-D, while the onset of runaway shock expansion is still determined by the criterion \(\tau _\mathrm {adv}/\tau _\mathrm {heat}\). This suggests that the runaway is still powered by neutrino heating just as in 1D, and that multi-D effects facilitate explosions facilitate shock revival by somewhat expanding the stationary shock, keeping a larger amount of mass in the gain region, and thereby increasing the heating efficiency.Footnote 9 To a lesser extent, mixing also reduces the binding energy of the gain region (Müller 2016), but this appears to be of secondary importance for shock revival.

Building on the 1D picture from Sect. 4.1, the increase of the quasi-stationary shock radius can be understood as the consequence of additional “turbulence”Footnote 10 terms that arise in a spherical Reynolds or Favre decomposition of the flow, whose importance can be gauged by the square of the turbulent Mach number \(\mathrm {Ma}\) in the gain region (Müller et al. 2012a). Using light-bulb simulations, Murphy et al. (2013) demonstrated quantitatively that the inclusion of Reynolds stresses (“turbulent pressure”) largely accounts for the higher shock radius in multi-D models. The critical role of the turbulent pressure was confirmed by Couch and Ott (2015) using a leakage scheme and by Müller and Janka (2015) with multi-group neutrino transport.

The resulting effect on the critical luminosity can be estimated by including a turbulent pressure termFootnote 11 in Eq. (44), which ultimately leads to (Müller and Janka 2015)

$$\begin{aligned} (L_\nu E_\nu ^2)_\mathrm {crit} \propto (M \dot{M} )^{3/5} R_\mathrm {g}^{-2/5} \left( 1+\frac{4 \mathrm {Ma}^2}{3}\right) ^{-3/5} = (L_\nu E_\nu ^2)_\mathrm {crit,1D} \left( 1+\frac{4 \mathrm {Ma}^2}{3}\right) ^{-3/5}, \end{aligned}$$
(49)

where the critical luminosity in 1D, \((L_\nu E_\nu ^2)_\mathrm {crit,1D}\), is modified by a correction factor containing the turbulent Mach number in the gain region. Although based on a rather simple analytic model, Eq. (49) describes shock revival in 2D (Summa et al. 2016) and 3D models (Janka et al. 2016) remarkably well. This suggests that the critical parameter for increased explodability in multi-D is indeed the turbulent Mach number, although, as argued by Mabanta and Murphy (2018), the larger accretion shock radius may not be due to turbulent pressure alone. Even if other effects such as turbulent heat transport, turbulent dissipation (Mabanta and Murphy 2018), and even turbulent viscosity (Müller 2019a) play a role, one expects a scaling law similar to Eq. (49) simply because any leading-order correction to the 1D jump condition (44) from a spherical Reynolds decomposition will scale with \(\mathrm {Ma}^2\), only with a slightly different proportionality constant than in Eq. (49). The turbulent Mach number itself will be determined by the growth and saturation mechanisms of the non-radial instabilities in the gain region as discussed in the following sections.

Neutrino-driven convection in the gain region

Convection in the gain region develops because neutrino heating establishes a negative entropy gradient. In many respects, this “hot-bubble convection” resembles convection on top of a quasi-hydrostatic spherical background structure as familiar from the earlier phases of stellar evolution, but there are subtle differences because the instability occurs in an accrretion flow.

Condition for instability Under hydrostatic conditions, the Ledoux criterion for convective instability can be written as (Buras et al. 2006b),

$$\begin{aligned} C_\mathrm {L} = \frac{\partial \rho }{\partial r}-\frac{1}{c_\mathrm {s}^2}\frac{\partial P}{\partial r} = \left( \frac{\partial \rho }{\partial s}\right) _{P,Y_\mathrm {e}} \frac{\partial s}{\partial r} + \left( \frac{\partial \rho }{\partial Y_\mathrm {e}}\right) _{P,s} \frac{\partial Y_\mathrm {e}}{\partial r} >0, \end{aligned}$$
(50)

in terms of the gradients of density, pressure, entropy s and electron fraction \(Y_\mathrm {e}\). Using a local stability analysis for a displaced blob, one finds a growth rate \(\mathrm {Im}\,\omega _\mathrm {BV}\), where the Brunt–Väisälä frequency \(\omega _\mathrm {BV}\) is defined as

$$\begin{aligned} \omega _\mathrm {BV}^2=-\frac{g C_\mathrm {L}}{\rho }. \end{aligned}$$
(51)

In a stationary accretion flow, the radial derivatives can be expressed in terms of the time derivatives \(\dot{s}\) and \(\dot{Y}_\mathrm {e}\) and the advection velocity \(v_r\),

$$\begin{aligned} C_\mathrm {L} = \frac{1}{v_r} \left[ \left( \frac{\partial \rho }{\partial s}\right) _{P,Y_\mathrm {e}} \dot{s} + \left( \frac{\partial \rho }{\partial Y_\mathrm {e}}\right) _{P,s} \dot{Y}_\mathrm {e} \right] >0. \end{aligned}$$
(52)

Once the gain region forms around 80–\(100\, \mathrm {ms}\) after bounce, the electron fraction gradient plays a minor role for stability, and the material in the gain region can be well described as a radiation-dominated gas with \(P\propto (\rho s)^{4/3}\) so that

$$\begin{aligned} \omega _\mathrm {BV}^2=-\frac{g \dot{q}_\mathrm {e}}{v_r c_\mathrm {s}^2}. \end{aligned}$$
(53)

This has an important consequence: different from a hydrostatic background, the stability of a heated accretion flow (or outflow) depends on the sign of the advection velocity rather than on the profile of the heating function. Using the aforementioned scaling for the heating rate and assuming a linear velocity profile behind the shock, we obtain an estimate

$$\begin{aligned} \omega _\mathrm {BV}^2 \propto \frac{GM}{R_\mathrm {g}^2}\frac{\langle L_\nu E_\nu ^2\rangle }{R_\mathrm {g}^2} \left( \frac{\beta R_\mathrm {sh}}{G M}\right) ^{1/2} \left( \frac{R_\mathrm {sh}}{R_\mathrm {g}}\right) . \end{aligned}$$
(54)

More importantly, however, advection can stabilize the flow against convection because perturbations only have a finite time to grow as they cross the gain region as pointed out by Foglizzo et al. (2006), who demonstrated that instability is regulated by a parameter \(\chi \),

$$\begin{aligned} \chi = \int \limits _{r_\mathrm {g}}^{r_\mathrm {sh}}\frac{\mathrm {Im}\,\omega _\mathrm {BV}}{|v_r\ |}\mathrm {d}r . \end{aligned}$$
(55)

Instability should only occur for \(\chi \gtrsim 3\). This has indeed been confirmed in a number of parameterized (Scheck et al. 2008; Fernández et al. 2014; Fernández 2015; Couch and O’Connor 2014) and self-consistent simulations (Müller et al. 2012a; Hanke et al. 2013) in 2D and 3D.

Dominant eddy scale Similar to the situation in convective shell burning, the length scale of the most unstable linear mode is determined by the width of the gain region according to Eq. (37) (Foglizzo et al. 2006). In 3D this remains the characteristic length scale of convective eddies during the non-linear saturation stage (e.g., Hanke et al. 2013). Since the ratio of the shock and gain radius typically lies in the range \(R_\mathrm {sh}/R_\mathrm {g}=1.5\)–2 before the heating conditions become close to critical, convection is characterized by medium-scale eddies with angular wavenumbers \(\ell \approx 4\)–8 during the accretion phase (Hanke et al. 2013; Couch and O’Connor 2014). Around and after shock revival, large-scale modes with \(\ell =1\) and \(\ell =2\) emerge. By contrast, 2D simulations of hot-bubble convection tend to develop large-scale (\(\ell =1\) and \(\ell =2\)) vortices during the non-linear stage (Hanke et al. 2012, 2013; Couch 2013; Couch and O’Connor 2014) as a result of the inverse turbulent cascade of 2D turbulence (Kraichnan 1967).

Non-linear saturation The evolution towards shock revival typically proceeds over sufficiently long time scales for hot-bubble convection to reach a quasi-stationary state. Using 2D light-bulb simulations, Murphy et al. (2013) first demonstrated that this quasi-stationary state closely mirrors the situation in stellar convection (cf. Sect. 3.1), i.e., neutrino heating, buoyant driving, and turbulent dissipation balance each other (see also Murphy and Meakin 2011), and as a result the convective luminosity scales with the neutrino heating rate. Alternatively, the quasi-stationary state can be characterized by the notion of marginal stability; the flow adjusts itself that such that the \(\chi \)-parameter for the spherically averaged flow converges to \(\langle \chi \rangle \approx 3\) (Fernández et al. 2014). Müller and Janka (2015) showed that these properties of the non-linear stage result in a scaling law for the convective velocity that is completely analogous to Eq. (36). In 2D simulations, the velocity perturbations \(\delta v\) scale as

$$\begin{aligned} \delta v =(\dot{q}_\nu (R_\mathrm {sh}-R_\mathrm {g}))^{1/3}, \end{aligned}$$
(56)

in terms of the average mass-specific neutrino heating rate \(\dot{q}_\nu \) in the gain region. In 3D, the convective velocities are slightly smaller (Müller 2016),

$$\begin{aligned} \delta v =0.7[\dot{q}_\nu (R_\mathrm {sh}-R_\mathrm {g})]^{1/3}. \end{aligned}$$
(57)

The smaller proportionality constant in 3D can be motivated by the tendency of the forward turbulent cascade to create smaller structures, which decreases the dissipation length and increases the dissipation rate of the flow.

Quantitative effect on shock revival Based on these scaling laws for the convective velocity, Müller and Janka (2015) determined that convective motions should reach a characteristic squared turbulent Mach number \(\mathrm {Ma}\sim 0.3\)–0.45 around the time of shock revival. Using this value in Eq. (49) for the modified critical luminosity, they predict a reduction of the critical luminosity by 15–25% due to convection, which is in the ballpark of the numerical results (Murphy and Burrows 2008; Hanke et al. 2012; Couch 2013; Dolence et al. 2013; Fernández et al. 2014; Fernández 2015)

One might also be tempted to use Eqs. (56) and (57) to explain the lower explodability of self-consistent 3D models compared to their 2D counterparts. The nature of the differences between 2D and 3D is more complicated, however, since the critical luminosity for shock revival is roughly equal in 2D and 3D light-bulb simulations. Evidently, there are effects that partly compensate for the smaller convective velocities in 3D in some situations: the forward turbulent cascade (Melson et al. 2015b) and the different behavior of the Kelvin–Helmholtz instability in 3D (Müller 2015) affect the interaction between updrafts and downdrafts and can result in reduced cooling in 3D (Melson et al. 2015b). Moreover, compatible with earlier studies of the Rayleigh–Taylor instability for planar geometry (Yabe et al. 1991; Hecht et al. 1995), Kazeroni et al. (2018) found a faster growth of convective plumes and more efficient mixing in a planer toy model of neutrino-driven convection. Along similar lines, Handy et al. (2014) appealed to the higher volume-to-surface ratio of convective plumes in 3D to explain the reduced critical luminosity in 3D in their light-bulb simulations. It is plausible that these factors establish a similar critical luminosity threshold for shock revival in light-bulb simulations, but they do not explain the much more decisive effect of dimensionality in self-consistent simulations. One possible explanation lies in the fact that explosions in self-consistent models usually occur in a short non-stationary phase with a rapidly decreasing mass accretion rate and neutrino luminosity around the infall of the Si/O shell interface; under these conditions the more sluggish emergence of large-scale modes in 3D due to the forward cascade may delay or inhibit shock revival (Lentz et al. 2015). Moreover, the more rapid response of the mass accretion rate to shock expansion in 3D (Melson et al. 2015b; Müller 2015) might be hurtful around shock revival because this effect can reduce the accretion luminosity and hence undercut neutrino heating before a runaway situation can develop.

Resolution dependence and turbulence Because of the turbulent nature of neutrino-driven convection, the spectral properties of the flow and the the resolution dependence in simulations have received considerable attention in the literature. Most self-consistent models with multi-group neutrino transport can only afford a limited resolution (about \(1.5^\circ \)\(2^\circ \) in angle and about 100 zones or less in the gain region) and do not reach a fully developed turbulent state, with Handy et al. (2014) going so far as to speak of “perturbed laminar flow” instead. Various authors (Abdikamalov et al. 2015; Radice et al. 2015, 2016) have argued that considerably higher resolution is needed to obtain clean turbulence spectra with a developed inertial range and a Kolmogorov spectrum and raised concerns that a pile-up of kinetic energy (“bottleneck effect”) at small scales might affect the overall dynamics. However, the detailed spectral properties of the flow are usually not critical, and integral properties of the flow are more important for the impact of convection on shock revival. The resolution dependence nonetheless remains a concern for the question of shock revival, as some 3D resolution studies (Hanke et al. 2012; Abdikamalov et al. 2015; Roberts et al. 2016) found a trend towards decreasing explodability with increased resolution. Recent work by Melson et al. (2020) resolved most of these concerns in a resolution study using light-bulb simulations. They demonstrated that the resolution dependence in Hanke et al. (2012) was a spurious effect connected to details in their light-bulb scheme, and instead found a trend towards increased explodability at higher resolution. In rough agreement with Handy et al. (2014), Melson et al. (2020) found that the overall flow dynamics converges at an angular resolution of about \(1^\circ \), which is not far from what most self-consistent simulations can afford (but one still needs to bear in mind that the resolution requirements depend on the details of the numerical scheme, cf. Sect. 2.1.2). Melson et al. (2020) also pointed out that neutrino drag plays a non-negligible role in the gain region, so that merely increasing the resolution does not add physical realism beyond numerical Reynolds numbers of a few hundred unless neutrino drag is also included as a non-ideal effect. Melson et al. (2020) speculate that findings of decreased explodability with higher resolution in Cartesian 3D models may be explained because the grid-induced seed asphericities are lower at higher resolution.

Fig. 10
figure10

The advective-acoustic mechanism for the standing accretion shock instability. Upward propagating acoustic waves (blue) generate vorticity perturbations (red) as they interact with the accretion shock (orange circle). The vorticity perturbations are advected downward with the accretion flow to the PNS surface where they generate acoustic waves due to advective-acoustic coupling in the steep density gradient. Instability for a given mode obtains if the product of the amplitude ratios \({\mathscr {Q}}_\mathrm {sh}\) and \({\mathscr {Q}}_\nabla \) for between ingoing and outgoing waves at the shock and PNS surface satisfies \({\mathscr {Q}}_\mathrm {sh} {\mathscr {Q}}_\nabla >1\). Image reproduced with permission from Guilet and Foglizzo (2012), copyright by the authors

The standing accretion shock instability

Using adiabatic 2D simulations of spherical accretion shocks, the seminal work of Blondin et al. (2003) demonstrated that another instability, dubbed “SASI” (standing accretion shock instability), can operate in the supernova core even without a convectively unstable gradient in the gain region. This instability takes the form of large-scale (\(\ell =1\) and sometimes \(\ell =2\)) oscillatory motions of the shock, and it was immediately realized that it can support shock revival in a similar manner as convection. In early 2D supernova simulations, the SASI was sometimes confused with convection because the two phenomena share superficial similarities like high-entropy bubbles and low-entropy accretion downflows. However, the SASI is set apart from convection by dipolar (and sometimes quadrupolar) flow. Foglizzo et al. (2006) pointed out that for the typical ratio between the shock and gain radius in the pre-explosion phase there are no unstable convective modes with \(\ell =1,2\) in the gain region; instead one finds \(\ell \approx 4\)–8 according to Eq. (37), and for \(\chi <3\) the flow becomes stable against convection altogether without strong perturbations (see also Sect. 4.3). This implies that a different instability mechanism—the one discovered by Blondin et al. (2003)—must be responsible for the \(\ell =1\) and \(\ell =2\) modes in supernova models with small ratios \(R_\mathrm {sh}/R_\mathrm {g}\).

Amplification mechanism The stability of accretion shocks had in fact already been analyzed earlier in the context of accretion onto compact objects using linear perturbation theory (Houck and Chevalier 1992; Foglizzo 2001, 2002), which provided useful groundwork for identifying the physical mechanism behind the SASI and explaining the \(\ell =1,2\) nature of the instability from its dispersion relation. The accepted picture is now that of a vortical-acoustic cycle (Foglizzo 2002; Foglizzo et al. 2007): Shock deformation generates vorticity waves that are advected towards the PNS surface, and due the deceleration of the flow in the steep density gradient below the gain region (see Fig. 9), these vorticity waves in turn generate outgoing sound waves that again couple to acoustic waves at the shock (Fig. 10). Although a purely acoustic amplification cycle has been considered as well (Blondin and Mezzacappa 2006), analytical (Laming 2007, 2008; Yamasaki and Yamada 2007) and numerical (Ohnishi et al. 2006; Scheck et al. 2008; Fernández and Thompson 2009a, b) studies have on the whole supported the advective-acoustic cycle, culminating in the work of Guilet and Foglizzo (2012) who sharpened and summarized the arguments in favor of this amplification mechanism.

Different from convection, the SASI is an oscillatory instability with a periodicity \(T_\mathrm {SASI}\) that is set by the sum of the advective and acoustic crossing times \(\tau _\mathrm {adv}\) and \(\tau _\mathrm {ac}\) between the shock and the deceleration region (Foglizzo et al. 2007),

$$\begin{aligned} T_\mathrm {SASI} = \tau _\mathrm {adv}+\tau _\mathrm {ac} = \int _{r_\nabla }^{r_\mathrm {sh}} \frac{\mathrm {d}r}{|v_r|}+ \int _{r_\nabla }^{r_\mathrm {sh}} \frac{\mathrm {d}r}{c_s-|v_r|}. \end{aligned}$$
(58)

The advective time scale usually dominates, and neglecting a weak dependence on the PNS mass, one can determine empirically that the period of the \(\ell =1\) mode of the SASI roughly scales as (Müller and Janka 2014),

$$\begin{aligned} T_\mathrm {SASI} = 19 \, \mathrm {ms} \left( \frac{R_\mathrm {sh}}{100 \, \mathrm {km}}\right) ^{3/2} \ln \left( \frac{R_\mathrm {sh}}{R_\mathrm {PNS}} \right) , \end{aligned}$$
(59)

where \(R_\mathrm {PNS}\) is the PNS radius. SASI-induced fluctuations in the neutrino emission (Lund et al. 2010; Tamborra et al. 2013; Müller and Janka 2014; Müller et al. 2019) and gravitational waves (Kuroda et al. 2016a, 2018; Andresen et al. 2017) could provide direct observational confirmation for the SASI if this frequency can be identified in spectrograms of the neutrino or gravitational wave signal.

The growth rate of the SASI is set both by the period \(T_\mathrm {SASI}\) and the quality factor \({\mathscr {Q}}\) of the amplification cycle (Foglizzo et al. 2006, 2007),

$$\begin{aligned} \omega _\mathrm {SASI} =\frac{\ln |{\mathscr {Q}}|}{T_\mathrm {SASI}}, \end{aligned}$$
(60)

where \({\mathscr {Q}}\) depends on the coupling between vortical and acoustic waves at the shock and in the deceleration region, and hence on the details of the density profile and the thermodynamic stratification. Nuclear dissociation and recombination also affect the SASI growth rate and saturation amplitude (Fernández and Thompson 2009a, b).

Interplay of SASI, convection, and neutrino heating In reality, the SASI grows in an accretion flow with neutrino heating, and in 2D, it is not trivial at first glance to distinguish SASI and convection in the non-linear phase where both instabilities lead to a similar \(\ell =1\) “sloshing” flow. Nonetheless, a clear distinction between a SASI- and convection-dominated regime already emerged in 2D models using gray (Scheck et al. 2008) or multi-group (Müller et al. 2012a) neutrino transport, or simpler light-bulb models (Fernández et al. 2014): Different from convection-dominated models SASI-dominated models clearly show an oscillatory growth of the multipole coefficients of the shock surface and coherent wave patterns in the post-shock cavity in the linear regime, and maintain a rather clear quasi-periodicity even in the non-linear regime. The distinction between the two different regimes tends to become more blurred around shock revival, when large-scale convective modes emerge and the periodicity of the SASI oscillations is eventually broken.

The criterion \(\chi \approx 3\) roughly separates the two regimes even though unstable SASI modes can in principle exist above this value. The reason is likely that convection destroys the coherence of the waves involved in the SASI amplification cycle (Guilet et al. 2010) if \(\chi >3\). For \(\chi <3\), high quality factors \(\ln |{\mathscr {Q}}|\sim 2\) can be reached and result in rapid SASI growth. In terms of PNS parameters and progenitor parameters, such low values of \(\chi <3\) are encountered in case of rapid PNS and shock contraction (Scheck et al. 2008) and appear to occur preferentially in high-mass progenitors with high mass accretion rates (Müller et al. 2012b), although a detailed survey of the progenitor-dependence of the \(\chi \)-parameter is still lacking.

Three-dimensional simulations with neutrino transport (Hanke et al. 2013; Tamborra et al. 2014b; Kuroda et al. 2016a; Müller et al. 2017a; Ott et al. 2018; O’Connor and Couch 2018a) as well as simplified leakage and light-bulb models (Couch and O’Connor 2014; Fernández 2015) show an even cleaner distinction between the SASI- and convection-dominated regimes for several reasons. The convective eddies remain smaller in the non-linear stage than in 2D because of the forward cascade, and without the constraint of axisymmetry, the convective flow is not prone to artificial oscillatory sloshing motions. The SASI, on the other hand, exhibits a cleaner periodicity prior to shock revival in 3D, and can develop a spiral mode that is very distinct from convective flow (e.g., Blondin and Mezzacappa 2007; Fernández 2010; Hanke et al. 2013; see also Sect. 5.3 for possible implications on neutron star birth periods). Self-consistent models show that the post-shock flow can transition back and forth between the convection- and SASI-dominated regime as the accretion rate and PNS parameters, and hence the \(\chi \)-parameter change (Hanke et al. 2013).

Saturation mechanism Guilet et al. (2010) argued that parasitic Kelvin–Helmholtz and Rayleigh–Taylor instabilities are responsible for the non-linear saturation of the SASI, and showed that this mechanism can explain the saturation amplitudes in the adiabatic simulations of Fernández and Thompson (2009b). Assuming that the Kelvin–Helmholtz instability is the dominant parasitic mode in 3D, one can derive (Müller 2016) a scaling law for the turbulent velocity fluctuations \(\delta v\) in the saturated state,

$$\begin{aligned} \delta v \sim \omega _\mathrm {SASI} (R_\mathrm {sh}-R_\mathrm {g}) \sim \frac{\ln |{\mathscr {Q}}| (R_\mathrm {sh}-R_\mathrm {g})}{\tau _\mathrm {adv}} \sim \ln {\mathscr {Q}} |\langle v_r\rangle |, \end{aligned}$$
(61)

which is in good agreement with self-consistent 3D simulations. Interestingly, this scaling results in similar turbulent velocities as in the convection-dominated regime for conditions typically encountered in supernova core (Müller 2016).

The saturation of the SASI can also be understood as a self-adjustment to marginal stability (Fernández et al. 2014), which is a closely related concept. As the SASI grows in amplitude, the flow is driven towards \(\langle \chi \rangle \approx 3\), but stays slightly below this critical value (Fernández et al. 2014).

Effect on shock revival The SASI provides similar beneficial effects as convection to increase the shock radius and bring the accretion flow closer to a neutrino-driven runaway, i.e., it generates turbulent pressure, brings high-entropy bubbles to large radii, channels cold matter towards the PNS, and converts turbulent kinetic energy thermal energy throughout the gain region by turbulent dissipation. Due to the different instability mechanism (which feeds on the energy of the accretion flow directly instead of the neutrino energy deposition), and the different flow pattern (which affects the rate of turbulent dissipation), the quantitative effect on shock revival can be different from convection. Using light-bulb simulations Fernández (2015) indeed found a significantly lower critical luminosity in the SASI-dominated regime than in the convection-dominated regime and a lower critical luminosity in 3D by \(\sim \, 20\%\) compared to 2D, which he ascribed to the ability of the spiral mode to store more kinetic energy than sloshing modes in 2D. An even bigger difference to convective models (albeit with a different and very idealized setup) was found by Cardall and Budiardja (2015). Self-consistent simulations, on the other hand, have not found higher explodability in 3D in the SASI-dominated regime (Melson et al. 2015a). The reason for this discrepancy could, e.g., lie in the feedback of shock expansion on neutrino heating, but is not fully understood at this stage. Cardall and Budiardja (2015) also observed considerably more stochastic variations in shock revival in the SASI-dominated regime in their idealized models (i.e., a smeared-out critical luminosity threshold), but it again remains to be seen whether this is borne out by self-consistent 3D models, where the SASI oscillations tend to be of smaller amplitude and shorter period than in Cardall and Budiardja (2015).

Perturbation-aided explosions

Progenitor asphericities from convective shell burning can aid shock revival by affecting both the growth and saturation of convection of the SASI. That a higher level of seed perturbations leads to a faster growth of non-radial instabilities behind the shock and thereby fosters explosions (as in the early studies of Couch and Ott 2013; Couch et al. 2015) may be intuitive, but appears less important in practice. In self-consistent simulations, shock revival typically occurs only once convection or the SASI have already reached the stage of non-linear saturation, and it is rather the permanent “forcing” by infalling perturbations that matters (Müller and Janka 2015; Müller et al. 2017a). In either case, it is useful to separately consider (a) how the initial perturbations in the porgenitor are translated to perturbations ahead of the shock, and (b) how the infalling perturbations interact with the shock and the post-shock flow.

Initial state and infall phase Typically, the Si and O shell (and sometimes a Ne shell) are the only active convective shells that can reach the shock at a sufficiently early post-bounce time to affect shock revival. As described in Sect. 3.2, these shells are characterized by Mach numbers \(\mathrm {Ma}_\mathrm {prog} \sim 0.1\) with significant variations between different shells and progenitors, and can have a wide range of dominant angular wave numbers \(\ell \). Due to its subsonic nature, the flow is almost solenoidal with \(\nabla \cdot (\rho \mathbf {v})\approx 0\), and density perturbations \(\delta \rho /\rho \sim \mathrm {Ma}_\mathrm {prog}^2\) are small within convective zones. Viewed as a superposition of linear waves, the convective flow consists mostly of vorticity and entropy waves with little contribution from acoustic waves.

From analytic studies of perturbed Bondi accretion flows in the limit \(r \rightarrow 0\) in a broader context (Kovalenko and Eremin 1998; Lai and Goldreich 2000; Foglizzo and Tagger 2000), it is known that such initial perturbations are amplified during infall, and that acoustic waves are generated from the vorticity and entropy perturbations. Estimating the pre-shock perturbations for the problem at hand (Takahashi and Yamada 2014; Müller and Janka 2015; Abdikamalov and Foglizzo 2020) involves some subtle differences, but the upshot is rather simple: Advective-acoustic coupling generates strong acoustic perturbations ahead of the shock that scale linearly with the convective Mach number at the pre-collapse stage (Müller and Janka 2015; Abdikamalov and Foglizzo 2020),

$$\begin{aligned} \delta P/P \sim \delta \rho /\rho \sim \mathrm {Ma}_\mathrm {prog}. \end{aligned}$$
(62)

According to simulations (Müller et al. 2017a) and analytic theory (Abdikamalov and Foglizzo 2020), this scaling is roughly independent of the wave number \(\ell \).Footnote 12

Fig. 11
figure11

Interaction of infalling perturbations with the shock and the post-shock flow, illustrated by snapshots of the entropy (in units of \(k_\mathrm {b}/\mathrm {nucleon}\), left panel) and the absolute value of the non-radial velocity (in units of \(\mathrm {km}\, \mathrm {s}^{-1}\), right panel) in the \(12.5\,M_\odot \) model of Müller et al. (2019) at a post-bounce time of \(510 \, \mathrm {ms}\). The left panel also shows the deformation of the isodensity surface with \(\rho = 7 \times 10^{6}\, \mathrm {g}\, \mathrm {cm}^{-6}\) (red curve). Due to the infalling density perturbations, the pre-shock ram pressure is anisotropic and creates a protrusion of the shock. Additional energy is pumped into non-radial motions in the gain region both because of substantial lateral velocity perturbations ahead of the shock and because of the oblique infall of material through the deformed shock

Shock-turbulence interaction and forced shock deformation The infalling perturbations affect the shock and the post-shock flow in several ways (Müller and Janka 2015; Müller et al. 2016b). They provide a continuous flux of acoustic and tranverse kinetic energy into the gain region, and also create post-shock density perturbations that will be converted into turbulent kinetic energy by buoyancy. Moreover, the shock becomes deformed due to the anisotropic ram pressure (Fig. 11), which results in fast lateral flow behind the shock, i.e., in the generation of additional transverse kinetic energy. Thus, more violent turbulent flow can be maintained in the gain region, which is conducive to shock revival [cf. Eq. (49)]. If the infalling perturbations are of large scale, the deformation of the shock creates large and stable high-entropy bubbles. This is also helpful for shock revival since runaway shock expansion in multi-D appears to require the formation of such large bubbles with sufficient buoyancy to rise and expand against the supersonic drag of the infalling material (Fernández et al. 2014; Fernández 2015).

There is as yet no comprehensive quantitative theory for the interaction of infalling perturbations with the shock and the instabilities in the gain region, but several studies have investigated aspects of the problem. Using order-of-magntitude estimates, Müller et al. (2016b) argued that turbulent motions are primarily boosted by the action of buoyancy on the injected post-shock density perturbations. This hypothesis is supported by controlled parameterized simulations of shock-turbulence interactions in planar geometry (Kazeroni and Abdikamalov 2019). Müller et al. (2016b) also attempted to derive a correction for the saturation value of turbulent kinetic energy depending on the convective Mach number \(\mathrm {Ma}_\mathrm {prog}\) and wave number \(\ell \) in the progenitor. They predicted a reduction of the critical luminosity functional \(\mathcal {L}_\mathrm {crit}=(L_\nu E_\nu ^2)_\mathrm {crit}\) by

$$\begin{aligned} \frac{\delta \mathcal {L}_\mathrm {crit}}{\mathcal {L}_\mathrm {crit}} \approx 0.47 \frac{\mathrm {Ma}_\mathrm {prog}}{\ell \eta _\mathrm {heat} \eta _\mathrm {acc}} \end{aligned}$$
(63)

in terms of the heating efficiency \(\eta _\mathrm {heat}\) and the accretion efficiency \(\eta _\mathrm {heat}=L/(GM\dot{M}/R_\mathrm {g})\). However, the analysis of Müller et al. (2016b) did not account in detail for the interaction of the infalling perturbations with the shock. This has been investigated using linear perturbation theory (Takahashi et al. 2016; Abdikamalov et al. 2016, 2018; Huete et al. 2018; Huete and Abdikamalov 2019). As a downside, this perturbative approach cannot easily capture the non-linear interaction of the injected perturbations with fully developed neutrino-driven convection and the SASI, but Huete et al. (2018) recently incorporated the effects of buoyancy downstream of the shock. The more sophisticated treatment of Huete et al. (2018) predicts a similar effect size as Eq. (63).

Phenomenology of perturbation-aided explosions Whatever its theoretical justification, Eq. (63) successfully captures trends seen in 2D and 3D simulations of perturbation-aided explosions starting from parameterized initial conditions or 3D progenitor models. Both high Mach numbers \(\gtrsim 0.1\) and large-scale convection with \(\ell \lesssim 4\) are required for a significant beneficial effect on the heating conditions (Müller and Janka 2015). In this case, the perturbations can be the decisive factor for shock revival as in the \(18\,M_\odot \) model of Müller et al. (2017a). In leakage-based models with high heating efficiency early on (Couch and Ott 2013; Couch et al. 2015), the effect is smaller, especially if the pre-collapse asphericities are restricted to medium-scale modes as in octant simulations (Couch et al. 2015).

By now, there is a handful of exploding supernova models that use multi-group neutrino transport and 3D progenitor models (Müller et al. 2017a, 2019). While this is encouraging, more 3D simulations are needed to determine to what extent convective seed perturbations generally contribute to robust explosions. At present, one can nonetheless extrapolate the effect size based on the properties of convective shells in 1D stellar evolution models using Eq. (63). Analysing over 2000 supernova progenitors computed with the Kepler code Collins et al. (2018) predict a substantial reduction of the critical luminosity due to perturbation by 10% or more in the mass range between \(15\,M_\odot \) and \(27\,M_\odot \), and in isolated low-mass progenitors. Below \(15\,M_\odot \), the expected reduction is usually 5% or less, which could still make the convective perturbations one of several important ingredients for robust explosions. In the vast majority of progenitors, only asphericities from oxygen shell burning are expected to have an important dynamic effect.

Outlook: rotation and magnetic fields in neutrino-driven explosions

Earlier on, we already briefly touched simulations of magnetorotational explosion scenarios and the uncertainties that still beset this mechanism. It is noteworthy that rotation and magnetic fields could also play a role within the neutrino-driven paradigm.

Rotationally-supported explosions Since early attempts to study the impact of rotation on neutrino-driven explosions either employed a simplified neutrino treatment (e.g., Kotake et al. 2003; Fryer and Warren 2004; Nakamura et al. 2014; Iwakami et al. 2014) or were restricted to 2D in the case of models with multi-group transport (Walder et al. 2005; Marek and Janka 2009; Suwa et al. 2010), more robust conclusions had to wait for 3D simulations with multi-group neutrino transport (Takiwaki et al. 2016; Janka et al. 2016; Summa et al. 2018). The 3D simulations indicate that the overall effect of rapid rotation is to support neutrino-driven explosions. Centrifugal support reduces the infall velocities and hence the average ram pressure at the shock (Walder et al. 2005; Janka et al. 2016; Summa et al. 2018). Moreover, 3D neutrino hydrodynamics simulations of rotating models tend to develop a strong spiral SASI (Janka et al. 2016; Summa et al. 2018). This is in line with analytic theory (Yamasaki and Foglizzo 2008) and idealized simulations (Iwakami et al. 2009; Blondin et al. 2017; Kazeroni et al. 2018), which demonstrated that rotation enhances the growth rate of the prograde spiral mode and stabilises the retrograde mode. For sufficiently rapid rotation, an even more violent spiral corotation instability can occur (Takiwaki et al. 2016). There is also a subdominant adverse effect, since lower neutrino luminosities and mean energies at low latitudes close to the equatorial plane are detrimental for shock revival (Walder et al. 2005; Marek and Janka 2009; Summa et al. 2018), which is particularly relevant since the explosion tends to be aligned with the equatorial plane in the case of rapid rotation (Nakamura et al. 2014). Summa et al. (2018) found that the overall combination of these effects can be encapsulated by a further modification of the critical luminosity,

$$\begin{aligned} (L_\nu E_\nu ^2)_\mathrm {crit} = (L_\nu E_\nu ^2)_\mathrm {crit,1D} \left( 1+\frac{4 \mathrm {Ma}^2}{3}\right) ^{-3/5} \sqrt{1-\frac{j^2}{GM R_\mathrm {sh}}}. \end{aligned}$$
(64)

Here j is the spherically-averaged angular momentum of the shell currently falling through the shock. The last factor accounts for the reduced pre-shock velocities, and the effect of stronger non-radial flow in the gain region is implicitly (but not predictively) accounted for in the turbulent Mach number.

Magnetic fields without rotation Given the expected pre-collapse spin rates, rotation is unlikely to have a major impact in the vast majority of supernova explosions. It is harder to exclude a significant role of magnetic fields a priori. Even if the progenitor core rotates slowly and does not have strong magnetic fields, convection and the SASI might furnish some kind of turbulent dynamo process that could generate dynamically relevant fields in the gain region. There could also be other processes to provide dynamically relevant magnetic fields, e.g., the accumulation of Alfvén waves at an Alfvén surface (Guilet et al. 2011) or the injection of Alfvén waves generated in the PNS convection zone (Suzuki et al. 2008).

Fig. 12
figure12

Entropy s in \(k_\mathrm {b}/\mathrm {nucleon}\) (left panel) and the logarithm \(\log _{10} P_\mathrm {B}{/}P_\mathrm {gas}\) of the ratio between the magnetic pressure \(P_\mathrm {B}\) and the gas pressure \(P_\mathrm {gas}\) (right panel) in a 3D simulation (left half of panels) and a 2D simulation (right half of panels) of the slowly rotating progenitor \(15\,M_\odot \) progenitor m15b6 of Heger et al. (2005) with the CoCoNuT-FMT code. The initial field is assumed to be combination of a dipolar poloidal field and a toroidal field. Outside convective zones, the field strength is taken from the progenitor, inside convective zones, the magnetic pressure is set to a fraction of \(10^{-4}\) of the thermal pressure. The figures shows meridional slices \(140 \, \mathrm {ms}\) after bounce. Field amplification is driven by convection. Strong fields are generated in regions of strong shear, but these strong field are highly localized, and the total magnetic energy in the gain region remains much smaller than the turbulent kinetic energy and thermal energy

The simulations available so far do not suggest that sufficiently high field strengths can be reached by a small-scale turbulent dynamo. In idealized 2D and 3D simulations of Endeve et al. (2010, 2012), the SASI indeed drives a small-scale turbulent dynamo, and strong field amplification occurs locally up to equipartition and super-equipartition field strengths, especially when a strong spiral mode develops. On larger scales, the magnetic field energy remains well below equipartition, however, and does not become dynamically important. The total magnetic energy in the gain region remains one order of magnitude smaller than the turbulent kinetic energy, and the field does not organize itself into large-scale structures. The situation is similar in the 2D neutrino hydrodynamics simulations of non-rotating progenitors of Obergaulinger et al. (2014) for initial field strengths of up to \(10^{11}\, \mathrm {G}\), with even lower ratios between the total magnetic and turbulent kinetic energy in the gain region. Only for initial field strengths of \(\mathord {\sim }10^{12}\, \mathrm {G}\), which yields magnetar-strength fields after collapse, do Obergaulinger et al. (2014) find that magnetic fields become dynamically important and accelerate shock revival. If the fossil field hypothesis for magnetars is correct and the fields of the most strongly magnetized main-sequence stars translate directly to supernova progenitor and neutron star fields by flux conservation (Ferrario and Wickramasinghe 2006), such conditions for magnetically-aided explosions might be still realized in nature in a substantial fraction of core-collapse events.

Ultimately, a thorough exploration of resolution effects and initial field configurations in the convection- and SASI-dominated regime will be required in 3D to confidently exclude a major role of magnetic field in weakly magnetized, slowly rotating progenitors. First tentative results from 3D MHD simulations with neutrino transport (Fig. 12) suggest a picture of fibril flux concentrations with equipartition field strengths, and sub-equipartition fields in most of the volume, akin to the situation in solar convection (e.g., Solanki et al. 2006).

Fig. 13
figure13

LESA instability in a simulation of an \(18\,M_\odot \) star at time of \(453 \, \mathrm {ms}\) after bounce, illustrated by 2D slices showing the electron fraction \(Y_\mathrm {e}\) (left) and the radial velocity in units of \(\mathrm {km}\,\mathrm {s}^{-1}\) in the PNS convection zone. Note that the \(Y_\mathrm {e}\) distribution in the PNS convection zone between radii of \(10\, \mathrm {km}\) and \(20\, \mathrm {km}\) shows a clear dipolar asymmetry, whereas the radial velocity field is dominated by small-scale modes superimposed over a much weaker dipole mode. Image repoduced with permission from Powell and Müller (2019), copyright by the authors

Proto-neutron star convection and LESA instability

Prompt convection Convective instability also develops inside the PNS. As already recognized in the late 1980s (Bethe et al. 1987; Bethe 1990), a first episode of “prompt convection” occurs within milliseconds after bounce around a mass coordinate of \(\sim 0.8\,M_\odot \) as the shock weakens and a negative entropy gradient is established. The negative entropy gradient is, however, quickly smoothed out, and the convective overturn has no bearing on the explosion mechanism, although it can leave a prominent signal in gravitational waves (see reviews on the subject; Ott 2009; Kotake 2013; Kalogera et al. 2019).

Proto-neutron star convection Convection inside the PNS is triggered again latter as neutrino cooling establishes unstable lepton number (Epstein 1979) and entropy gradients (see profiles in Fig. 9). PNS convection was investigated extensively in the 1980s and 1990s as a possible means of enhancing the neutrino emission from the PNS, which would boost the neutrino heating and thereby aid shock revival (e.g., Burrows 1987; Burrows and Lattimer 1988; Wilson and Mayle 1988, 1993; Janka and Müller 1995; Keil et al. 1996). In particular, Wilson and Mayle (1988, 1993) assumed that PNS convection operates as double-diffusive “neutron finger” instability that significantly increases the neutrino luminosity.

None of the modern studies of PNS convection since the mid-1990s (Keil et al. 1996; Buras et al. 2006a; Dessart et al. 2006) found a sufficiently strong effect of PNS convection on the neutrino emission for a significant impact on shock revival. PNS convection indeed increases the heavy flavor neutrino luminosity by \(\mathord {\sim } 20\%\) at post-bounce times of \(\gtrsim 150 \, \mathrm {ms}\), leaves the electron neutrino luminosity about equal, but tends to decrease the electron antineutrino luminosity, and reduces the mean energy of all neutrino flavors (Buras et al. 2006a). This can be explained by the effects of PNS convection on the bulk structure of the PNS, namely a modest increase of the PNS radius and a higher electron fraction (due to mixing) close to the neutrinosphere of \(\nu _e\) and \(\bar{\nu }_e\) (Buras et al. 2006a). Convective instability appears to be governed by the usual Ledoux criterion and does not develop as a double-diffusive instability in the simulations.

LESA instability Although PNS convection does not have a decisive influence on shock revival, its indirect effect on the gain region is quantitatively important; effectively PNS convection changes the inner boundary condition for the flow in the gain region. PNS convection also has an important impact on the neutrino signal from the PNS cooling phase (e.g., Roberts et al. 2012; Mirizzi et al. 2016), and may provide a sizable contribution to the gravitational wave signal (Marek et al. 2009; Yakunin et al. 2010; Müller et al. 2013; Andresen et al. 2017; Morozova et al. 2018).

Moreover, starting with (Tamborra et al. 2014a), the dynamics of PNS convection has proved more intricate upon closer inspection in recent years with potential repercussions on nucleosynthesis and gravitational wave emission. In their 3D simulations, Tamborra et al. (2014a) noted that a pronounced \(\ell =1\) asymmetry in the electron fraction develops in the PNS convection zone, which leads to a sizable anisotropy in the radiated lepton number flux, i.e., the fluxes of electron neutrinos and antineutrinos show a dipole asymmetry with opposite directions. The direction of the dipole varies remarkably slowly compared to the characteristic time scales of the PNS. Curiously, no such pronounced dipole was seen in the velocity field, which remained dominated by small-scale eddies. This is very different from convection in the gain region or the convective burning in the progenitors, where the asymmetries in the entropy and the composition are reflected in the velocity field as well. This unusual phenomenon, which is illustrated in Fig. 13, has been christened LESA (“Lepton Number Emission Self-Sustained Asymmetry”), and could have important repercussions on the composition of the ejected matter, whose \(Y_\mathrm {e}\) is sensitive to the differences in electron neutrino and antineutrino emission.

Since then this phenomenon has been reproduced by many 3D simulations using very different methods for neutrino transport (O’Connor and Couch 2018a; Janka et al. 2016; Glas et al. 2019; Powell and Müller 2019; Vartanyan et al. 2019a), and even in the 3D leakage models of Couch and O’Connor (2014) The dipolar \(Y_\mathrm {e}\) asymmetry has also been seen in the 2D Boltzmann simulations of Nagakura et al. (2019). This demonstrates that the LESA is a robust phenomenon; claims that it depends on the details of neutrino transport (Nagakura et al. 2019) are not convincing.

The nature of this instabiity is still not fully understood. Tamborra et al. (2014a) initially suggested a feedback cycle between \(\ell =1\) shock deformation, a dipolar asymmetry in the accretion flow, and a dipolar asymmetry in the lepton fraction in the PNS convection zone. However, more recent studies suggest that the LESA does not depend on an external feedback cycle between asymmetries in the accretion flow and asymmetries in the PNS convection zone. Glas et al. (2019) demonstrated that LESA can be even more pronounced in exploding low-mass progenitor models with low accretion rates, which suggests that the mechanism behind LESA works within the PNS convection zone.

This, however, leaves the question why the flow within the PNS convection zone would organise itself to generate a dipolar lepton fraction asymmetry. Some papers have, however, formulated first qualitative arguments to suggest that there is an internal mechanism for a dipole asymmetry in the lepton fraction, and that LESA may just be a very peculiar manifestation of buoyancy-driven convection. What appears to play a role is that the lepton fraction gradient becomes stabilizing against convection in the middle of the PNS convection zone (Janka et al. 2016; Powell and Müller 2019). Janka et al. (2016) suggested that this can give rise to a positive feedback loop because a hemispheric lepton asymmetry will attenuate or enhance the stabilizing effect in the different hemispheres, thereby leading to more vigorous convection in one hemisphere, which in turn maintains the lepton asymmetry. On a different note, Glas et al. (2019) sought to explain the large-scale nature of the asymmetry by applying the concept of a critical Rayleigh number for thermally-driven convection (Chandrasekhar 1961). However, one still needs to account for the fact that the typical scales of the velocity and lepton number perturbations appear remarkably different in the PNS convection zone. This was confirmed by the quantitative analysis of Powell and Müller (2019), who found a very broad turbulent velocity spectrum peaking around \(\ell =20\), which conforms neither to Kolmogorov or Bolgiano–Obukhov scaling for stratified turbulence. Powell and Müller (2019) suggested that this could be explained by the scale-dependent effective buoyancy experienced by eddies of different sizes as they move across the partially stabilized central region of the PNS convection zone. Powell and Müller (2019) also remarked that the non-linear state of PNS convection is characterized by a balance between the convective and diffusive lepton number flux. All these aspects suggest that the LESA could be no more than a manifestation of PNS convection, but that PNS convection is in fact quite dissimilar from the high-Péclet number convection as familiar from the gain region or the late convective burning stages. A satisfactory explanation of the phenomenon likely needs to go beyond concepts from linear stability theory and the usual global balance arguments behind MLT, and will have to take into account scale-dependent forcing and dissipation, and the “double-radiative” nature of the instability.

Since the stabilising lepton number gradient in the middle of the PNS convection zone figures prominently in these attempts to understand LESA, one might justifiably ask whether there is some role for double-diffusive instabilities in the PNS after all. Local stability analysis (Bruenn et al. 1995; Bruenn and Dineva 1996; Bruenn et al. 2004) in fact suggests that double-diffusive instabilities (termed lepto-entropy fingers and lepto-entropy semiconvection) should occur in the PNS. But why were such double-diffusive instabilities never identified in multi-D simulations so far? Further careful analysis and interpretation of the simulation results and theory is in order to clarify this. One possible interpretation could be that the characteristic step-like lepton number profile established by LESA (Powell and Müller 2019) is actually a manifestation of layer formation in the subcritical regime as familiar from semiconvection (Proctor 1981; Spruit 2013; Garaud 2018). However, the slow, global turnover motions in LESA do not readily fit into this picture.Footnote 13 One should also beware premature conclusions because PNS convection is an inherently difficult regime for numerical simulations due to small convective Mach numbers of order \(\mathord {\sim }\,0.01\) and the importance of diffusive effects. The potential issues go beyond the question of resolution and unphysically high numrical Reynolds numbers (cf. 2.1.2), and there are concrete reasons to investigate these in more depth. For example, although different codes agree qualitatively concerning the region of instability and the qualitative features of the convective flow, substantial differences in the turbulent kinetic energy density have been reported in a comparison between the Alcar and Vertex codes in the PNS convection zone, even though the agreement between the codes is otherwise excellent (Just et al. 2018). While it is unlikely that the uncertainties in models PNS convection have any impact on the problem of shock revival, they need to be addressed to obtain a better understanding of LESA and reliable predictions of gravitational wave signals and the nucleosynthesis conditions in the neutrino-heated ejecta.

The explosion phase

Regardless of whether the explosion is driven by neutrinos or magnetic fields, there is no abrupt transition to a quasi-spherical outflow after shock revival. In this section, we shall focus on the situation in neutrino-driven explosions, which has already been quite thoroughly explored.

The early explosion phase

In typical neutrino-driven models, the multi-dimensional flow structure in the early explosion phase appears qualitatively similar to the pre-explosion phase at first glance. Buoyancy-driven outflows and accretion downflows persist for hundreds of milliseconds to seconds and allow for simultaneous mass accretion and ejection. Because of the ongoing accretion, high neutrino luminosities and hence high heating rates can be maintained to continually dump energy into the developing explosion. As the shock radius slowly increases, large-scale \(\ell =1\) and \(\ell =2\) modes start to dominate the flow irrespective of whether medium-scale convection or large-scale SASI modes dominated prior to shock revival, even though 2D explosion models probably tended to exaggerate this effect. The basic features of this pictures have held since the 1990s (Herant et al. 1992; Shimizu et al. 1993; Yamada et al. 1993; Janka et al. 1993; Herant et al. 1994; Burrows et al. 1995; Janka and Müller 1995, 1996), and have proved critical for explaining the energetics of core-collapse supernovae.Footnote 14 Even in electron-capture supernova progenitors, which explode even without the help of multi-dimensional effects (Kitaura et al. 2006), there is a brief phase of convective overturn after shock revival (Wanajo et al. 2011).

More recent 3D explosion models using multi-group transport (Takiwaki et al. 2014; Melson et al. 2015a; Lentz et al. 2015; Müller 2015; Müller et al. 2017a, 2019; Burrows et al. 2020) have confirmed this picture, but paved the way towards a more quantitative theory of the explosion phase. In massive progenitors, shock expansion is usually sufficiently slow for one or two dominant bubbles of neutrino-heated ejecta to form (Fig. 14). Only at the low-mass end of the progenitor spectrum (Melson et al. 2015b; Gessner and Janka 2018) do the convective structures freeze out so quickly that the neutrino-heated ejecta are organized in medium-scale bubbles instead of a unipolar or bipolar structure.

The detailed dynamics of the outflows and downflows proved to be significantly different in 3D compared to 2D, and that only restricted insights on explosion and remnant properties and nucleosynthesis can be gained from the impressive corpus of successful 2D simulations with multi-group transport (Buras et al. 2006a; Marek and Janka 2009; Müller et al. 2012a, b, 2013; Janka 2012; Janka et al. 2012; Suwa et al. 2010, 2013; Bruenn et al. 2013, 2016; Nakamura et al. 2015; Burrows et al. 2018; Pan et al. 2018; O’Connor and Couch 2018b). Except at the lowest masses, the 2D simulations are uniformly characterized by almost unabated accretion through fast downflows that reach directly to the bottom of the gain region, by outflows that are often weak and intermittent, and by a halting rise of explosion energies. Long-time 2D simulations showed that this situation can persist out to more than \(10 \, \mathrm {s}\) (Müller 2015), and as a result, implausibly high neutron star masses are reached. The halting growth of explosion energies in 2D can partly be explained by the topology of the flow which lends itself to outflow constriction by equatorial downflows, but the primary difference between 2D and 3D lies in the velocity of the downflows. Melson et al. (2015b) already noticed that the downflows appear to subside more quickly in their 3D model of a low-mass progenitor, which they ascribed to the forward turbulent cascade in 3D; this led to a slight enhancement of the explosion energy by 10% in 3D compared to 2D. In more massive progenitors with stronger accretion after shock revival stronger braking of the downflows in 3D compared to 2D is even more evident (Müller 2015). Instead of crashing into a secondary accretion shock at \(\mathord {\sim } 100 \, \mathrm {km}\) at a sizable fraction of the free-fall velocity, the downflows are gently decelerated, and secondary shocks rarely form. Müller (2015) ascribed this pathology of the 2D models to the behavior of the Kelvin–Helmholtz instability between the outflows and downflows, which is stabilized at high Mach numbers in 2D, but can always grow in 3D (Gerwin 1968). Since the typical Mach number of the downflows is higher during the explosion phase, the assumption of 2D symmetry becomes even more problematic than during the accretion phase.

Fig. 14
figure14

Volume renderings of the entropy in different 3D supernova simulations showing the emergence of stable large-scale plumes around and after shock revival as a common phenomenon despite differences in resolution and in the neutrino transport treatment. The outer translucent surface is the shock, the structures inside are neutrino-heated high-entropy bubbles: a\(15\,M_\odot \) model of Lentz et al. (2015) with a unipolar explosion geometry at a post-bounce time of \(400\, \mathrm {ms}\). b\(3\,M_\odot \) He star model of Müller et al. (2019) at \(1238 \, \mathrm {ms}\), with two prominent plumes in the 7 o’clock and 11 o’clock directions and weaker shock expansion on the opposite side. c\(20\,M_\odot \) model of Burrows et al. (2020, Fig. 8) with a more dipolar explosion geometry at \(651 \, \mathrm {ms}\). d\(11.2\,M_\odot \) model of Nakamura et al. (2019, Fig. 6) at a time of \(991 \, \mathrm {ms}\). Images reproduced with permission from [a] Lentz et al. (2015), copyright by AAS; [c] from Burrows et al. (2020) and [d] from Nakamura et al. (2019), copyright by the authors

Explosion energetics

Estimators for the explosion energy Strictly speaking, the final demarcation between ejected and accreted materialFootnote 15 cannot be determined before the explosion becomes kinetically dominated after shock breakout, and the same holds true for the final explosion energy \(E_\mathrm {exp}\). It is, however, customary and useful to consider the diagnostic explosion energy\(E_\mathrm {diag}\) (often shortened to “diagnostic energy” or “explosion energy” when there is no ambiguity), which is defined as the total energy of the material that is nominally unbound at any given instance (Buras et al. 2006a; Müller et al. 2012b; Bruenn et al. 2013). By definition the diagnostic energy will eventually asymptote to \(E_\mathrm {exp}\), but might do so only over considerably longer time scales than can be simulated with neutrino transport. In particular, the diagnostic energy can in principle decrease as the shock sweeps up bound material from the outer shells. To account for this, one can correct \(E_\mathrm {diag}\) for the binding energy of the material ahead of the shock (“overburden”) to obtain a more conservative estimate for \(E_\mathrm {exp}\) (Bruenn et al. 2016). In practice, \(E_\mathrm {diag}\) usually rises monotonically because energy continues to be pumped into the ejecta over seconds, but there are exceptions, most notably in cases of early black hole formation (Chan et al. 2018). In most cases, one expects that \(E_\mathrm {diag}\) levels out after a few seconds and then provides a good estimate for \(E_\mathrm {exp}\).

Explosion energies from self-consistent simulations Unfortunately, \(E_\mathrm {diag}\) has not levelled off in most of the available self-consistent 3D explosion models, though the growth of the explosion energy has already slowed down significantly in some long-time simulations using the CoCoNuT-FMT code (Müller et al. 2017a, 2018, 2019). Even in 2D, only some of the models of the Oakridge group appear to have approached their final explosion energy (Bruenn et al. 2016).

This means that no final verdict on the fidelity of the simulations can be pronounced based on a comparison with observationally inferred explosion energies. The models of Müller et al. (2017a, 2018, 2019) and Bruenn et al. (2016), whose explosion energies are admittedly on the high side among modern simulations, have demonstrated that neutrino-driven explosions can reach energies of up to \(8 \times 10^{50}\, \mathrm {erg}\). Similarly, plausible nickel masses of several \(0.01\,M_\odot \) appear within reach, although no firm statements can be made for CoCoNuT-FMT models due to uncertainties in the \(Y_\mathrm {e}\) of the ejecta from the approximative transport treatment, and due to the highly simplified treatment of nucleon recombination. Explosion energies beyond \(10^{51}\, \mathrm {erg}\) may simply be a matter of longer simulations, different progenitor models, and slightly improved physics; and there may be no conflict with the distribution of observationally inferred explosion energies (Kasen and Woosley 2009; Pejcha and Prieto 2015; Müller et al. 2017b) of Type IIP supernovae. First attempts to extrapolate the non-converged explosion energies from simulations and compare them to observations using a rigorous statistical framework (Murphy et al. 2019) indicate that the predicted values are still somewhat too low, but Murphy et al. (2019) also point out that conclusions are premature due to biases and uncertainties in the comparison.

Growth of the explosion energy Even at this stage, the simulations already elucidate how the energy of neutrino-driven explosions is determined. Upon closer inspection, the energy budget of the ejecta is quite complicated and includes contributions from the injection of neutrino-heated material from below, form nucleon recombination and nuclear burning, from the accumulation of bound material by the shock, and from turbulent mixing with the downflows (for a broader discussion, see Marek and Janka 2009; Müller 2015; Bruenn et al. 2016). Nonetheless, a few key findings have emerged. The most critical determinant for the growth of \(E_\mathrm {diag}\) is the mass outflow rate \(\dot{M}_\mathrm {out}\) of neutrino-heated material. Neutrino heating only marginally unbinds the material, and the net contribution to \(E_\mathrm {diag}\) comes from the energy \(\epsilon _\mathrm {rec}\) released by nucleon recombination, which occurs at a radius of about \(300 \, \mathrm {km}\). To first order, the resulting growth rate of the diagnostics energy is (Scheck et al. 2006; Melson et al. 2015b; Müller 2015),

$$\begin{aligned} \dot{E}_\mathrm {diag}\approx \dot{M}_\mathrm {out} \epsilon _\mathrm {rec}. \end{aligned}$$
(65)

In principle, \(8.8\, \mathrm {MeV}\) per nucleon can be released from full recombination to the iron group, but for the relevant entropies and expansion time scales, recombination is incomplete and does not convert all the neutrino-heated ejecta to iron-group elements. Mixing between the outflows and downflows reduces the effective value of \(\varepsilon _\mathrm {rec}\) further to about 5–\(6\, \mathrm {MeV}/\mathrm {nucleon}\).

The mass outflow rate is roughly determined by the volume-integrated neutrino heating rate and the energy required to lift the material out of the gravitational potential. One can argue that the relevant energy scale is the binding energy at the gain radius \(|e_\mathrm {gain}|\), so that

$$\begin{aligned} \dot{M}_\mathrm {out} = \eta _\mathrm {out} \frac{\dot{Q}_\nu }{|e_\mathrm {gain}|}. \end{aligned}$$
(66)

with some efficiency parameter \(\eta _\mathrm {out}\) that accounts for the fact that only part of the neutrino-heated matter is ejected. Initially, one finds \(\eta _\mathrm {out}<1\) as expected, but Müller et al. (2017a) pointed out that the situation changes later after shock revival because much of the ejected material never makes it down to the gain radius and starts is expansion significantly further out with lower initial binding energy. This leads to efficiency parameters \(\eta _\mathrm {out}>1\), and helps compensate for the declining heating rates as the supply of fresh material to the gain radius slowly subsides.

The situation in 2D models is somewhat different (Müller 2015). Here, the ejected material comes from close to the gain radius, and the outflow efficiency \(\eta _\mathrm {out}\) is lower than in 3D. Although the lack of turbulent mixing results in a higher asymptotic specific total energy and entropy in 2D, the net effect is a slower growth of the explosion energy than in 3D. Moreover, the higher entropies in 2D will result in reduced recombination to the iron group and hence lower nickel masses. Despite these shortcomings, 2D simulations remain of some use because they already allow extensive parameters studies of explodability and explosion and remnant properties (Nakamura et al. 2015).

Compact remnant properties

Accretion rates and remnant masses The forward cascade and the stronger Kelvin–Helmholtz instabilities between the outflows and downflows in 3D imply that the accretion rate onto the PNS drops more quickly than in 2D (Müller 2015). As a result, some self-consistent 3D simulations have been able to determine firm numbers for final neutron star masses (Melson et al. 2015b; Müller et al. 2019; Burrows et al. 2019, 2020), barring the possibility of late-time fallback. The predicted neutron star masses appear roughly compatible with the range of observed values (Özel and Freire 2016; Antoniadis et al. 2016), but as with all other explosion and remnant properties, a robust statistical comparison is not yet possible.

Neutron star kicks Observations show that most neutron stars receive a considerable “kick” velocity at birth. The kick velocity is typically a few hundred \(\mathrm {km}\, \mathrm {s}^{-1}\), but there is a broad distribution ranging from very low kicks up to more than \(1000 \, \mathrm {km}\, \mathrm {s}^{-1}\) (e.g., Hobbs et al. 2005; Faucher-Giguère and Kaspi 2006; Ng and Romani 2007). The large-scale ejecta asymmetries that emerge during the explosion provide a possible explanation for this phenomenon (for an overview including other mechanisms such as aniostropic neutrino emission, see Lai et al. 2001; Janka 2017).

The 2D simulations of the 1990s could not yet naturally obtain the full range of observed kick velocities by a hydrodynamic mechanism (Janka and Müller 1994), unless unrealistically large seed asymmetries in the progenitor were invoked (Burrows and Hayes 1996). A plausible range of kicks was first obtained in parameterized 2D simulations by Scheck et al. (2004, 2006), thanks to more slowly developing explosions that allowed the \(\ell =1\)-mode of the SASI, or an \(\ell =1\) convective mode to emerge. The work of Scheck et al. (2004, 2006) revealed that the kick velocity can grow for well over a second in models with slower shock expansion. They concluded that the kick arises primarily from the asymmetric gravitational pull of over- and underdense regions in the ejecta (later termed “gravitational tugboat mechanism”Footnote 16) rather than from pressure force and hydrodyanmic momentum fluxes onto the PNS; anisotropic neutrino emission was found to play only a minor role. Subsequent simulations have not fundamentally changed this analysis. Although various studies showed that the momentum flux onto the PNS can be comparable to the gravitational force onto the PNS (Nordhaus et al. 2010a, 2012; Müller et al. 2017a), this does not invalidate the tugboat mechanism. Effectively, the contribution of each parcel of accreted material to the PNS momentum via the hydrodynamic flux and the gravitational tug almost cancel, and the net acceleration of the PNS is due to the gravitational pull of the material that is actually ejected.

Three-dimensional simulations have not altered this picture substantially. Even though 2D simulations tend to obtain higher kicks, values of several hundred \(\mathrm {km} \, \mathrm {s}^{-1}\) were already obtained in the parameterized 3D simulations of Wongwathanarat et al. (2010b, 2013). Recently, Müller et al. (2017a, 2019) performed sufficiently long 3D simulations with multi-group neutrino transport to extrapolate the final kick velocities, which fall nicely within the observed range of up to \(1000 \, \mathrm {km}\, \mathrm {s}^{-1}\).

Based on the physics of the kick mechanism, various authors have posited a correlation between the kick and the ejecta mass (Bray and Eldridge 2016) or, using more refined arguments, on the explosion energy (Janka 2017; Vigna-Gómez et al. 2018). Tentative support for a loose correlation comes from the small kicks obtained in simulations of low energy, ultra-stripped supernovae (Suwa et al. 2015; Müller et al. 2017a) and electron-capture supernovae (Gessner and Janka 2018), and from more recent 3D simulations over a larger range of progenitor masses (Müller et al. 2019).

Neutron star spins If the downflows hit the PNS surface with a finite impact parameter, they also impart angular momentum onto the PNS. While this was realized already by Spruit and Phinney (1998), 3D simulations are needed for quantitative predictions of PNS spin-up by asymmetric accretion. The predicted spin-up in 3D models of non-rotating varies. Parameterized simulations (Wongwathanarat et al. 2010b; Rantsiou et al. 2011; Wongwathanarat et al. 2013) tend to find longer neutron star spin periods of hundreds of milliseconds to seconds (but extending down to \(100 \, \mathrm {ms}\) in Wongwathanarat et al. 2013). Recent 3D simulations using multi-group transport (Müller et al. 2017a, 2019) obtain spin periods between \(20\, \mathrm {ms}\) and \(2.7\, \mathrm {s}\), which roughly coincides with the range of observationally inferred birth periods (Faucher-Giguère and Kaspi 2006; Perna et al. 2008; Popov and Turolla 2012; Noutsos et al. 2013). Assuming the core angular momentum is conserved after the collapse to a PNS and not changed by angular momentum transport during the explosion, current stellar evolution models computed with the Tayler–Spruit dynamo, predict spin periods in the same range (Heger et al. 2005), which makes it difficult to draw inferences on the explosion mechanism or the progenitor rotation from the observed spin periods. None of the simulations can as yet explain the spin-kick alignment that is suggested by observations (Johnston et al. 2005; Ng and Romani 2007; Noutsos et al. 2013). Proposed mechanisms for natural spin-kick alignment by purely hydrodyanmic processes (Spruit and Phinney 1998; Janka 2017) have not been borne out by the models. However, the possibility of natural spin-kick alignment in rotating progenitors has yet to be investigated.

Role of the spiral SASI mode In the first idealized simulations of the spiral mode of the SASI, Blondin and Mezzacappa (2007) noted a significant flux of angular momentum into the “neutron star” (modeled by an inner boundary condition) that would lead to rapid neutron star rotation in the opposite direction to the SASI flow with angular frequencies of the order of \(100 \, \mathrm {rad}\, \mathrm {s}^{-1}\) even in the case of non-rotating progenitors. This idea has been explored further in several numerical (Hanke et al. 2013; Kazeroni et al. 2016, 2017) and analytical (Guilet and Fernández 2014) studies. The potential for spin-up of non-rotating progenitors may be more modest than initially thought; both numerical and analytical results suggest that the angular momentum imparted onto the PNS is only a few \(10^{46}\, \mathrm {erg}\, \mathrm {s}\), corresponding to spin periods of hundreds of milliseconds (Hanke et al. 2013; Guilet and Fernández 2014). Moreover, part of the angular momentum contained in the spiral mode may be accreted after shock revival, negating the separation of angular momentum previously achieved by the SASI. Spin-up and spin-down by SASI in the case of rotating progenitors still merits further investigation; the idealized simulations of Kazeroni et al. (2017) suggest different regimes of random spin-up and spin-down for slow progenitor rotation, systematic spin-down for intermediate rotation, and weaker spin-down for high rotation rates in the regime of the corotation instability. The possibility of magnetic field amplification due to the induced shear in the PNS surface region in the case of significant spin-up or spin-down by the SASI also needs to be explored.

Fig. 15
figure15

a Emergence of Rayleigh–Taylor instability during the propagation of the shock through the envelope, illustrated by spherically averaged profiles of density \(\rho \), pressure P, and radial velocity \(v_r\) from a 2D long-time simulation of a \(9.6\,M_\odot \) star based on the explosion model of Müller et al. (2013). At this stage (\(140 \,\mathrm {s}\) after the onset of the explosion), the shock has reached the He shell, and a reverse shock has formed. Behind the forward shock, the pressure gradient is positive and decelerates the expansion of the ejecta. Rayleigh–Taylor (RT) instability grows in a region with \(\mathrm {d}\rho /\mathrm {d}r<0\) behind the shock. Note that the structure of the blast wave can be more complicated in general with several unstable regions and reverse shocks that interact with each other. b Mass fraction isocontours in a 3D model of mixing in SN 1987A. Note that while the biggest Ni-rich Rayleigh–Taylor clumps are seeded by large-scale asymmetries from the early explosion phase, these develop into the finger-like structures characteristic of the Rayleigh–Taylor instability, and there is also considerable growth of small-scale plumes. Image reproduced with permission from Wongwathanarat et al. (2015), copyright by ESO

Mixing instabilities in the envelope

Structure of the flow in the later explosion phase As the propagating shock scoops up matter and as the explosion energy levels off, the structure of the post-shock region changes (Fig. 15a). Early on, the post-shock expansion velocities are subsonic and the outflows are accelerated by a positive pressure gradient, but eventually the post-shock flow enters a Sedov-like regime where a positive pressure gradient is established and matter is decelerated behind the shock (Chevalier 1976). Generally, the shock velocity \(v_\mathrm {sh}\) also decreasesFootnote 17 as the mass \(M_\mathrm {ej}\) of the shock ejecta grows; it roughly scales as \(v_\mathrm {sh}\propto (E_\mathrm {exp}/M_\mathrm {ej})^{1/2}\). The shock can, however, transiently accelerate when it encounters density gradients steeper than \(\rho \propto r^{-3}\) at shell interfaces (Sedov 1959). Both effects can be captured by the formula of Matzner and McKee (1999),

$$\begin{aligned} v_\mathrm {sh} \approx 0.794 \left( \frac{E_\mathrm {exp}}{M_\mathrm {ej}}\right) ^{1/2} \left( \frac{M_\mathrm {ej}}{\rho r^3}\right) ^{0.19}. \end{aligned}$$
(67)

The post-shock pressure and density profiles adjust to variations in shock velocity to establish something of a “quasi-hydrostatic” stratification behind the shock with an effective gravity that is directed outward. However, once the post-shock velocities become supersonic, the post-shock pressure profile can no longer globablly adjust to changing shock velocities, and reverse shocks are formed. A first reverse shock forms typically forms at a few \(1000\, \mathrm {km}\) as the developing neutrino-heated wind crashes into more slowly moving ejecta. Later on, further reverse shocks emerge after the shock encounters various shell interfaces. Their strength depends on the density jump at the shell interface. In hydrogen-rich progenitors, the reverse shock from the H/He interface is particularly strong (especially in red supergiant) and therefore sometimes referred to simply as the reverse shock.

Rayleigh–Taylor instability The non-monotonic variations in \(v_\mathrm {sh}\) result in mon-monotonic post-shock entropy and density profiles, and some layers become Rayleigh–Taylor unstable (Chevalier 1976; Müller et al. 1991; Fryxell et al. 1991). In the relevant highly compressible regime, the growth rate for the Rayleigh–Taylor instability from a local stability analysis is given by Bandiera (1984), Benz and Thielemann (1990) and Müller et al. (1991)

$$\begin{aligned} \omega _\mathrm {RT}= \frac{c_\mathrm {s}}{\varGamma } \sqrt{\frac{\partial \ln P}{\partial r} \left( \frac{\partial \ln P}{\partial r}- \varGamma \frac{\partial \ln \rho }{\partial r} \right) } = \sqrt{ g_\mathrm {eff} \left( \frac{1}{\varGamma }\frac{\partial \ln P}{\partial r}- \frac{\partial \ln \rho }{\partial r} \right) }, \end{aligned}$$
(68)

where \(g_\mathrm {eff}=\rho ^{-1} \partial P/\partial r\) is the effective gravity. The second form elucidates that stability is determined by the sub- or superadiabaticity of the density gradient as for buoyancy-driven convection. In the relevant radiation-dominated regime, composition gradients have a minor impact on stability, and the entropy gradient is the deciding factor. However, since the effective gravity points outwards, positive entropy gradients are now unstable. Such positive entropy gradients arise when the shock accelerates at shell interfaces. One should note that the actual growth rate of perturbations depends on their length scale (Zhou 2017), and Eq. (68) roughly applies to the fastest growing modes with a wavelength comparable to the width of the unstable region. Since the unstable regions tend to be narrow, Rayleigh–Taylor mixing tends to produce smaller, clumpy structures, but large-scale asymmetries can also grow considerably for sufficiently strong seed perturbations.

Intriguingly, it has been suggested that the overall effect of Rayleigh–Taylor mixing can roughly be captured in 1D by an appropriate turbulence model (Duffell 2016; Paxton et al. 2018). The key idea here is to incorporate the proper growth rate, saturation behavior, and a velocity-dependent mixing length (Duffell 2016). While a first comparison of this model with 3D results from Wongwathanarat et al. (2015) proved encouraging (Paxton et al. 2018), a few caveats remain. A detailed analysis of Rayleigh–Taylor mixing in a 3D model of a stripped-envelope progenitor by Müller et al. (2017a) unearthed some basis for a phenomenological 1D description of Rayleigh–Taylor mixing (most notably buoyancy-drag balance in the non-linear stage) and suggested some improvements to the model of Paxton et al. (2018), but cast doubt on the use of local gradient to estimate the density and composition contrasts of the Rayleigh–Taylor plumes. In particular, the Rayleigh–Taylor instability sometimes produces partial inversions of the initial composition profiles, which cannot be modeled by diffusive mixing in 1D.

In addition to the Rayleigh–Taylor instability, the Richtmyer–Meshkov instability (see Richtmyer 1960; Zhou 2017, for details of the instability mechanism) can develop because the shock is generally aspherical and hits the shell interfaces obliquely. The literature on mixing instabilities in supernovae is extensive, and we can only provide a very condensed summary of extant numerical studies. We will exclusively focus on the optically thick phase of the explosion and not consider the remnant phase.

Simulations of mixing in SN 1987A After a few earlier numerical experiments, significant interest in mixing instabilities was prompted by observations of SN 1987A that pointed to strong early mixing of nickel (see Arnett et al. 1989b; McCray 1993, and references therein). Two-dimensional simulations of mixing instabilities (Arnett et al. 1989a; Müller et al. 1991; Fryxell et al. 1991; Hachisu et al. 1990; Benz and Thielemann 1990; Herant and Benz 1992) in the wake of SN 1987A took a first step towards explaining the observed mixing. The typical picture revealed by these models is that of a strong Rayleigh–Taylor instability at the H/He-interface with linear growth factors of thousands (Müller et al. 1991), Some models (Müller et al. 1991; Fryxell et al. 1991) also showed a second strongly unstable region at the He/C-interface that eventually merges with the mixed region further outside. Mixing was dominated by many small-scale plumes in these first-generation simulations. However, the maximum velocities of nickel plumes still fell short by about a factor of two compared to the observed velocities of up to \(\mathord {\sim } \,4000\, \mathrm {km} \, \mathrm {s}^{-1}\).

Many subsequent studies have investigated stronger, large-scale initial seed perturbations as a possible explanations for the strong mixing in SN 1987A and other observed Type II supernovae. Such seed perturbations are naturally expected in the neutrino-driven paradigm from the SASI and low-\(\ell \) convective modes, and in magnetorotational explosions with veritable jets. Most simulations have explored the effect of large-scale seed perturbations by specifying them ad hoc (e.g., Nagataki et al. 1998; Nagataki 2000; Hungerford et al. 2003; Couch et al. 2009; Ono et al. 2013; Mao et al. 2015; Ellinger et al. 2012). One must therefore be cautious in drawing conclusions on the role of “jets” in explaining the observed mixing. In fact, simulations with artificially injected jets rather serve to rule out kinetically-dominated jets in Type IIP supernovae based on early spectropolarimetry, though thermally-dominated jets are not excluded in principle (Couch et al. 2009).

In their 2D AMR simulations, Kifonidis et al. (2000, 2003, 2006) followed a more consistent approach by starting from light-bulb simulations of neutrino-driven explosions. While the seed asymmetries from the early explosion phase initially led to high nickel clump velocities in the Type IIP model of Kifonidis et al. (