1 Introduction

As atmospheric temperatures have risen in recent decades (e.g., Trenberth et al. 2007), global lake surface temperatures have also increased substantially (O'Reilly et al. 2015;Arhonditsis et al. 2004; Coats et al. 2006; Woolway et al. 2020). In fact, many lakes, including the world’s largest lakes, are warming faster than local air temperatures (Austin and Colman 2007; Hampton et al. 2008; Tierney et al. 2010), suggesting that complex hydrodynamic feedbacks may be accelerating warming. For example, enhanced surface warming rates have previously been linked to a loss of historic ice cover (e.g., Austin and Colman 2007), which leads to an earlier onset of summer stratification and an increase in the total number of warming days each year. The link between increased surface temperatures and loss of ice cover is especially prevalent in the Laurentian Great Lakes, where rapidly warming surface temperatures (Austin and Colman 2008; Bartolai et al. 2015; Zhong et al. 2016; White et al. 2018) have coincided with dramatic declines in ice coverage (> 70% since 1973: Wang et al. 2012; Magnuson et al. 2000). These changes have significant consequences for both surface dynamics (e.g., atmospheric heat and gas exchange: Matsumoto et al. 2015) and ecosystem functions (e.g., primary production: Scavia and Fahnenstiel 1987), and they may be indicative of even more substantial changes below the surface.

Although there is considerable evidence for lake surface warming and ice cover loss in response to climate change, changes in subsurface thermal structure and the timing of stratification remain largely unexplored. This represents an important gap in the literature, since deep, subsurface waters are capable of integrating climatic conditions across years and provide a more robust indicator of long-term climate change impacts (Verburg et al. 2003). Subsurface thermal structure also directly influences many physical and biogeochemical functions in lakes, including vertical mixing (Cannon et al. 2021a; McKinney et al. 2018, 2012; Ralph 2002), benthic heat and gas exchange (Rao et al. 2008), coastal circulation (Scavia & Bennett 1980; Boyce et al. 1989), water quality (Holland et al. 2003), primary production (Frenette et al. 1994), and nutrient cycling (Goldman et al. 1996). As such, understanding how subsurface thermal structure is affected by climate-change induced surface warming is imperative for predicting and responding to changes in overall ecosystem function.

Studying the effects of climate change on the subsurface thermal structure of large lakes is especially difficult due to the lack of high quality, long term observational data. Vertical profiles of water temperature are typically collected using either profiling instruments (e.g., Cannon et al. 2021a), which are logistically expensive and temporally sporadic, or stationary moorings (e.g., Anderson et al. 2021, Cannon and Troy 2018), which require significant resources for consistent deployment, but are capable of providing temporally resolute, long-term measurements at single location. While mooring data provide the best opportunity for directly observing the effects of climate warming on changing lake temperatures (e.g.,Anderson et al. 2021; Ficker et al. 2017), they lack the spatial resolution required to describe processes across large lakes, which have a high degree of spatial heterogeneity. Some studies of large lakes instead rely on remotely sensed lake surface water temperatures to make inferences regarding subsurface dynamics. For instance, Fichot et al. (2019) used satellite measurements to study overturning behavior in the Laurentian Great Lakes, assuming that the water column was fully mixed when surface temperatures were ~ 4 \(^\circ{\rm C}\) (i.e., the temperature of maximum density for freshwater) and either stratified (summer) or inversely stratified (winter) otherwise. However, making assumptions regarding subsurface stratification and mixing is potentially problematic, especially for intermediate depths where the water column may be isothermal (and completely mixed) for a wide range of surface temperatures both above and below the temperature of maximum density (e.g., Cannon et al. 2019).

Hydrodynamic models can be used to supplement sparse subsurface observations and extend analyses below the surface layer (i.e., remote sensing) and beyond stationary mooring locations. Hydrodynamic modelling in large lakes has improved significantly over the last several decades, allowing researchers to accurately model lake-wide hydrodynamics (e.g., surface temperatures, currents, waves, and thermal structure) and biogeochemistry (e.g., dissolved oxygen, nutrient concentrations, and phytoplankton). This is especially true in the Laurentian Great Lakes, where scientists have made significant progress developing high-resolution operational and research models in recent years (e.g., Xue et al. 2022; Li et al. 2021; Fujisaki-Manome et al. 2020; Anderson et al. 2018; Bai et al. 2013; Fujisaki et al. 2013). Forcing three-dimensional hydrodynamic models with historic atmospheric forcing data (observations or reanalysis products) allows researchers to investigate temporal changes in the lacustrine environment over years, or even decades, regardless of the availability of continuous in-situ observations. Assuming that model outputs (e.g., surface temperatures, ice cover, thermal structure and timing, mixing rates) can be sufficiently validated using appropriate observational datasets, historic simulations can be a powerful tool for identifying trends in surface and subsurface hydrodynamics related to ongoing climate warming.

A detailed characterization of surface and subsurface changes in lake thermodynamics is imperative for understanding and responding to environmental changes in large lakes. Analysis is perhaps most urgent in the Laurentian Great Lakes, where even minor changes in the temperate climate are expected to have dramatic consequences for lake ecosystem functions (e.g., Austin and Colman 2007). The goal of this study is to use historical climate data to drive a three-dimensional hydrodynamic model, with simulations used to investigate changes in temperature structure and ice dynamics in Lake Michigan-Huron. Forcing data and simulation results will be validated using point-wise observations collected across the region, providing confidence in trends derived from model output. We hypothesize that hindcast simulations will show distinct warming trends, with significant increases in surface and subsurface temperatures, decreases in ice cover and ice thickness, and changes in seasonal stratification cycles. The novelty of this work lies in its ability to describe subsurface changes over large spatial scales, which is impossible using traditional observational techniques. The model described in this manuscript is a part of the Great Lakes Earth System Model (GLESM) project, with applications in ecosystem modelling, resource management, and climate change predictions in the Laurentian Great Lakes and beyond (i.e., other large, mid-latitude lakes).

2 Methods

2.1 Model Setup

Lake physics and thermodynamics were modelled using the spherical coordinate Great Lakes Finite Volume Community Ocean Model (GL-FVCOM), originally implemented by Bai et al. (2013). GL-FVCOM is an unstructured-grid finite volume lake model based on FVCOM, which was initially developed, calibrated, and subsequently modified by Chen et al. (2003, 2006, 2013). FVCOM is designed to solve the primitive (i.e., hydrostatic) equations numerically using a split-mode method; the vertically integrated transport equations are solved in an external mode using a short timestep and the three-dimensional (3-D) governing equations are solved internally using a longer time step. The 3-D computational domain is composed of unstructured triangular elements (horizontal) coupled with a vertical terrain-following \(\sigma\)-coordinate scheme, and the Coriolis force is allowed to vary spatially with latitude. Horizontal turbulent mixing is simulated using the Smagorinsky parameterization (Smagorinsky 1963) and vertical diffusion is handled using the modified Mellor and Yamada level 2.5b turbulent closure scheme (Mellor and Yamada 1982; Mellor 2001; Mellor and Blumberg 2004) with air–water drag coefficients are calculated as a function of wind speed (Large and Pond 1981).

GL-FVCOM includes several modifications to the original FVCOM script (version 3.1.6) which are intended to improve stability and performance in the Laurentian Great Lakes. Ice dynamics are modelled using a coupled elastic-viscous-plastic rheology ice model based on the Los Alamos Sea Ice Model (CICE; Hunke & Dukowicz 1997). The model produces two-dimensional fields of ice cover concentration, velocity, and thickness. Ice thickness distributions are used to resolve mechanical deformation, grow, and decay of lake ice, and ice surface albedo is calculated as a function of surface temperature, ice thickness, and incoming solar radiation (visible and infrared). Parameterized ice floe diameters and melting coefficients (i.e., basal and lateral) were calibrated prior to model runs. Wind-wave mixing is also improved using the surface boundary condition scheme developed by Hu and Wang (2010). This scheme produces more realistic thermal structure and mixed layer dynamics in the Laurentian Great Lakes, as described in detail by Bai et al. (2013), who also performed initial mixing coefficient calibrations for Lake Michigan. Finally, a leapfrog (centered differencing) scheme (Bai et al. 2020) was used to replace the default time integration schemes (internal: Euler forward; external: Euler forward Runge–Kutta) in order to improve inertial stability with explicit treatment of the Coriolis terms (Wang & Ikeda 1997a,b).

For this study, the model domain includes Lake Michigan and Lake Huron, two of the five Laurentian Great Lakes (Fig. 1). Lake Michigan and Lake Huron are connected via the Straits of Mackinac and share a common mean water elevation. From a hydrological sense, the lakes are considered a single water body, (i.e., Lake Michigan-Huron) and must be modelled together for accurate hydrodynamic simulations. The lakes were modelled as a single enclosed basin (average horizontal resolution: 4.75 km; Fig. 1), and mass fluxes were ignored, including mass loss due to evaporation, mass gain due to precipitation, ground water flows, and stream flows. Lake depths were sourced from 3 arc-second (~ 90 m cell size) bathymetry available through the NOAA National Centers for Environmental Information (https://www.ngdc.noaa.gov/mgg/greatlakes/). The vertical resolution of computation grids included 21 sigma layers with increased resolution at surface and bottom boundaries to capture boundary layer processes.

Fig. 1
figure 1

Bathymetry map for Lake Michigan-Huron. Shading represents 30 m depth contours, which are indicated with black lines. Observational mooring sites are indicated with colored triangles and labeled based on their location (M: Lake Michigan; H: Lake Huron) and approximate site depth (i.e., 15, 55, 150, 200 m). Important lake features (e.g., bays, straits) are labelled for reference. The inset diagram shows an example of the unstructured hydrodynamic model grid in Saginaw Bay, Lake Huron

The GL-FVCOM model was forced using the North American Regional Reanalysis (NARR) product, which reports meteorological variables in three-hour intervals at a uniform horizontal resolution of 32 km (Mesinger et al. 2006). Although NARR has a known high bias in air temperatures (Bennington et al. 2010) and shortwave radiation (Zhao et al. 2013), it remains one of the most accurate, highest resolution forcing datasets available in the Laurentian Great Lakes region. Input forcing data was spatially interpolated to model grid cells using natural neighbor interpolation, with linear temporal interpolation used to match model timesteps. Model integration took place with an external (internal) mode time step of 10 (100) s and a minimum depth of 5 cm was used for wet/dry treatment. The model was initialized in January 1979 using a uniform temperature (T = 4 \(^\circ{\rm C}\)) and velocity (U = 0; V = 0) field. The first year of simulation is treated as a spin-up period, and all data are removed from trend analysis to avoid biases linked to unknown initial conditions. The model is runs continuously from 1979 through 2021. Three-dimensional model results were output in 6 h time increments, with data averaged over daily and annual time scales for validation and trend analysis.

2.2 Model validation

Model simulations were validated using a combination of satellite and in-situ observations. Modelled lake surface temperatures and ice cover concentrations were compared to historical remote sensing products available from the Great Lakes Surface Environmental Analysis (GLSEA), which is part of the National Oceanic and Atmospheric Administration (NOAA) CoastWatch program (https://coastwatch.glerl.noaa.gov). Lake surface temperatures (LST) were derived from NOAA Advanced Very High Resolution Radar (AVHRR) and Visible Infrared Imaging Radiometer Suite (VIIRS S-NPP and VIIRS NOAA-20) imagery, which were used to produce LST maps with spatial and temporal resolutions of ~ 1.3 km and 1 day, respectively. Ice cover data were sourced from Yang et al. (2020a, b), who reprocessed daily gridded ice cover concentrations from the Laurentian Great Lakes available through the U.S. National Ice Center (NIC; https://usicecenter.gov/Products/GreatLakesData), producing a consistent 1.8 km resolution grid. Historical lake surface temperature observations were available starting in 1995, while ice cover observations were available over the entire model simulation period (1979–2021).

Modelled thermal structure and vertical mixing rates were validated using in-situ temperature moorings and microstructure data. Although in-situ measurements are sporadic and limited in their ability to validate lake-wide hydrodynamics, they are ideal for validating the timing and structure of local stratification (e.g., thermocline structure and mixed layer depth) and mixing conditions (e.g., scalar diffusivity). Several long-term (i.e., multi-year) temperature mooring datasets were used for point-wise subsurface temperature validation, with sample sites in both Lake Michigan (150 m depth: GLERL, 2019; 55 m depth: Troy et al. 2019; 15 m depth: Ahmed et al. 2014) and Lake Huron (200 m depth: NOAA GLERL, 2019). Modelled vertical mixing profiles were validated using microstructure data collected in the hypolimnetic waters of Lake Michigan (55-m depth) between 2017 and 2019 (Cannon et al. 2021b). Observed mixing profiles covered a range of seasonal mixing conditions, including stratification-limited summer mixing, nearly isothermal spring mixing, and radiatively convective winter mixing (e.g., Yang et al. 2020a, b; Cannon et al. 2019; Austin 2019). This mixing validation is especially novel, since modelled mixing rates are rarely validated with observations due to the difficulty of collecting in-situ measurements.

Model skill in reproducing observed water surface temperatures, ice cover, and thermal structure was evaluated using typical validation metrics (Table 1). Model biases were assessed using mean absolute error (MAE), mean bias error (MBE), and fractional bias (FB) estimators, while overall model performance was assessed using the root-mean square error (RMSE) and correlation coefficient (CC). Skill metrics were estimated using daily averaged model outputs and observations, with sums computed over all available observational data. Given the limited quantity of data available for vertical mixing validation (< 15 days), model skill in reproducing vertical mixing profiles is only assessed qualitatively.

Table 1 Parameters used for model skill assessment. Variables include model values (\(x\)), observed values (\(y\)), and the total number of observations (\(N\))

2.3 Time series analysis

GL-FVCOM output was used to assess temporal changes in the thermal structure of Lake Michigan-Huron over the simulation period (1979–2021). In addition to the analysis of typical surface characteristics (i.e., lake surface temperature and ice cover), three-dimensional hindcast simulations allowed for the investigation of subsurface temperatures, ice thicknesses, and overturning dates. Subsurface temperature analysis included both lake bottom temperatures (LBT) and layer averaged temperatures (LAT). To compute LAT, the sigma-layer temperature profile of each model grid-cell was first linearly interpolated to a 5 m depth interval before calculating the volume-averaged temperature in 30 m depth layers for each lake, such that \(LAT={\int }_{{z}_{i}}^{{z}_{j}}T\left(z\right)A\left(z\right)dz/{\int }_{{z}_{i}}^{{z}_{j}}A\left(z\right)dz\), where \({z}_{i}\) and \({z}_{j}\) are the depth intervals of interest (0 to 270 m), \(T\) is the water temperature at depth \(z\), and \(A\) is the sum area of all cells at a given depth.

The fall overturn date was defined as the date of maximum bottom temperature when surface-to-bed temperature differences were less than 5 \(^\circ{\rm C}\). This definition is similar to that used by Anderson et al. (2021), with the addition of a temperature gradient threshold used to reduce erroneously early turnover date identification associated with coastal downwelling. The spring overturn date was defined as the first day when surface temperatures exceeded the temperature of maximum density (i.e., 4 \(^\circ{\rm C}\)) each year in any given grid cell (e.g., Anderson et al. 2021). It is assumed that the spring turnover was incomplete for all grid cells where the surface temperature did not fall below 4 \(^\circ{\rm C}\) over the winter. All turnover date estimates were averaged over 30 m depth contours to investigate changing thermal bar dynamics. Lake-wide averages were computed using area-weighting, with output from individual grid cells weighted by their relative area. Finally, the ice thickness was calculated as the modelled ice volume divided by the modelled ice area for each cell.

In this study, long-term trends in annual averages are used to investigate changing thermodynamics in Lake Michigan-Huron. Ice variables (i.e., ice cover and thickness) are averaged over the ice season (November 1st–May 31st), while all other parameters (i.e., surface and subsurface temperatures) are averaged over the calendar year (January 1st–December 31st). Trends were computed using robust linear fitting (i.e., iteratively reweighted least squares), which is less susceptible to the effects of outliers. For robust linear fitting, the algorithm iteratively assigns weights to individual sample points, with lower weights assigned to observations that are further from the model prediction. Iteration stops when best-fit model coefficients converge. Confidence intervals (95%) on all trends were computed using the Wald method, which assumes that errors in the regression are normally distributed, a reasonable assumption given the relatively large sample size (42 years: 1980–2021).

3 Results and discussion

3.1 Lake climatology and model validation

Modelled lake temperatures and ice conditions followed expected seasonal patterns, and lake-wide averages agreed well with the historic observations (Figs. 2 and 3; Table 2). Lake-averaged surface temperatures (Fig. 2a, c) ranged from 0 to 25 \(^\circ{\rm C}\) with annual maxima (minima) in the late summer (winter). Minimum temperatures typically occurred in the first week of March, with a spring turnover in April followed by a summer maximum in mid-August and a fall turnover between September and December. Average lake bottom temperatures (Fig. 2b, d) lagged behind surface conditions by approximately 1 month, varying between ~ 3 \(^\circ{\rm C}\) in late March and ~ 9 \(^\circ{\rm C}\) in early October. Modelled ice cover concentrations (Fig. 3a, c) and thicknesses (Fig. 3b, d) were generally higher on Lake Huron (mean ice cover maxima: 64%; mean ice thickness maxima: 23 cm) than on Lake Michigan (mean ice cover maxima: 33%; mean ice thickness maxima: 10 cm), and peak ice cover dates occurred in late February through early March. Simulated ice durations were also longer in Lake Huron (mean: 83 days) than in Lake Michigan (mean: 65 days), where higher air temperatures associated with the lower latitude (44 \(^\circ\) N vs. 45 \(^\circ\) N) and increased thermal mass associated with the average depth (85 m vs. 60 m) reduced ice growth. Although ice onset (offset) dates vary dramatically year-to-year, on average, ice cover occurred on both lakes between mid-January (1/14–1/20) and early April (3/26–4/7).

Fig. 2
figure 2

Time series of lake-averaged surface temperature (LST: a, c) and bottom temperature (LBT: b, d) in Lake Michigan (a, b) and Lake Huron (c, d). Model results are shown in black while remote-sensing observations are shown in red (LST: NOAA CoastWatch GLSEA). Annual averages for GL-FVCOM results are shown as black circles, and best-fit trend lines are indicated with dashed black lines

Fig. 3
figure 3

Time series of lake-averaged ice cover (IC: a, c) and ice thickness (IT: b, d) in Lake Michigan (a, b) and Lake Huron (c, d). Model results are shown in black while remote-sensing observations are shown in red (IC: U.S. National Ice Center).). Annual averages for GL-FVCOM results are shown as black circles, and best-fit trend lines are indicated with dashed black lines

Table 2 Model validation metrics for lake-averaged lake surface temperature (LST) and ice cover (IC). Variables are computed over the entire historical archive for each variable, which spans from 1995 to 2021 for lake surface temperature (source) and 1980–2021 for ice cover (source)

Lake-averaged surface temperature and ice cover simulations showed strong agreement with remote-sensing observations (Figs. 2 and 3; Table 2). For lake surface temperatures, overall RMSE estimates were approximately 1 \(^\circ{\rm C}\) for each lake, with correlation coefficients near unity (CC = 0.99). The model tended to overpredict lake surface temperatures, though mean absolute errors (Lake Michigan: 0.83 \(^\circ{\rm C}\); Lake Huron: 0.80 \(^\circ{\rm C}\)), mean bias errors (Lake Michigan: 0.67 \(^\circ{\rm C}\); Lake Huron: 0.73 \(^\circ{\rm C}\)), and fractional biases (Lake Michigan: − 0.07; Lake Huron: − 0.08) remained relatively low over the simulation period. Modelled ice cover estimates were generally within 5% of observations, with the largest biases associated with over-icing during the peak ice season. Relatively low root-mean-square errors (Lake Michigan: 7.28% Lake Huron: 13.26%), high correlation coefficients (Lake Michigan: 0.88 Lake Huron: 0.90), and low biases (MAE: < 8%; MBE: < 5%; |FB|: < 0.25) highlight the success of the model in reproducing observed ice cover conditions. Overall, validation metrics show that GL-FVCOM performs as well (or better) than other three-dimensional models designed to simulate surface temperatures and ice cover concentrations in the Laurentian Great Lakes, especially those designed to work without external nudging (e.g., Bai et al. 2013; Xue et al. 2017; Anderson et al. 2018; Fujisaki-Manome et al. 2020, Li et al. 2021).

The simulated subsurface thermal structure was compared to observations collected at temperature moorings in Lake Michigan and Lake Huron. The most comprehensive observational dataset was collected at approximately 150 m depth in the central basin of Lake Michigan (i.e., M150; Fig. 1), where temperature profiles (Fig. 4b) have been measured almost continuously since 1990 (Anderson et al. 2021; GLERL (Great Lakes Environmental Research Laboratory) 2019). Modelled and observed thermal structure at M150 agreed reasonably well, with maximum summer thermocline depths between 25 and 35 m. However, simulated thermocline structure was overly diffuse compared to observations, and metalimnion thicknesses were significantly larger for modelled temperature profiles. This over-diffusion was best exemplified in the estimated thermocline steepness (i.e., maximum buoyancy frequency, N2), which was twice as large for observations (mean: 4 × 10−3 rad2/s2) as for simulations (mean: 2 × 10−3 rad2/s2) at the peak of the stratified season. Excessive heating in the metalimnion was also apparent in vertical profiles of model errors at M150 (Fig. 4c, d), which displayed peaks at 20 m depth (RMSE: 2.6 \(^\circ{\rm C}\); MBE: 1.6 \(^\circ{\rm C}\)). In contrast, errors in subsurface hypolimnion temperatures were similar to those estimated using lake-averaged surface temperatures (RMSE: < 2 \(^\circ{\rm C}\); MBE: < 1 \(^\circ{\rm C}\)). Similar trends were observed for all other observation sites (i.e., M15, M55, H200), with reasonable model agreement in the epilimnion and hypolimnion and warm biases in the metalimnion. Although excessive subsurface heating is commonly reported in hydrodynamic models, where it is often linked to numerical mixing (e.g., Beletsky et al. 2006) or forcing biases (e.g., Bennington et al. 2010), the biases reported in this study are relatively low and should have a marginal effect on estimated warming trends, especially outside the metalimnion (discussed below).

Fig. 4
figure 4

Examples of modelled (a) and observed (b) temperature profiles collected at a long-term mooring (M150) used for validation of subsurface thermal structure. Y-axis limits have been restricted to 100 m to highlight differences in modelled and observed thermocline structure. A horizontal dashed line is plotted near the top of the summer thermocline (25 m depth) for reference. Depth dependent RMSE (c) and MBE (d) for the four moorings (M15, M55, M150, and H200) are indicated with colored circles. Individual mooring locations can be seen in Fig. 1

Simulated hypolimnetic mixing rates were validated against microstructure measurements collected at 55 m depth in Lake Michigan (Fig. 5). Measurements were designed to capture distinct seasonal mixing characteristics in the stratified summer, nearly isothermal spring, and convective winter, when surface temperatures were far below the temperature of maximum density (TMD = 4 \(^\circ{\rm C}\)). The magnitudes of modelled and observed mixing rates (i.e., turbulent vertical scalar diffusivity: Kz) were nearly identical during the isothermal spring (O(10−3) m2/s) and convective winter (O(10−2) m2/s), though diurnal enhancements due to radiative convection (e.g., Austin 2019) were minimal in the model output. Strong full water column mixing during these seasons is largely driven by wind and surface wave breaking (e.g., Cannon et al. 2021a), which is well represented in GL-FVCOM. Although the lack of convective enhancement suggests that model physics could be improved, modelled and observational mixing timescales (Tmix = H2/Kz; H: water depth) are both on the order of 1 day, and differences are not expected to have a dramatic effect on model output over sufficiently long time scales (e.g., > 1 week). Stratified mixing was also well-represented by model physics, with similar mixing magnitudes in the surface (O(10−3) m2/s) and bottom (O(10−4)) m2/s boundary layers and hypolimnion (O(10−6) m2/s). Favorable comparisons of surface (~ 15 m) and bottom (~ 10 m) boundary layer thicknesses imply that over-heating in the hypolimnion is likely caused by numerical mixing or forcing biases rather than inherently flawed model turbulence parameterizations.

Fig. 5
figure 5

Modelled (circles) and measured (lines) vertical scalar diffusivity (Kz) profiles collected at M55 (see Fig. 1). Profiles represent both stratified (a), isothermal (b), and convective (c) conditions. Subplot (c) includes profiles observed during the day and overnight to differentiate between radiative convective and wind-driven mixing (Cannon et al. 2021a, b)

3.2 Trend analysis

Hindcast simulations were used to estimate long-term trends in lake thermodynamics between 1980 and 2021. Spatial maps of modelled trends in individual grid cells are shown in Fig. 6, with time series of annual means as averaged over individual study regions shown in Fig. 7. Model analysis showed significant increases in annually averaged lake surface and bottom temperatures as well as commensurate decreases in ice cover and ice thickness (Figs. 2, 3, and 6; Table 3). Modelled lake surface temperatures (Fig. 6a) increased at rates between 0.3 and 0.7 \(^\circ{\rm C}\)/decade, with lake-wide averages of 0.47 ± 0.19 \(^\circ{\rm C}\)/decade and 0.51 ± 0.19 \(^\circ{\rm C}\)/decade in Lake Michigan and Lake Huron, respectively (Fig. 2a, c). Lake surface warming was strongest in the lake interior, and warming trends in shallower bays (Table 3, Fig. 7a–f), though significant, were substantially below lake wide averages (e.g., Green Bay: 0.31 ± 0.15 \(^\circ{\rm C}\)/decade; Saginaw Bay: 0.32 ± 0.13 \(^\circ{\rm C}\)/decade). Modelled lake bottom temperatures also increased significantly over the simulation period (Fig. 6b), with nearshore (offshore) warming trends in excess of 0.4 \(^\circ{\rm C}\)/decade (0.1 \(^\circ{\rm C}\)/decade). Importantly, bottom warming trends in shallow, ecologically important bays (Fig. 7g–l) were similar to (or larger than) surface warming trends (e.g., Green Bay: 0.34 ± 0.15 \(^\circ{\rm C}\)/decade; Saginaw Bay: 0.31 ± 0.13 \(^\circ{\rm C}\)/decade) in the same regions, highlighting potential shifts in benthic habitat (e.g., Lynch et al. 2010) and mixing dynamics (i.e., deepening of the surface mixed layer; Lehman 2002).

Fig. 6
figure 6

Spatial maps of modelled trends in lake surface temperature (a; \(^\circ{\rm C}\)/decade), lake bottom temperature (b; \(^\circ{\rm C}\)/decade), ice cover (c; %/decade) and ice thickness (d; cm/decade). Grid cells without significant linear trends (p > 0.05) are shown in white

Fig. 7
figure 7

Modelled trends (dashed red lines) in annually averaged time series (black dots) of lake surface temperature (af; \(^\circ{\rm C}\)/decade), lake bottom temperature (gl; \(^\circ{\rm C}\)/decade), ice cover (mr; %/decade) and ice thickness (sx; cm/decade) in Georgian Bay (a, g, m, s), North Channel (b, h, n, t), Saginaw Bay (c, i, o, u), the Straits of Mackinac (d, j, p, v), Green Bay (e, k, q, w), and Grand Traverse Bay (f, l, r, x). Variables are averaged over all grid cells in each region of interest. The best fit slope for each trend line (mean ± 95% CI) is indicated for reference, and insignificant trends (p > 0.05) are indicated with (*)

Table 3 Lake warming trends estimated from model output. Reported variables represent the best-fit linear models (slope: b; intercept: a) and 95% confidence intervals for annually averaged parameters (1980–2021). Lake surface temperature, lake bottom temperature, ice cover, and ice thickness are lake- and bay-averaged for lake-wide and regional trends, respectively. Trends that are not significant at the 95% confidence levels are indicated with (*). Table data includes the relative trend magnitude (b/a) and total change over the model period (1980–2021) for reference. The intercept (a) represents the best-fit linear model estimate in 1980

Modelled lake ice cover (Fig. 6c) and ice thickness (Fig. 6d) decreased significantly over the simulation period (1980–2021). For trend analysis, ice characteristics were averaged over the “ice season” (November 1–May 31), which inherently includes trends in both the specified parameters and the ice duration. Annual average ice cover decreased by 2.1 ± 1.3%/decade in Lake Michigan and 5.2 ± 2.7%/decade in Lake Huron (Fig. 3; Table 3), with stronger trends observed at higher latitudes. For example, annual average ice cover decreased by 7.7 ± 3.1%/decade in Georgian Bay (Fig. 7m) and 4.1 ± 2.1%/decade in Green Bay (Fig. 7q). Ice cover loss was relatively minor in southern Lake Michigan (< 2%/decade), and long-term trends in ice-cover were insignificant in the deepest offshore waters of each lake. Significant changes in modelled ice thickness were only observed in regions with relatively high and consistent ice cover, including Green Bay, Saginaw Bay, Georgian Bay, the Straits of Mackinac, and the North Channel (Fig. 7s–x). Ice thickness trends were strongest in the North Channel, where the average annual ice thickness decreased by 6.3 ± 2.4 cm/decade, resulting in a total loss of nearly 30 cm over the 42-year simulation period. Saginaw Bay (2.5 cm/decade) and southern Green Bay (> 5 cm/decade) experienced relatively large decreases in ice thickness, despite relatively minor losses in ice cover (4 and < 2%/decade, respectively). This suggests that climate warming may be leading to significant changes in ice characteristics, even in area where perceived changes in surface conditions are minimal.

Trends in modelled ice cover and lake surface temperature agreed reasonably well with those estimated using limited observational datasets. For example, recent analysis of summer lake surface temperatures measured using a combination of remote sensing satellites and in-situ buoys has produced trends on the order of 1.0 \(^\circ{\rm C}\)/decade (Schneider & Hook 2010: 0.3–0.7 \(^\circ{\rm C}\)/decade; Zhong et al. 2019: 0.2–1.2 \(^\circ{\rm C}\)/decade; Mason et al. 2016: 0.1–2.0 \(^\circ{\rm C}\)/decade), and long-term trend ranges generally overlap those presented in the current work (0.3–0.7 \(^\circ{\rm C}\)/decade). Spatial patterns in modelled and observed trends (e.g., Mason et al. 2016; Fig. 3; Zhong et al. 2019; Fig. 3) were also consistent, each indicating increased warming trends in deeper offshore waters. Ice cover trends were similarly comparable for simulations and observations, with estimated ice cover loss between 2 and 20%/decade (Mason et al. 2016: − 1– − 20%/decade; Wang et al. 2012: − 16–20%/decade) and increased trend magnitudes nearshore and at northern latitudes (e.g., Mason et al. 2016, Fig. 3). Modelled surface temperature and ice cover trends were marginally weaker than those estimated using observations, with minor differences (observed vs. modelled) potentially linked to variability in averaging timescales (seasonal vs. annual) and time series lengths (pre-2012 vs. pre-2021). Anomalously cold winters have occurred more frequently in recent years (2014–2015, 2018–2019), reducing the magnitudes of best-fit trends for both surface temperature warming (annual averaged) and ice cover loss.

Modelled subsurface temperatures highlight significant warming trends across depth layers in Lake Michigan-Huron (Fig. 8; Table 4). Warming trends were largest near the surface (0–30 m) in both lakes (Lake Michigan: 0.53 ± 0.19 \(^\circ{\rm C}\)/decade; Lake Huron: 0.48 ± 0.19 \(^\circ{\rm C}\)/decade), and estimated trends tended to decrease with depth (Fig. 8c). Statistically significant warming trends were observed in the deepest water layers of each lake, with warming rates of 0.13 ± 0.06 \(^\circ{\rm C}\)/decade and 0.06 ± 0.03 \(^\circ{\rm C}\)/decade in Lake Michigan (240–270 m) and Lake Huron (150–180 m), respectively. Simulated near-surface and hypolimnetic warming rates were 1.5 × –3 × larger than those estimated using observations (Fig. 8d) from the central basin of Lake Michigan (i.e., M150; Anderson et al. 2021), but 95% confidence intervals on modelled and observed trends overlapped (or nearly overlapped) in both regions (0–5 m; 55–150 m). Differences in mean warming trends were especially apparent in the metalimnion (5–50 m) and near the bed (~ 140 m), where warming was statistically significant for simulations (30 m: 0.56 ± 0.21 \(^\circ{\rm C}\)/decade; 140 m: 0.17 ± 0.05 \(^\circ{\rm C}\)/decade) but not for observations (30 m: 0.07 ± 0.08 \(^\circ{\rm C}\)/decade; 140 m: 0.00 ± 0.08 \(^\circ{\rm C}\)/decade). While differences in trends may be linked (at least partially) to the analysis time range (model: 1980–2021; observations: 1990–2020), model estimates are likely high-biased due to overheating through the overly-diffuse thermocline (see above). As such, trends in near-bed and metalimnetic temperatures should treated with caution and interpreted as high-bounds on potential real-world heating rates.

Fig. 8
figure 8

Annual-averaged subsurface temperatures in Lake Michigan (a) and Lake Huron (b) between 1980 and 2021. Layer depths used for averaging are indicated with colored circles (0–270 m; 30 m intervals). Estimated warming trends (Lake Michigan: black; Lake Huron: blue) and 95% confidence intervals (colored shading) for each depth layer are shown in (c), with corresponding best fit lines included in (a) and (b) for reference. Trends and confidence intervals estimated from observations (gray triangles; Anderson et al. 2021) and simulations (black triangles) collected at M150 are shown in (d)

Table 4 Lake warming trends estimated from model output. Trends represent the best-fit linear slopes and 95% confidence intervals for annually occurring (i.e., spring and fall overturn dates) and annually averaged (i.e., subsurface temperatures) lake characteristics (1980–2021). Subsurface temperatures and overturn dates are averaged over specified depth layers or depth contours, respectively

Simulated spring and fall overturn dates highlight temporal changes in the seasonal stratification cycle of Lake Michigan-Huron (Fig. 9; Table 4). The spring turnover, triggered by lake surface temperatures warming above the temperature of maximum density (TMD \(\approx\) 4 \(^\circ{\rm C}\)), generally occurred between late March and early April in Lake Michigan (Fig. 9a) and between late April and early May in Lake Huron (Fig. 9c). The timing of the spring turnover changed significantly over the simulation period, with changes in both the relative turnover date as well as the spatial (i.e., onshore-offshore) progression of lake turnover. While the spring turnover is generally associated with an onshore-to-offshore progression of the 4 \(^\circ{\rm C}\) thermal bar (e.g., Rao & Schwab 2007; Bai et al. 2013), as is observed in the early years of the simulation, offshore warming led to a disruption of spatial trends post-1998. This is especially evident in Lake Michigan, where generally higher seasonal water temperatures and lower ice cover concentrations meant that offshore waters often failed to cool below 4 \(^\circ{\rm C}\), precluding the occurrence of a spring turnover at depths greater than ~ 100 m. At depth contours where the spring turnover occurred consistently over the simulation period (i.e., < 100 m depth), the turnover date tended to occur earlier in recent years, with decreasing trends between − 2 and − 8 days/decade (Table 4).

Fig. 9
figure 9

Modelled spring (a, c) and fall (b, d) overturn dates for Lake Michigan (a, b) and Lake Huron (c, d). Overturn dates were computed for each unstructured grid cell and averaged over 30 m depth contours (0–270 m). Dates (DOY: day of year) are indicated with separate colormaps for the spring and fall turnover, and DOY values past 366 indicate days past December 31 for a given year. Missing colors represent years where the turnover was incomplete at a specific depth contour

Modeled fall turnover dates, defined as the dates of maximum bottom temperature during each simulation year, are shown in Fig. 9b, d. Although near-bed temperatures were likely high-biased, as discussed above, we expect the use of temperature maxima for turnover identification to substantially reduce biases simulated turnover dates and long-term trends. The fall turnover typically occurred between early September and late January, with earlier turnover dates in the nearshore waters of each lake. Overturning dates in deep, offshore waters lagged those in shallow regions by more than 3 months (~ 100 days), indicative of both the larger thermal mass and the longer mixing time scales associated with larger depths. Long-term trends in the fall turnover date (Table 4) mirrored those observed in lake surface temperatures (e.g., Fig. 6a), with stronger trends in offshore waters where surface temperatures are rising more rapidly. Fall turnover dates were found to be increasing significantly (i.e., occurring later in the year) at all depth contours greater than 120 m, and the strongest trends were observed between 210 and 240 m depth in Lake Michigan (6.3 ± 2.5 days/decade). Due to excessive diffusion of heat through the metalimnion simulated fall turnover dates may be marginally affected by excessive model warming at depth (discussed above), it is assumed that spatial and temporal trends.

Simulated changes in spring and fall overturn dates may be indicative of dramatic shifts in lake mixing dynamics (e.g., Woolway and Merchant 2019). The combined effect of positive and negative trends in fall and spring turnover dates, respectively, is a net lengthening of the stratified season. The strongest mixing conditions occur during the isothermal fall and winter, when density stratification is minimized (e.g., Cannon et al. 2021a). This mixing period serves many ecologically important functions, including nutrient redistribution (e.g., Tranvik et al. 2009) and oxygen replenishment (e.g., Matsumoto et al. 2015), and the consequences of incomplete mixing can cascade across subsequent years. For example, reduced overwinter mixing may limit the supply of nutrients to surface waters, altering biological productivity in the following spring and summer (Vincent 2009). Changes may be further exasperated by an overall shift in lake mixing regimes, with simulations showing increasing numbers of incomplete turnover events in recent years, especially in Lake Michigan. Similar results were presented in a recent study from Fichot et al. (2019), where observational remote-sensing data was used to highlight changes in overturning behavior in the Laurentian Great Lakes between 1995 and 2012. Fichot et al. (2019) described an incomplete spring turnover in Lake Michigan and Lake Ontario in 2012 and went on to hypothesize that both lakes are more likely to experience incomplete overturning in the future. The results of the current work support this hypothesis, with 11 of the past 21 years (52%) of simulations resulting in incomplete spring turnover events in Lake Michigan. While these results are likely skewed by minor warm biases in model output (< 1 \(^\circ{\rm C}\)), which artificially increase the number of incomplete turnover events in deep water, the strong agreement with observational analysis suggests that, at minimum, continued climate warming will have significant effects on overturning behavior in Lake Michigan-Huron.

The impacts of global teleconnection patterns can be seen in the time series of surface (Fig. 2) and subsurface temperatures (Fig. 8a, b), ice cover and ice thickness (Fig. 3) and overturning dates (Fig. 9), with increased warming and variability following a series of El Niño events in the late 1990s and early 2000s. The El Niño event of 1997–1998 considered one of the most powerful in recorded history (e.g., Changnon 2000), is often cited as a precipitating event for dramatic changes in lake heat content (e.g., Anderson et al. 2021; Van Cleave et al. 2014), ice cover (e.g., Assel et al. 2000; Bai et al. 2012), precipitation, and over-lake evaporation (Gronewold & Stow 2014) throughout the Great Lakes region. Recent work indicates that this event may have also driven changes in teleconnection pattern influence. For instance, increased correlations with tropical-Northern Hemisphere (TNH) and eastern Pacific oscillation (EPO) patterns have led to higher year-to-year variability in ice cover in the years since 1998 (Lin et al. 2022). The current work supports these findings, with similar shifts in the magnitude and variability of modelled lake surface temperatures and ice characteristics occurring post-1998. Simulations also highlight a marked increase in the magnitude and variability of deeper (> 50 m) subsurface temperatures in 2004–2005 (Fig. 8). While this shift is potentially linked to an anomalously strong (positive) East Pacific/North Pacific (EP/NP) teleconnection pattern in 2004 (EP/NP index = 3.36), additional work is warranted to fully describe the phenomenon. Importantly, these abrupt climatological shifts suggest that physically meaningful trend projections may require more nuanced analysis techniques than those used in the current work, where robust linear trends were used to describe historical changes in physical indices. Although additional discussion of alternative trend analysis techniques is beyond the scope of the current study, we suggest that future work utilize more sophisticated model techniques for trend extrapolation (i.e., Mason et al. 2016; Qian 2014).

Continued lake-wide surface and subsurface warming is likely to have significant impacts on both lake ecology and resource management in Lake Michigan-Huron. The most immediate response of lake warming will manifest as changes in thermal structure, with shifts in the annual cycles of both vertical stratification and lateral heat distribution. These changes in water temperatures are likely to have cascading effects on other ecosystem processes, including nutrient cycles, food web structure, and habitat distributions. For example, changes in the duration of summer stratification may reduce hypolimnetic oxygen replenishment, increasing the probability of hypoxia in productive nearshore water (e.g., Tellier et al. 2022; Foley et al. 2012). Warming lakes are also likely to experience changes in primary productivity (e.g., Verburg et al. 2003), as has been observed in Lake Superior, where warming surface temperatures and decreases in ice cover have been linked to increases in primary production (O’Beirne et al. 2017). Increasing surface and subsurface temperatures may also alter the distributions of aquatic species, with cold-water fishes shifting into deeper, more northern waters and warm-water fishes expanding into newly available habitats (e.g., Lynch et al. 2010). As Lake Michigan-Huron continues to change, management policies may need to be adjusted to effectively manage lake resources. For instance, decreases in ice volume and duration may lead to amplified interest in overwinter shipping, and shifting fish habitats may warrant re-evalution of tribal fishing compacts in indigenous communities. Additional hydrodynamic modelling efforts and climate change projections will be invaluable for informing and developing these policies under continued climate warming.

4 Conclusions

In this study, historical reanalysis forcing data (NARR; 1979–2021) was used to simulate hydrodynamics in Lake Michigan-Huron using a three-dimensional unstructured-grid finite volume hydrodynamic-ice model (GL-FVCOM). Model results were used to analyze trends in thermal structure, ice dynamics, and overturning behavior, providing insight into changing subsurface conditions that are otherwise difficult to detect using traditional observational tools (i.e., in-situ moorings or remote sensing). The analysis presented herein can be used to draw the following conclusions:

  1. 1)

    The Great Lakes Finite Volume Community Ocean Model (GL-FVCOM) is capable of accurately reproducing observed thermal structure, ice dynamics, and vertical mixing in Lake Michigan-Huron. Although there were biases (MBE) in lake-averaged surface temperature (< 0.75 \(^\circ{\rm C}\)) and ice cover (< 5%), as well as point-measured subsurface thermal structure (< 2 \(^\circ{\rm C}\)), estimated warming trends were on par with those estimated using remote sensing observations, providing confidence in simulated warming rates.

  2. 2)

    Historical simulations showed significant increases in annually-averaged lake temperatures at the surface (0.47–0.51 \(^\circ{\rm C}\)/decade) and near the bed (0.26–0.29 \(^\circ{\rm C}\)/decade), as well as significant decreases in average ice cover (2.1–5.2%/decade) and ice thickness (0.68–2.0 cm/decade). The effects of climate warming were especially apparent in shallower subregions, where trend magnitudes were between 10 and 100% larger than lake-wide averages. This extreme surface and subsurface warming will likely have dramatic consequences for nutrient cycling and benthic habitat in historically productive fisheries (e.g., Saginaw Bay and Green Bay).

  3. 3)

    Layer-averaged subsurface temperatures showed significant warming at all depths, with warming trends in excess of 0.45 \(^\circ{\rm C}\)/decade in the surface layer and 0.15 \(^\circ{\rm C}\)/decade in the hypolimnion (50–270 m depth). Warming rates were strongest in Lake Michigan, where simulated water column temperatures experienced significant warming (0.13 ± 0.06 \(^\circ{\rm C}\)/decade) in even the deepest layers of the lake (240–270 m). Estimated warming trends were similar to those reported in a recent observational study in the central basin of Lake Michigan (i.e., Anderson et al. 2021), with notable high biases in modelled trends near the bed (140 m) and in the metalimnion (30–50 m), suggesting that reported heating rates in the overly-diffuse thermocline should be interpreted with caution.

  4. 4)

    Modelled turnover dates showed a general lengthening of the stratified period and a possible shift in mixing classifications in recent years. Trends computed over the simulation period showed that fall turnover dates were delayed (~ 2–6 days/decade) in offshore waters (120–270 m depth), while spring turnover dates occurred earlier (~ 2–7 days/decade) nearshore (0–150 m depth). Increasing overwinter surface temperatures led to the disappearance of the spring turnover far from shore (> 150 m depth), especially in Lake Michigan, where incomplete turnover events were observed in over 25% (11/42) of simulation years.

  5. 5)

    Simulated changes in surface and subsurface temperatures and ice cover in Lake Michigan-Huron may have significant consequences for ecosystem function and service provision in the Laurentian Great Lakes. The warming lake conditions have likely led to shifts in water quality, fishery habitats, shipping practices, and recreational usage, warranting additional consideration for resource management and regulation.