Atmospheric Energy Spectra in Global Kilometre-Scale Models

Eleven 40-day long integrations of five different global models with horizontal resolutions of less than 9 km are compared in terms of their global energy spectra. The method of normal-mode function decomposition is used to distinguish between balanced (Rossby wave; RW) and unbalanced (inertia-gravity wave; IGW) circulation. The simulations produce the expected canonical shape of the spectra, but their spectral slopes at mesoscales, and the zonal scale at which RW and IGW spectra intersect differ significantly. The partitioning of total wave energies into RWs an IGWs is most sensitive to the turbulence closure scheme and this partitioning is what determines the spectral crossing scale in the simulations, which differs by a factor of up to two. It implies that care must be taken when using simple spatial filtering to compare gravity wave phenomena in storm-resolving simulations, even when the model horizontal resolutions are similar. In contrast to the energy partitioning between the RWs and IGWs, changes in turbulence closure schemes do not seem to strongly affect spectral slopes, which only exhibit major differences at mesoscales. Despite their minor contribution to the global (horizontal kinetic plus potential available) energy, small scales are important for driving the global mean circulation. Our results support the conclusions of previous studies that the strength of convection is a relevant factor for explaining discrepancies in the energies at small scales. The models studied here produce the major large-scale features of tropical precipitation patterns. However, particularly at large horizontal wavenumbers, the spectra of upper tropospheric vertical velocity, which is a good indicator for the strength of deep convection, differ by factors of three or more in energy. High vertical kinetic energies at small scales are mostly found in those models that do not use any convective parameterisation.


INTRODUCTION
A fundamental characteristic of the atmosphere is the distribution of wave energy across different horizontal scales. Observations and numerical modelling have supported the idea of a canonical energy spectrum. Horizontal kinetic energy scales with the horizontal wavenumber k as k -3 at synoptic scales (Boer and Shepherd 1983). The k -3 spectral region is largely associated with non-divergent motion and the conservation of total kinetic energy and total vorticity squared (Fjørtoft 1953). At mesoscales the spectral slope transitions towards k -5/3 (Nastrom and Gage 1985). The flattening of the horizontal kinetic energy spectrum at the mesoscale has been subject to intense debates. The most accepted theories for the k -5/3 slope rely on non-linearly interacting inertia-gravity waves (IGWs; e.g., Dewan 1979, VanZandt 1982, Žagar et al. 2017), a forward energy cascade (Lindborg 2006), and the role of non-linear advection in a realistically forced fluid (Lindborg and Mohanan 2017).
In this study we intercompare the atmospheric energy spectra of eleven global kilometre-scale simulations performed as part of the second phase of the DYnamics of the Atmospheric general circulation Modeled On Nonhydrostatic Domains project (DYAMOND; Stevens et al. 2019). Owing to their fine grid mesh, the models are starting to explicitly resolve the dynamics of convective storms in the tropics. In addition, resolved fine structures in topography and land-surface heterogeneity directly influence the atmospheric circulation rather than being subject to subgrid-scale parameterisation. Therefore, many dynamical processes associated with vertical momentum and energy exchanges are explicitly represented in this new generation of models. Yet, some fraction of these exchanges remains parameterised by vertical diffusion, microphysics and in some cases convection schemes, whose formulations vary substantially between the simulations.
A previous study intercompared six simulations of three different models from the first phase of the DYAMOND project in terms of their global gravity wave properties (Stephan et al. 2019b). In this first phase the models integrated 40 days of boreal summer, in the second phase they integrated 40 days of boreal winter. The models well reproduced the observed horizontal pattern of the global gravity wave momentum flux at 30 km altitude, but with amplitudes that differed by factors of 2-3 in the zonal mean. Atmospheric gravity waves are important for forcing the Brewer-Dobson circulation, which is immediately linked with the transport of ozone, water vapour and other trace gases (Alexander 1996). They are also driving the Quasi-Biennial Oscillation (QBO) in the tropics (Labitzke 2005;Marshall and Scaife 2009), which influences troposphere-stratosphere exchanges (Baldwin et al. 2001), and can remotely affect the global circulation. As we are on the verge of using kilometre-scale models for multi-decade predictions, the correct representation of gravity waves is essential.
There are a few limitations to the study by Stephan et al. (2019b), which were necessary for a fair comparison with satellite data. One is the focus on predefined vertical and horizontal scales of 5-10 km and 500-2000 km, respectively, which is a substantial restriction in light of the broad spectrum of gravity waves. A second is the use of the filtering method of Lehmann et al. (2012), which isolates sinusoidal perturbations locally, but does not guarantee that the identified waves are in fact gravity waves. Especially for long horizontal wavelengths, one may expect a contribution of Rossby waves. The contribution may differ between any pair of models if the scale at which Rossby and gravity waves contain equal amounts of energy differs. The third limitation is the focus on a single height level (30 km), which, besides being a limitation in its own right, can also introduce sensitivities to local differences in stratification and background winds.
In this study we follow a completely different approach to shed light on the representation of waves in kilometre-scale simulations, which avoids the abovementioned limitations. We project the three-dimensional fields of geopotential height and horizontal winds onto the orthogonal set of three-dimensional normal-mode functions (NMFs) using the MODES software (Žagar et al. 2015). The NMFs are eigensolutions to the linearised primitive equations and allow a separation of the energy spectra into balanced (Rossby wave; RW) and unbalanced (inertia-gravity wave; IGW) modes. Importantly, this technique does not provide information on a single level in the vertical, but yields the three-dimensional kinetic plus potential available energy spectra of horizontal motions. It has been widely applied to intercompare the wave spectra in analysis and reanalysis data. The set of NMFs implemented in MODES is in the terrain-following sigma coordinate system derived by Kasahara and Puri (1981). Both sigma-based and pressure coordinate-based NMFs have been extensively applied for the computation of atmospheric energy spectra at lower resolutions (e.g., Tanaka 1985Tanaka , Žagar et al. 2017). An exception is a high-resolution NMF decomposition by Terasaki et al. (2011) that provided global energy spectra including 750 zonal wavenumbers. The present study is the first study that uses the high-resolution NMF decomposition to intercompare kilometre-scale models.
At the resolutions considered here, the strength of convection is very sensitive to the use of a convective parameterisation (Stephan et al. 2019b;Wedi et al. 2020). Convection acts as a source of low-level vorticity, triggering RWs that may propagate toward the midlatitudes (Hoskins and Karoly 1981). In addition to meridionallypropagating RWs, localised transient tropical heating also generates a broad spectrum of equatorially trapped IGWs with vertical wavelengths depending on the depth of the heating (Salby and Garcia 1987). Kasahara (1984) studied the normal mode response to prescribed heat sources and found stationary heating to be primarily associated with RW modes, while transient heating forced a broad spectrum of waves, with the IGW portion of the spectrum showing a strong dependence on the time scale of the heating. Heating resembling the MJO generates a strong tropical IGW response as well as a broad RW response in the extratropics, both parts of the response significantly enhanced in the presence of moist dynamics (Kosovelj et al. 2019). Some of the differences in the magnitude of gravity wave momentum fluxes found by Stephan et al. (2019b) could be linked to differences in the strength of convection -stronger and deeper convection is usually associated with stronger vertical velocities and larger gravity wave momentum flux (Müller et al. 2018;Stephan et al. 2019a).
To shed light on differences in simulated convection, we first compare the simulations in terms of tropical precipitation and in terms of upper-tropospheric vertical velocities. Afterwards we turn to the NMF spectra and test to which degree the models produce the canonical spectra. We quantify differences in simulated spectra in terms of total energy levels, synoptic and sub-synopticscale slopes, and the crossing scale of the RW and IGW spectra. Section 2 introduces numerical and observational data and the analysis methods. In Section 3 we report the results, with conclusions following in Section 4.

A. NUMERICAL SIMULATIONS
We analyse eleven 40-day simulations of five different global models with horizontal resolutions of less than 9 km. Table 1 lists the simulations and summarises their main characteristics. All simulations are initialised with the global 9 km meteorological analysis taken from the ECMWF for the 20th January 2020 and are freely evolving until 1st March 2020. One extra data set, ICON-sap+, is the extension of the simulation ICON-sap and covers the period 20th January through 1st March again after one full year of integration. If there are no sensitivities to the initialisation, then ICON-sap+ and ICON-sap should show similar results save for inter-annual variability, which we cannot know for ICON-sap, but which we estimate from ERA5. The simulations that are not coupled to ocean models use prescribed sea surface temperatures and sea ice data from the ECMWF. The number and distribution of model vertical levels is depicted in Figure 1.

1) IFS
The Integrated Forecasting System (IFS) uses a spectral transform model with a cubic octahedral (Gaussian) grid (ECMWF 2020). IFS-9 has 2560 latitudes and 5136 points around the equator with a 1279 wavenumber truncation. This resolution corresponds to 7.8 km in the tropical belt and up to 11 km in the extratropics. IFS-4 has 5120 latitudes and 10256 points around the equator with a 2559 wavenumber truncation. This resolution corresponds to 3.9 km in the tropical belt and up to 4.8 km in the extratropics.
IFS-9 parameterises deep and shallow convection. In IFS-4 the deep-convective parameterisation is turned off. Parameterised mid-level convection only makes a very small contribution in either case.
Unresolved orographic effects are represented in the Turbulent Orographic Form Drag (TOFD) scheme  [km]. Also listed are the grid type, whether the model is coupled to the ocean, how convection is treated, the type of boundary layer parameterisation, and whether subgrid-scale orography is parameterised. For convection, S indicates that shallow convection is parameterised and F indicates full parameterisation. No convective parameterisation is indicated by x. The types of boundary layer parameterisations include diagnostic eddy diffusivity (K), prognostic turbulent kinetic energy (TKE) or turbulent total energy (TTE), Smagorinsky scheme (S), and Simplified Higher Order Closure (SHOC). for scales smaller than 5 km (Beljaars et al. 2004). Low level blocking and a gravity wave scheme is applied for scales >5 km in IFS-9 and scales >2.5 km in IFS-4 (Lott and Miller 1997). Non-orographic gravity waves are parameterised according to Orr et al. (2010). These are formulated such that their respective contributions vanish towards O(1 km) resolution.
The vertical turbulent transport is treated differently in the surface layer and above. In the surface layer, turbulent fluxes are computed using a first order K-diffusion closure based on the Monin-Obukhov similarity theory. Above the surface layer a K-diffusion turbulence closure is used everywhere, except for unstable boundary layers where an Eddy-Diffusivity Mass-Flux (EDMF) framework is applied, to represent the non-local boundary layer eddy fluxes (e.g. Köhler et al. 2011). The scheme is written in moist conserved variables (liquid static energy and total water) and predicts total water variance. A total water distribution function is used to convert from the moist conserved variables to the prognostic cloud variables (liquid/ice water content and cloud fraction), but only for the treatment of stratocumulus clouds. Convective clouds are treated separately by the shallow convection scheme.
Unlike the other models, which use small-time step numerics and different time steps for different physics, the time steps in IFS are the same for dynamics and physics, 240 s (IFS-4) and 450 s (IFS-9).

2) ICON
The dynamical core of the Icosahedral Non-hydrostatic (ICON) model is described in Zängl et al. (2014). The mean horizontal resolution of the ICON simulations is 2466 m for ICON-nwp and 4932 m for all other simulations. The triangular horizontal grid is based on a refined icosahedron. The model top is at 75 km, with a damping layer covering the top 15 levels from 44 km upwards. ICON simulations do not parameterise convection or subgrid-scale orography.
The boundary layer parameterisation in the ICONvd* simulations (ICON-vdu, ICON-vdc, ICON-vda) uses a prognostic total turbulent energy scheme (TTE; Mauritsen et al. 2007), ICON-nwp uses a prognostic model for the turbulent kinetic energy (TKE; Raschendorfer 2001), and ICON-sap a 3D Smagorinsky closure (Smagorinsky 1963). In the simulations using TTE, the mixing length above the boundary layer was limited to 1000 m instead of 150 m as recommended. This was discovered later. We nevertheless include the simulations in our analysis as they serve as sensitivity experiments. ICON-vdu is not coupled to the ocean, whereas ICON-vdc and ICON-vda are coupled to the ocean. In ICON-vda the albedo was increased from 0.07 (ICON-vdc) to 0.12 to compensate for missing clouds, which resulted from the erroneous mixing length setting.

3) GEOS
The Goddard Earth Observing System (GEOS) model is run on a c2880 cubed-sphere grid with 2880 cells per edge of each cube face for a total of 17,280 horizontal grid cells. The c2880 grid has roughly a 3.125 km global grid resolution. The vertical grid consists of 181 hybrid sigma-pressure levels from the surface to 0.01 hPa, with the first terrain following level above the surface at 18 meters. A sponge layer is situated in the top 18 levels from 0.3 to 0.01 hPa.  GEOS uses the non-hydrostatic Finite-Volume Cubed-Sphere Dynamical Core (FV3; Putman and Lin 2007;Harris et al. 2021). Deep and congestus convection is parameterised with the Grell-Freitas scheme (Grell and Freitas 2014). Deep plumes are disabled in the DYAMOND run analysed here. Shallow convection is parameterised with the Park and Bretherton (2009) scheme. The turbulence and boundary layer is parameterised with a combination of the non-local scheme of Lock et al. (2000), acting together with the Richardson-number based scheme of Louis et al. (1982). The land surface model is the catchment-based scheme of Koster et al. (2000) that treats subgrid-scale heterogeneity in surface moisture statistically. The gravity wave parameterisation computes the momentum and heat deposition into the grid-scale flow due to orographic (McFarlane 1987) and non-orographic (after Garcia and Boville 1994) gravity wave breaking. The effects of orographic form drag for features with horizontal scales of 2-20 km are parameterised following Beljaars and Wood (2003). The cloud microphysics is parameterised with the GFDL microphysics (Chen and Lin 2013;Zhou et al. 2019).

4) SHiELD
The GFDL System for High-resolution prediction on Earthto-Local Domains (SHiELD; Harris et al. 2020 (Han et al. 2017), the GFS subgrid orographic blocking scheme, the Noah-MP LSM, and a mixed-layer ocean nudged to analyzed EC SSTs with a 10-day timescale. There is no deep convective parameterisation. Damping is limited to the top three layers of the atmosphere; the constant-pressure top is at 3 hPa (about 40 km).

5) SCREAM
The Simple Cloud Resolving E3SM Atmosphere Model (SCREAM) is being developed for the Energy Exascale Earth System Model (E3SM) project. SCREAM models nonhydrostatic fluid dynamics and includes a turbulence/ cloud fraction scheme, a microphysics scheme, a radiation scheme, an energy fixer, and prescribed-aerosol functionality, described in Caldwell et al. (2021). The energy fixer adjusts the temperature by a small global constant after each timestep (Williamson et al. 2015).
In the horizontal directions, SCREAM uses a spectral finite element discretisation running on unstructured quadrilateral grids. For the DYAMOND simulations, SCREAM used a cubed-sphere grid with 6.29M elements, each containing a p = 3 degree polynomial representation of the prognostic variables. For each variable, there are approximately nine degrees of freedom per element. For cell area, we thus use spectral element area divided by 9, resulting in the square root cell area ranging from a minimum of 2.74 km to a maximum of 3.26 km. In the vertical, SCREAM uses a terrain following hybrid pressure coordinate discretised with a non-hydrostatic extension of the Simmons and Burridge finite differences (Simmons and Burridge 1981;Taylor et al. 2020). The number of vertical levels is 128 between the surface and the model top at 2.25 hPa (∼40 km).
For the top-of-model sponge layer, SCREAM uses horizontal Laplacian smoothing, applied to all prognostic variables in the top 14 layers (starting at 20 hPa, ∼25 km). The turbulence and boundary layer parameterisation is handled by an updated version of the Simplified Higher Order Closure (SHOC; Bogenschutz and Krueger 2013). SHOC is similar to other PDF-based schemes (Golaz et al. 2002;Cheng and Xu 2008), computing subgrid-scale liquid cloud and turbulence using an assumed double Gaussian probability density function (PDF). In SHOC, the higher order moments needed to close the double Gaussian PDF are diagnosed rather than prognosed. SCREAM does not contain convection or subgrid orography parameterisations.

B. ERA5 REANALYSIS
In addition to the DYAMOND simulations, we evaluate the same period, 20th January through 1st March in the ERA5 reanalysis, which is produced and made publicly available by the ECMWF (C3S 2017). ERA5 does not serve as a 'truth' to compare with. Instead, we use it to estimate year-to-year variability in the global energy spectra. For this reason, while most of the analysis focuses on 2020, we also inspect the years 2016, 2017, 2018, 2019. The original data with a horizontal resolution of ∼30 km are stored on 137 hybrid sigma/pressure levels from the surface up to 80 km (0.01 hPa).

C. NMF DECOMPOSITION
We first re-grid the three-dimensional horizontal winds, temperature and specific humidity, and two-dimensional topography and surface pressure from ERA5 and the DYAMOND simulations to a regular N256 Gaussian grid with 1024 × 512 points in longitude and latitude, respectively, corresponding to a resolution of 39 km at the equator. The re-gridding scheme performs a simple averaging over all data points within a given target grid cell. In the second step, we vertically interpolate the data to 68 hybrid sigma/pressure levels, which extend from the surface to ∼10 hPa (about 32 km). The vertical level density of the target vertical grid is roughly 2/3 of ERA5's, which uses the same vertical grid as the IFS simulations (first column in Figure 1). The data prepared in this way are then subjected to the NMF decomposition.
The NMF decomposition as carried out by MODES projects the three-dimensional fields of derived pseudogeopotential height and horizontal wind onto an orthogonal set of predefined basis functions and is performed at single time steps. A detailed description of MODES steps is given in Žagar et al. (2015). The basis functions of the projection are the Hough harmonics, and the three parameters that define them -the zonal wavenumber and the meridonal and vertical wave indices -satisfy the dispersion relationships for RWs and IGWs (Kasahara 2020). The mixed Rossby-gravity wave mode is counted to the RW category and the Kelvin mode is the slowest eastward-propagating IGW mode. The orthogonality of the basis functions allows filtering specific wave modes. By performing an inversion back to physical space, we can isolate the wind fields associated with selected wave modes as demonstrated in previous studies (e.g. Žagar et al. 2017). We perform the NMF decomposition every six hours from the time of initialisation.

D. PRECIPITATION
In addition, we analyse 200 hPa vertical pressure velocity, 200 hPa horizontal kinetic energy, and total precipitation in 30°S-30°N. We re-grid these fields like we did the three-dimensional fields. As an observational reference for precipitation we use data from IMERG (Huffman et al. 2019), GSMaP (Kubota et al. 2007) and CMORPH (Xie et al. 2019) in 30°S-30°N for the simulated period.
The IMERG data are the Global Precipitation Measurement Final Precipitation inter-calibrated L3 version 06B product with global coverage. The horizontal resolution is 0.1° × 0.1° and the temporal resolution is 30 min. The data can be obtained from the Goddard Earth Sciences Data and Information Services Center.
We use the version-7 gauge-corrected GSMaP data with a 0.1° × 0.1° resolution and hourly coverage in 60°S-60°N. They are provided by the JAXA Global Rainfall Watch.
The CMORPH data are the reprocessed and biascorrected global precipitation product covering 60°S-60°N. The horizontal resolution is 8 km × 8 km and the temporal resolution is 30 min. The data can be obtained from the National Centers for Environmental Information's National Oceanic and Atmospheric Administration. Figure 2 shows maps of 40-day mean tropical precipitation. In addition to the simulations, three observational data sets are included: IMERG, GSMaP and CMORPH. In case of the observational data sets, there may be some Figure 2 Forty-day mean precipitation in the tropics (30°S to 30°N). Blue numbers above the panels list the root-mean-squared errors with respect to IMERG (left), GSMaP (middle) and CMORPH (right). The black numbers list spatial linear correlation coefficients.

A. TROPICAL CONVECTION
uncertainties at small scales, as these products are optimised to match point-wise observations and models and reanalyses are used to fill gaps. The observational data exhibit some differences. Their pairwise linear spatial correlations are roughly 0.93 and root-mean-squared errors are between 1.53 and 2.11 mm day -1 . Differences between the simulations or simulations and observations are generally expected to be greater, as the simulations produce their own meteorology. All simulations agree best with CMORPH and least with GSMaP. ICON-nwp has the highest correlation with CMORPH (0.75), GEOS the lowest (0.60). IFS-9 has the smallest root-mean-squared error with respect to CMORPH (2.97 mm day -1 ), IFS-4 the largest (4.10 mm day -1 ).
Even though all simulations capture main features of the large-scale precipitation pattern, the spatial structure of the rainfall is not the same. For instance, IFS-4 differs from IFS-9 in that the deep-convective parameterisation is turned off and it produces a much sharper ITCZ than IFS-9. Previous studies have demonstrated that convection at a model horizontal resolution of few kilometres is still under-resolved, in the sense that turning off the deep convective parameterisation at these resolutions results in too many extreme rainfall events. For the ICON model, this was demonstrated by Stephan et al. (2019a) in their comparison of two 5-km ICON simulations with and without convective parameterisation. Stephan et al. (2019b) found similar results for the IFS model, comparing a 9-km simulation with parameterised deep convection to a 4-km simulation with explicitly simulated deep convection. More recently, Wedi et al. (2020) corroborated this result in their study of two 9-km IFS simulations with and without a deep-convective parameterisation. Wedi et al. (2020) also compared with an explicit 1.4-km IFS simulation and concluded from the good match of the energy spectra between the 1.4km simulation and the well-tuned 9-km simulation with parameterisation that it might be appropriate to turn deep-convective parameterisations off at 1.4 km.
We now address the strength of convection by turning to the vertical velocity ω in the upper troposphere. The spatial variance of ω resembles Figure 2 very closely (not shown). The zonal-wavenumber spectra of 200 hPa pressure velocity (30°S-30°N) based on 1-dimensional FFT are shown in Figure 3. The slopes vary considerably between the data sets, with synoptic-scale slopes between about -2/3 (IFS-9 and ERA5) and -1/6 (ICONvd* and SCREAM). Energies differ by a factor of ∼3 at the large and synoptic scales. Particularly at large horizontal wavenumbers, the slopes differ substantially. We hypothesise that the flat slopes and high energies at small scales in ICON-vd*, ICON-sap, ICON-sap+ and SCREAM are related to the fact that these are the models running without any convective parameterisation. While ICON-nwp also falls into this category, it has a twice finer resolution than ICON-vd*, which is more suitable for turning the convective parameterisation off.
To summarise, the spatial pattern of precipitation and the spectra of upper-tropospheric vertical velocity suggest that the models differ substantially in their representation of convection, particularly at small scales, and that some of these differences are due to model formulation and not due to a different meteorological evolution. Given that convection is an important wave source, we may expect that systematic differences in convection may be reflected in wave energies. We next examine the global energy spectra in light of this result.

B. GLOBAL ENERGY SPECTRA
Even though the DYAMOND simulations are only 40 days long and freely evolving after their initialisation from identical atmospheric states, energy spectra have been shown to be robust footprints of simulations, at least when focusing on synoptic and sub-synoptic scales (Boer and Shepherd 1983). Malardel and Wedi (2016) also stated that a "spectrum is a robust characteristic of the system, quasi-independent of the date and step of the forecast". We do not expect significant differences in planetary scales among the models, as their 40-day simulations may still depend on the initial state from ECMWF. Figure 4 shows the total, RW and IGW energy spectra for the ERA5 reanalysis and the eleven simulations. All spectra Figure 4 Global energy spectra as functions of non-dimensional zonal wavenumber k in the ERA5 reanalysis and the simulations. Shown are the total (TOT; black), RW (red), and IGW (blue) spectra. In addition to the year 2020, the ERA5 panel shows in dashed lines the corresponding spectra for 2016, 2017, 2018, 2019. Grey shading marks the standard deviation computed on 6-hourly data. For reference, spectral slopes of k -1 , k -5/3 and k -3 are drawn as black dashed lines. Their locations are identical in each panel. The axis range is also identical. closely follow the canonical shape. This is encouraging, given that some of the models are stripped down to the bare minimum of physical parameterisations, which removes many options of tuning a model. Additional dashed lines in the ERA5 panel show the 40-day mean spectra for the years 2016, 2017, 2018 and 2019. The grey shading is the 2020 standard deviation computed on 6-hourly spectra for the period 20th Jan to 1st March. By comparing the spread of the dashed lines with respect to the solid lines (20th Jan to 1st March in year 2020) to the grey shading, we note that inter-annual differences are small. Already at k = 4, the mean difference between the other years and 2020 is less than a third of the 6-hourly spread for IGW. For the RW and total spectra it is about one fifth. Moreover, the grey shading becomes almost invisible in the synoptic regime, as expected (Malardel and Wedi 2016). This also holds for the simulations. Note that the grey shading indicates standard deviation, not standard error, which would be even smaller by a factor of 140 and is the more relevant measure for quantifying statistically significant differences between simulations. Therefore, we will treat the 40-day mean spectra beyond k = 7 as truly representative of a simulation. For k < 7 we will not discuss spectral slopes, but only compare the energy integrated over k = 1-7.
The robustness of the spectra implies that deviations from the canonical spectrum must be due to model formulation. This provides an opportunity to better understand what factors shape the energy spectra in kilometre-scale models. Therefore, in the following subsections, we will point out the differences instead of the commonalities that Figure 4 documents. Indeed, a close look at Figure 4 already reveals various discrepancies between the data sets. For instance, by examining the y-axis intersection of the lines, we may already guess that the total energies of RW and IGW modes are not identical between the data sets. A detailed discussion of total energies follows in 3.b.1. Further, by comparing with the dashed reference lines, we note that the spectral slopes are not identical between the data sets. For example, the RW line of IFS-9 follows k -5/3 more closely than ICONsap. Spectral slopes are examined in 3.b.2. The offsets between the RW and IGW lines at large scales differ as well. How this offset, differences in slope, and differences in shape modulate the horizontal wavenumber at which the RW and IGW lines cross is the topic of 3.b.3.

1) Total wave energies
Previous studies have examined the energy partitioning between RW and IGW modes in global analyses (Tanaka et al. 1986;Tanaka and Kung 1988;Tanaka and Ji 1995;Žagar et al. 2009a,b, 2012. These early studies found the analyses to agree much better in terms of their RW energies than their IGW energies. A decade after the first study of this type, Tanaka and Kimura (1996) reported some convergence with respect to the IGW energy levels in the operational analyses, with discrepancies <8% for both RW and IGW energies. Tanaka and Kimura (1996) derived a value of ∼3% for the IGW energy fractions of global motions in the winters 1988/89 in three analyses. Žagar et al. (2009a) noticed that the value of 3% is likely too small, as they found IGW energy contributions between 9% and 15% in the more recent analysis systems, having analysed July 2007 in NCEP, ECMWF, and DART-CAM. They attributed the larger values to an improved analysis quality. These percentages are confirmed by the multi-year long, real-time spectra from operational ECMWF analyses and deterministic forecasts, that are available at http://modes.cen.uni-hamburg.de. Thus, the RW percentage of the total wave flow around 10% serves as a reference for what we may expect to find in the DYAMOND simulations. Žagar et al. (2012) tested the sensitivity of the energy partitioning to the selected vertical density of model levels and to the depth of the model atmosphere chosen for the analysis. Contributions from IGWs increased systematically when the analysis was performed using a greater vertical level density, and when levels in the mesosphere were included. Specifically, for the operational analyses of ECMWF in July 2007, the 91-model-level data contained about 10% of the global energy in IGWs, whereas the 21-standard-pressurelevel data contained only around 7%. Thus, some care must be taken when interpreting the absolute numbers we report here with other studies. To ensure a fair comparison between different models in this study, the level density chosen for the NMF decomposition does not exceed the native vertical level density for any of the models (Figure 1). Figure 5 displays the integrated total global wave energy and its partitioning into RWs and IGWs for ERA5 and the eleven simulations. Also shown is the energy in Kelvin wave modes, which are included in IGW. Simulated total (RW + IGW) energies are lowest in ICONvdc and largest in SCREAM. When we exclude the ICONvd* simulations, IFS-4 has the lowest energy and energy levels differ by up to 21% with respect to SCREAM. The simulations have 3%-30% greater total energies than ERA5 for the same period.
RW energies, again excluding ICON-vd*, are also lowest in IFS-4 and largest in SCREAM. Here, the models agree within 24% with respect to SCREAM. The RW energies exceed those of ERA5 by 2%-35% for the same period.
The energy fraction contributed by IGWs are again smallest for ICON-vd*. Of the remaining simulations ICON-nwp has the lowest energy in IGW and IFS-4 the largest. Models differ within 35% with respect to IFS-4. In contrast to the RW modes, IFS-4 is the only simulation with more energy in IGW than ERA5 (10% more), while the other simulations have less IGW energy than ERA5 (-4% to -29%).
Overall, the simulations tend to have less energy in IGW modes than ERA5, but more energy in RW modes than ERA5. This is also reflected in the partitioning of total energy into RW and IGW contributions, which for ERA5 is 89% in RW and 11% in IGW, but for the simulations varies between 94% in RW and 6% in IGW (ICON-nwp and also ICON-vdu) and 91% in RW and 9% in IGW (GEOS) with the exception of IFS-4 (88% in RW and 12% in IGW).
The three ICON-vd* simulations agree very closely with each another, as does the continuation of the coupled ICON simulation with its 2020 counterpart (ICON-sap+ and ICON-sap). This suggests that spectral characteristics are closely linked to the model setup.

2) Spectral slopes
The reference slopes overlaid in Figure 4 are those of the canonical RW and IGW energy spectra (e.g., Žagar et al. 2017). Žagar et al. (2017) applied the NMF decomposition to global 2014-16 analysis data from the ECMWF and to the ERA-Interim reanalysis. They reported a clear division of the IGW spectra into three regimes: large scales (1 ≤ k ≤ 6) with a slope close to -1, synoptic scales (7 ≤ k ⪅ 35) with a slope near -5/3, and mesoscales of 500 km or smaller (k > 35) with steeper slopes that were attributed to insufficient variability associated with unbalanced dynamics. Unlike the IGW spectrum, the RW energies followed a slope of -3 for all k > 6 down to the smallest scale they considered (about 100 km). We find similar transitions in the RW and IGW spectra for our 40day period in the 2020 ERA5 reanalysis (Figure 4).
A close inspection of Figure 4 reveals that the spectral slopes are not identical between the data sets. Table 2 lists the spectral slopes in the k-ranges 1-7, 8-50 and 51-320 for all curves displayed in Figure 4. The slopes are computed from the values at the respective k-bounds of the intervals. Clearly, at large scales, the RW spectrum dominates the total spectrum and both have slopes close to -1. The variability at these scales in 40 days is large and it would be unreasonable to attempt an interpretation of the differences between the data sets. At the intermediate scales 8 ≤ k ≤ 50 the slope of the total energy is in between the RW and IGW spectral slopes. Overall, the data sets agree well on both RW and IGW slopes in this range and follow the canonical spectra. In contrast to integrated energies, changes in vertical diffusion do not seem to strongly affect spectral slopes, since the latter are almost identical between ICON-vd* and ICON-sap.
For k > 50 the total energy spectra become dominated by the IGW contributions. In all simulations this transition goes along with a flattening of the total energy spectrum. In ERA5 there is a steepening because both the RW and IGW spectra turn steeper than the total energy slope at 8 ≤ k ≤ 50. It may well be that wave energy at small scales of several hundreds of kilometres is underestimated in ERA5 due to limitations in data assimilation procedures.
We next test if the representation of convection, as characterised by the spectral slopes of ω at 200 hPa (Figure 3), is important for the spectral slopes of the global energy spectra. To facilitate an interpretation, we first discuss the spectra of tropical (30°S-30°N) horizontal kinetic energy (KE = U 2 + V 2 ) at 200 hPa, which are like Figure 3 based on 1-dimensional FFT (Figure 6).
The horizontal kinetic energy spectra of Figure 6a have three distinct regimes. At large scales the spectra are nearly flat, at synoptic scales they are slightly shallower than -3, and at mesoscales they transition to even shallower slopes. Note that a -5/3 slope corresponds to a horizontal line in Figure 6. The horizontal wind spectra of the ICON-sap simulations reach a -5/3 slope at k ≈ 50, whereas the other simulations and ERA5 do not flatten as much and have steeper slopes at large k (Figure 6a). Figure 6b,c shows the horizontal kinetic energy spectra separated into their RW and IGW components, respectively, by inverting the NMF decomposition back to physical space. As expected, the large scales in Figure 6b are dominated by RW motions while the mesoscales are associated with IGWs. A mesoscale flattening of the Table 2 Spectral slopes in three wavenumber bands for the total (TOT), RW, and IGW modes shown in Figure 4, and the crossing scale k c . Also listed are the length scales L c that correspond to k c . The values for L c are computed for the equator and the midlatitudes (ϕ denotes latitude).

Figure 6
Compensated zonal-wavenumber spectra of tropically-averaged (30°S to 30°N) horizontal kinetic energy. Horizontal kinetic energy spectra use the (a) total horizontal wind, (b) RW circulation, (c) IGW circulation. The spectra have been multiplied by a factor of (k/360) 5/3 . Dashed black lines show reference slopes.
RW spectra is clearly visible in all simulations. The IGW slopes of ICON-sap and ICON-vd* turn even shallower than -5/3 at the mesoscale. In contrast, most of the other simulations almost maintain their synoptic-scale IGW slopes.
To proceed with the comparison to vertical velocity slopes, we estimate spectral slopes in the wavenumber band 50 ≤ k ≤ 180. We average over slopes computed on adjacent wavenumbers (50 ± 3 and 180 ± 5) to reduce the sensitivity of the slope estimates to bumps in the spectra. Figure 7 shows that the slopes of the tropical spectra of KE, KE RW , KE IGW and the global spectrum of E RW are related to the slope of tropical ω. Linear correlation coefficients exceed 0.9 in all cases. This is plausible, given that transient tropical heating is a source of RWs that propagate within as well as out of the tropics, and of equatorially trapped IGWs. Unlike the slope of E RW , that of E IGW is not related to the slope of tropical ω. This may be due to equatorial trapping, the importance of extratropical IGW sources like frontal systems, and the relatively greater contribution of stratospheric levels to E IGW as compared to E RW .

3) Crossing scales
Crossing scales have important implications for the applicability of spatial averaging, which is a very common technique for decomposing motions into background and waves. Table 2 lists the crossing scales k c of Figure 4. In the simulations k c varies between 24 (IFS-4) and 49 (ICON-nwp). In ERA5 it is 25, which is also true for the considered period in the years 2016, 2017, 2018, 2019. The 95% confidence interval computed on 6-hourly data is Δ 95 k c ≈ 1 for all data sets. In principle, k c could be related to the relative difference in integrated energies if all other characteristics of the spectra were identical. In this case, the relative difference in integrated energy would determine the offset between the RW and IGW spectra. Other factors that could influence the crossing scale could be the spectral slopes or the shapes. By shape we mean the deviation of the spectra from a strict power law, which would result in slightly arched curves in a loglog plot, as is for instance the case for the IGW spectrum of GEOS (Figure 4). In this subsection we investigate the importance of these different factors for determining k c .
For the ECMWF operational analysis of 2014-2016, Žagar et al. (2017) found crossing scales of k c ≈ 35. They also analysed the ERA-Interim analysis, which at the time was based on a forecast system that was about a decade older and had a coarser horizontal resolution. For ERA-Interim the crossing scale was at k c ≈ 50. Žagar et al. (2017) suggested that this larger crossing scale might be due to less IGW variability at small scales in ERA-Interim. Furthermore, their study demonstrated that the exact value of the crossing scale has a minor dependence on season, but is rather sensitive to the considered atmospheric depth. The corresponding values they found when they excluded more and more levels at the top of the analysis were k c = [39,41,52,58] for a total number of [134,123,108,89] levels, where 134 corresponds to 6 hPa and 89 to 53 hPa. This effect is due to relatively Figure 7 Relationships between spectral slopes. On the x-axis, α(ω) indicates the slope of the 200 hPa pressure velocity spectrum shown in Figure 3 for k ∈ [50,180]. On the y-axis, the slope α(ξ) is shown for four different spectra, with ξ either one of the tropical kinetic energy spectra of Figure 6 or the global RW energy spectrum of Figure 4. more energetic IGWs higher up in the atmosphere. This is consistent with the sensitivity of the partitioning of integrated wave energies to the considered atmospheric depth, which we mentioned in subsection 3.1. This effect needs to be considered when interpreting the results reported here. Figure 8 tests the sensitivity of k c to the spectral slopes by modifying the simulated spectra such that their spectral slopes are identical to ERA5's while their integrated energies remain unaffected. If differences in k c were in part due to differing slopes, then this experiment should narrow the spread in k c . Comparing Figure 8 Sensitivity test of the crossing scale to spectral slope. The solid lines repeat the energy spectra of Figure 4. The dashed lines follow a power law k α with α constant between k = 1-7 and k = 7-320, respectively, and the values of α indicated in the panels. α is chosen as the ERA5 slopes shown in Table 2. For each simulation, the dashed curve is scaled such that the total IGW or RW, respectively, energy in k = 1-320 is not changed. The solid vertical lines mark the crossing scales of the solid red and blue curves and correspond to k c of Table 2. The dashed vertical lines indicate the crossing of the dashed curves. the first and second rows of Figure 9, which show the original and new k c of each data set, respectively, proves that there is only a very slight narrowing of the spread in k-space. Instead, the main effect is to shift k c to larger values, which happens for all data sets, including ERA5. This effect is partly due to the re-scaling of the spectra to match the original integrated energy, and partly due to the straightening of the lines, i.e. making them follow strict power laws. To isolate the latter effect, which is a shape effect, we recompute k c by straightening the original spectra at k ≥ 6 using slopes computed from energies at k = 8 and k = 100. The "Shape" row of Figure 9 confirms that the resulting k c of ERA5 now lies in between the original k c and the one which resulted from the first set of modifications. For the simulations, the shapecorrection has either little effect or shifts k c to larger or smaller values, indicating that the simulated spectra differ in the details of their shape. In any case, neither shape nor slope can explain the spread in k c .
Next, we test the influence of offset by scaling the IGW curves such that their energy relative to the RW energy is the same as in ERA5 at k = 1-7. The resulting IGW spectra are shown in magenta in Figure 10. Correcting the offset makes the simulated k c cluster around ERA5's k c (Figure 9). Correcting the shape in addition to the offset does not further reduce the spread but broadens it. Thus, the remaining spread in the "Offset" experiment is mainly due to differences in spectral slopes.
In summary, k c is to first order controlled by the fraction of large-scale RW to IGW energy and to second order by a combination of spectral slopes and spectral shape.
The small year-to-year variability in k c found for the 40day period considered here in ERA5 suggests that the differences between the models are due to model setup.
The crossing scale appears to be sensitive to the boundary layer parameterisation: in ICON-nwp at ± 45°N L c ∼580 km (3D Smagorinsky scheme), the remaining ICON simulations have very similar L c ∼680 km (TTE/TKE scheme), SHiELD follows with 765 km (TKE), and the remaining models using either PDF-based closure (SCREAM) or K-closure (GEOS, IFS-9, IFS-4) have L c > 885 km. Note that it is the large scales that contribute the most to the integrated energy. This study does not assess the effect of the boundary layer parameterisation on the damping of the shortest waves resolved by the models.

SUMMARY AND CONCLUSION
In this study we intercompared the atmospheric energy spectra of eleven global kilometre-scale simulations, using the NMF decomposition method for distinguishing between the global balanced (Rossby wave; RW) and unbalanced (inertia-gravity wave; IGW) circulation. The 40-day simulations include five different global models with horizontal resolutions of less than 9 km.
Energy spectra averaged over a 40-day period include variability on longer time scales that can be considered negligible for all data sets except at the largest spatial scales. At synoptic and sub-synoptic scales, the spectra Figure 9 Original crossing scales and those of the modified spectra shown in Figure 8 and Figure 10. Symbols are vertically offset for visibility. The top row "Original" shows the crossing scales of the original spectra. The row labeled "Shape" (not shown in either Figure 8 or Figure 10) is the crossing scale that results from straightening the original spectra beyond k = 8, as done in Figure 10, but without correcting the offset. The row "ERA5 slopes" are the crossing scales marked by the dashed vertical lines in Figure 8. The row "Offset" corresponds to the magenta lines in Figure 10 and "Offset + Shape" to the black dashed vertical lines of Figure 10. are robust characteristics of the simulations. All simulations produce the expected canonical shape of the spectra. This is encouraging, given that the few remaining physical parameterisations restrict the number of ways in which a model could be tuned. Yet, there are significant differences in total energy levels, spectral slopes and spectral crossing scales.
Total wave energies differ by 21% percent among the simulations (excluding the sensitivity experiments with stronger vertical diffusion). Differences in IGW energy Figure 10 Sensitivity test of the crossing scale to large-scale energy offset and to shape. The solid red and blue lines repeat the RW and IGW, respectively, spectra of Figure 4. The magenta line drawn for the simulations is the IGW line offset to match the IGW/ RW energy fraction at k = 1-7 of ERA5. The dashed lines continue the red and magenta lines, respectively, beyond k = 8, but with a constant spectral slope. This slope is computed from the power difference at k = 8 and k = 100. The solid black vertical lines mark the crossing scales of the solid red and blue curves and correspond to k c of Table 2. The magenta vertical lines indicate the crossing of the red and magenta lines, and the dashed black vertical lines those of the dashed lines. ICON-sap+ levels reach 35%. Simulated total wave energies are 3%-30% greater than in ERA5, with RW energies exceeding those of ERA5 by 2%-35%. In contrast to the RW modes, IFS-4 is the only simulation with more energy in IGWs than ERA5 (10% more), while the other simulations have less IGW energy than ERA5 (-4% to -29%). The three ICON-vd* simulations agree very closely with each another, as does the continuation of the coupled ICON simulation with its 2020 counterpart (ICON-sap+ and ICON-sap). This suggests that spectral characteristics are closely linked to the model setup.
The partitioning of total energy into RW and IGW energies turned out to be the most important factor for determining the spectral crossing scale. Spectral slopes and deviations from theoretical power laws play a secondary role. The crossing scales of RW and IGW spectra vary considerably between the simulations. IFS-4 with k c = 24 (1179 km) is very close to the k c = 25 of ERA5. ICON-nwp has a crossing scale of k = 49 (578 km). With regard to k c , we observed that models with similar types of turbulence closure schemes have similar k c . There is no indication that the differences can be explained by different horizontal or vertical resolutions, or hydrostatic versus non-hydrostatic dynamics. Insensitivity to the latter choice is also reported by Zeman et al. (2021) for the IFS.
The impact of physical parametrisation on spectra and in fact the "spectra of physics tendencies" including turbulence have been illustrated in Malardel and Wedi (2016). It clearly shows the impact of sub-grid scale parameterisation on all scales, not just fine scales (cf their Figures 6 and 10), and the control exerted by the parametrisations on divergent motions. While beyond the scope of this paper (and in fact the data not being part of the DYAMOND portfolio) it would be interesting to compare the spectra of the physical tendencies from the different schemes.
Different crossing scales have important implications for the use of spatial averaging to decompose motions into background and waves, as is common practice in many applications. Our results imply that care must be taken when using such simple averaging for intercomparing storm-resolving simulations, even when their horizontal resolutions are nearly identical.
In contrast to integrated energies, changes in turbulence closure schemes do not seem to strongly affect spectral slopes at intermediate scales. At 8 ≤ k ≤ 50, the data sets agree well on both RW and IGW slopes and closely follow the canonical spectra. Towards smaller scales, the simulated RW spectra flatten while the IGW spectra steepen.
Despite their small contribution to the global horizontal plus potential available energy, the small scales are important for driving the global mean circulation. This is because they are associated with horizontally short gravity waves that, however, are associated with large vertical velocities and locally strong momentum flux. This effect is seen, for instance, in global simulations at 1-km resolution (Polichtchouk et al. 2021). The previous intercomparison study of DYAMOND models, which focused on a single height level (30 km) and gravity waves of 500-2000 km horizontal wavelengths, found gravity wave momentum flux amplitudes to differ by factors of 2-3 in the zonal mean (Stephan et al. 2019b). Present results support their conclusion that the strength of convection is a relevant factor for explaining these discrepancies. The models studied here produce the main features of the large-scale tropical precipitation patterns. However, particularly at large horizontal wavenumbers, there are substantial differences in the spectra of upper tropospheric vertical velocity, which is a good indicator for the strength of deep convection. Energy levels differ by factors of ∼3. High energies at small scales are mostly found in those models that do not use any convective parameterisation, which is expected (Müller et al. 2018;Stephan et al. 2019a;Polichtchouk et al. 2021). We showed that the simulation of convection, as represented by the slope of 200 hPa vertical pressure velocity ω, is important for shaping kinetic energy spectra. The spectral slope of tropical horizontal kinetic energy at 200 hPa at 50 ≤ k ≤ 180 associated with both RW and IGW modes is well correlated with that of ω. In case of RW modes, this is also true for the RW global energy spectrum. The problem of strong precipitation in isolated grid points or a group of grid points when not parameterising deep convection is well known. Among other model parameters, mixing, which is in turn sensitive to the turbulence scheme, has an impact on updraft strength. Stronger updrafts result in flatter sub-synoptic slopes of the vertical and horizontal kinetic energy spectra (personal communication with Tobias Becker, ECMWF). Limited area simulations in weather forecasting have investigated in the past when assumptions of 1D-turbulence break down. In the study by Honnert and Masson (2014) the critical horizontal resolution at which a 3D turbulence scheme becomes necessary is a function of the boundary layer height and the depth of the cloud layer. This would suggest a need to consider 3D turbulence at ≲ 1 km horizontal grid spacing, which matches practical experience in limitedarea weather modelling. However, even at sub-1 km resolution convection is still essentially single grid point biased and thus exposes a dominant vertical exchange of momentum in a 1D sense (Miyamoto et al. 2013).
If DYAMOND-type models shall be used for multidecade projections it is important to correctly represent convective gravity wave sources. A deeper discussion on making best use of observations for constraining small scales, including vertical velocities, is needed when touching km-scales. Notwithstanding the challenges of adapting the way convection is realised in models at km-scales, we believe that simulations in the greyzone of convection add valuable information. For example, the influences of differential heating, sloping terrain, and other topographic and land-use features on convective organisation are undoubtedly improved.

DATA ACCESSIBILITY STATEMENT
How to access the model output from the DYAMOND initiative is explained at the project website https://www.esiwace.eu/ services/dyamond-initiative. The IFS simulations are based on IFS cycle 47r1 (operational for NWP at ECMWF between 30/06/2020-11/05/2021