## Introduction

The climate system is a complex system characterised by turbulent dynamics. The time-energy spectra of instrumental and proxy climate data show a rich structure with energy cascades from timescales of millions of years to a few seconds and no spectral gaps (Lovejoy et al., **2001**). Moreover, atmospheric and oceanic motions feature specific characteristics which differentiate them from the homogeneous and isotropic turbulence of Kolmogorov (Pouquet and Marino, **2013**). Indeed, the rotation and stratification effects allow for an inverse energy cascade contributing to large-scale motions, such as the atmospheric planetary waves and ocean currents. The different components of the climate system – each with its own complex dynamics – further show a broad range of interactions. In this study, we will specifically focus on the interplay between the ocean and atmosphere. The former has slow characteristic timescales (up to thousands of years), while the latter has swifter temporal dynamics, with synoptic-scale features typically evolving over periods of days (Pedlosky, **2013**). These fast timescales limit our ability to predict the future evolution of the atmosphere: indeed, Lorenz (**1969**, **1982**) and Dalcher and Kalnay (**1987**) postulated a limit of mid-latitude weather predictability at 10–15 days. However, the ocean’s slow variability provides a possible predictability pathway beyond this range (Palmer and Anderson, **1994**; Baehr et al., **2015**; Vannitsem and Ghil, **2017**). This makes the study of the ocean’s low frequency variability (LFV) and its coupling with the atmosphere a topic of considerable scientific and practical interest.

The most famous example of ocean-led predictability is the alternance of El-Niño and La-Niña events and their effects on large-scale precipitation and temperature. This phenomenon has provided some of the earliest indications of the feasibility of annual and longer forecasts (Cane et al., **1986**). However, extracting the full predictability potential inherent to LFV features remains a challenge. Long instrumental time-series are scarce, and even reanalysis products only provide globally well-constrained data over the past few decades. Long-term reconstructions of coupled ocean/atmosphere variability must therefore rely on model simulations, documentary evidence or proxy data. The latter typically provide a time series representative of some feature of oceanic and/or atmospheric circulation on a regional or larger scale, with a time resolution of seasons to decades or longer (e.g. Bond et al., **2001**; Vinther et al., **2010**). This type of data is essential to verify that the coupled dynamics generated by climate models are compatible with those found in the real world.

An important question is whether it is possible to quantify the impact of the averaging procedure implicit in proxy records when performing such comparisons. In this paper, we address this question from a theoretical angle by using a conceptual coupled ocean-atmosphere model and investigating its dynamical properties. We apply dynamical systems theory to measure the dimensionality of the system, and compare the results for model output with a high temporal resolution versus a degraded dataset where the system is known only through long-term averages. This allows us to objectively quantify the modifications induced by the averaging. We conclude by applying our approach to a number of instrumental and reconstructed indices of large-scale climate modes and discussing the broader implications of our results for the analysis of climate data.

## A dynamical systems approach

Determining the attractor properties of complex systems has been a long-standing challenge in the field of dynamical systems theory. However, recent theoretical advances in our understanding of the limiting distribution of Poincaré recurrences now enable us to compute both mean and instantaneous (in time, and hence local in phase space) dynamical properties of complex systems. The framework is described in greater detail in Appendix A; the key finding is that, under suitable rescaling, the probability *p* of entering a ball in phase space centred on a point *ζ* with a radius *r* for Axiom A systems obeys a generalised Pareto distribution (Freitas et al., **2010**; Faranda et al., **2011**; Lucarini et al., **2012**, **2016**). To compute such probability, we first calculate the series of distances $\text{dist}(\zeta ,x(t))$ between the point on the attractor *ζ* and all other points *x*(*t*) on the trajectory. We then put a logarithmic weight on the time series of the distance to increase the discrimination of small values of $\text{dist}(\zeta ,x(t))$, which correspond to large values of $g(x(t))$:

The probability of entering a ball of radius *r* centred on *ζ* can now be expressed as the probability *p* of exceeding a threshold *q* of the distribution of $g(x(t))$. In the limit of an infinitely long trajectory, such probability is the exponential member of the generalised Pareto distribution:

*μ*and

*σ*are a function of the point

*ζ*chosen on the attractor. Remarkably, $\sigma =1/d(\zeta )$, where $d(\zeta )$ is the local dimension around the point

*ζ*. The attractor dimension $\u3008d\u3009$ can then be obtained by averaging

*d*for a sufficiently large sample of points

*ζ*on the attractor. Here, we use the quantile 0.98 of the series $g(x(t))$ to determine

_{i}*q*. We have checked the stability of the computed local dimensions against reasonable changes in the quantile ($0.95<q<0.99$). The universality of the convergence law implies that the above is akin to a central limit theorem of Poincaré recurrences. For further details, the reader is referred to Lucarini et al. (

**2016**). The computation of the dimension based on the extreme value theory has been successfully used to describe the evolution of sea-level pressure (Faranda et al.,

**2017b**) and geopotential height fields (Messori et al.,

**2017**) over the North Atlantic, as well as sea-level pressure, temperature and precipitation fields at hemispheric scale (Faranda et al.,

**2017a**). We provide the MatLab code for computing

*d*in Appendix B.

One of the advantages of this theory over other methods for computing the dimension of the attractor, is that it contains an inbuilt verification of the convergence toward the asymptotic dimension. This is performed by checking whether the sample distribution of exceedances matches the target exponential member of the generalised Pareto distribution. Here, we have verified that the 95% confidence interval of the shape parameters includes 0 (i.e. the exponential member of the Generalized Pareto Distribution (GPD)).

## Data and model specifications

### A conceptual atmosphere-ocean coupled model

The coupled ocean-atmosphere model we use here is the same as described by Vannitsem (**2015**). The atmospheric component is based on the vorticity equations of a two-layer, quasi-geostrophic flow defined on a *β*-plane, supplemented with a thermodynamic equation for the temperature at the interface between the two atmospheric layers. The ocean component is based on the reduced-gravity, quasi-geostrophic shallow-water model with the same first order approximation of the Coriolis parameter. The oceanic temperature is a passive scalar transported by the ocean currents, but it also interacts with the atmospheric temperature through radiative and heat exchanges. A time-dependent radiative forcing mimics the annual radiative input coming from the Sun at mid-latitudes. A low-order model version is built by truncating the Fourier expansion of the fields at the minimal number of modes that can capture the key features of the observed large-scale ocean and atmosphere dynamics. The truncation leads to 20 ordinary differential equations for the atmospheric variables, eight equations for the ocean transport variables, six equations for the temperature anomaly within the ocean, and two additional equations for the spatially averaged temperatures in the atmosphere and the ocean. The low-order model with a realistic range of parameter values displays a seasonal variability and, depending on the strength of heat and momentum fluxes between the ocean and the atmosphere (as controlled by the surface friction *C*), a LFV. For further details we refer the reader to Vannitsem (**2015**).

The parameter values used here are the same as in Figure 3 of Vannitsem (**2015**), except that we test other values of the friction coefficient *C* between the ocean and the atmosphere. Specifically, we consider four different runs, termed CD0002, CD0005, CD0007 and CD0008, corresponding to *C* = 0.002, 0.005, 0.007 and 0.008 kg m^{−2}s^{−1}, respectively. For the first three values, the dynamics is chaotic while for the last one it is quasi-periodic. The amplitude of the dominant Lyapunov exponent is decreasing as a function of *C*. The last two runs display LFV, while the first two do not. We retain 10,000 time-steps for analysis for all averaging windows considered here.

### Data

In addition to output from an idealised model, we also analyse a range of climate indices at different temporal resolutions. Specifically, we use daily NINO3 data provided by the NOAA climate prediction centre (Barnston and Livezey, **1987**; Reynolds et al., **2007**) over the period 01 January 1981 to 28 February 2018, monthly NINO3 data over 1854 to 2016 provided by Huang et al. (**2017**) and a yearly NINO3 dataset over the period 1049–1995 provided by Mann et al. (**2009**). We further analyse daily NAO data provided by the NOAA climate prediction centre (Barnston and Livezey, **1987**; Reynolds et al., **2007**) over the period 01 January 1981 to 28 February 2018, monthly NAO data over 1854 to 2016 provided by Jones et al. (**1997**) and yearly data over the period 1049–1995 (Trouet et al., **2009**).

## Dynamical implications of time-averaging

We begin by analysing the dependence of the phase portraits on the time-averaging of model output. To depict this we have to choose three of the 36 modes of the model to represent the attractor on a Poincaré section. We choose modes ${\psi}_{o,2},{\theta}_{o,2}$ and ${\psi}_{a,1}$ (Fig. 1). These are the dominant modes of the coupled ocean-atmosphere dynamics as discussed in detail in Vannitsem et al. (**2015**). Specifically, ${\psi}_{o,2}$ represents mass transport in the ocean; ${\theta}_{o,2}$ the meridional temperature gradient within the ocean and ${\psi}_{a,1}$ the amplitude of the atmospheric zonal flow. We consider a run with no LFV (*C* = 0.002, Fig. 1a,c,e,g) and a run with a marked LFV (*C* = 0.007, Fig. 1b,d,f,h). The colour scales show the values of the local dimension *d* (for readability each panel has a different colour scale). The effect of averaging depends both on the chosen averaging period and on the ocean-atmosphere coupling. The daily portraits show quasi-periodic cycles, associated with the annual cycle present in the system, in both simulations (Fig. 1a,b). These are partly destroyed by the monthly averaging (Fig. 1c,d). Longer averaging periods rapidly smooth all structures in the phase portraits of the no-LFV run (Fig. 1e,g), so that the Poincaré section looks like that of a noisy fixed point in three dimensions. For the LFV run, the slow signal associated with the ocean dynamics survives the averaging procedures, and is still evident after an 8-year averaging. We further note that at sub-annual time scales (Fig. 1a–d), the local dimension is in general higher during the winter period, i.e. when ${\psi}_{a,1}$ displays large values. We will return to this point below.

A more quantitative analysis of the changes in the attractor properties under averaging is reported in Figs. 2 and 3, which present the mean values and distributions of *d* for all *C* and averaging periods. Table 1 reports the values of the first four moments of the distributions. The first remarkable feature is the non-monotonic behaviour of the dimension with the averaging window. A naive hypothesis would be that, independently of the coupling, one might observe a decrease of the dimension with increasing time-averaging, since for infinite averaging periods a fixed point is reached with a dimension equal to 0. Indeed, averaging should suppress degrees of freedom as it suppresses information. However, this is only true for averaging periods larger than or equal to 1 year, for which the seasonal cycle is averaged out. Indeed, all four simulations analysed here show non-monotonic behaviour for shorter averaging periods. This feature reveals that the filtering through averaging tends to modify the frequency of specific categories of local dimensions. The analysis of the probability distribution functions (pdf) of *d*, shown in Fig. 3, provides further insights on this behaviour. Taking *C* = 0.002 as example, one can see a clear shift of the pdfs toward larger values in going from daily to monthly values. On the contrary, averaging beyond a 1-year window suppresses the extreme *d* values in the tails of the pdfs, which corresponds to a smoothing of the variability of the dimension, thus lowering $\u3008d\u3009$. This is particularly evident for the case of *C* = 0.007 and *C* = 0.008 (LFV runs) and should be expected since there is a smoothing of the variability on the attractor (Fig. 1b,d,f,h). This smoothing removes specific frequencies in the dynamics, as discussed in details in Nicolis and Nicolis (**1995**) and Vannitsem and Nicolis (**1995**, **1998**), and also reduces the local variability of the instability properties of the flow.

The pdfs further highlight the fact that, in some cases, the mode of *d* remains roughly constant but the positive tails of the pdfs change radically. This suggests that a decrease in $\u3008d\u3009$ due to averaging might change little in the system’s ground state while altering the configurations with the largest number of degrees of freedom.

A further notable aspect is that at monthly and seasonal scales the pdfs of *d* display a double peak for runs both with and without LFV (Fig. 4). This double peak is associated with the seasonal variability; there is a dominance of large *d* in winter and low *d* in summer. For instance for *C* = 0.005, there is a maximum around *d* = 8 for the winter conditions and *d* = 4 for summer conditions (Fig. 4). To interpret this feature one must recall that the large-scale winter dynamics in the mid-latitudes is driven by a larger gradient of equator-to-pole radiative input than in summer, with strong implications for the properties of the flow (Buizza and Palmer, **1995**; Goosse, **2015**; Vannitsem, **2015**, **2017**). This is also a property of the coupled ocean-atmosphere model used here, which displays lower averaged local Lyapunov exponents (and averaged local Lyapunov dimensions) in summer than in winter (Vannitsem, **2017**). The technique we adopt here successfully captures these variations in the complexity of the dynamics, as also shown by Faranda et al. (**2017a**, **2017b**) using reanalysis data.

One can further ask whether $\u3008d\u3009$ are determined predominantly by the oceanic or the atmospheric modes. To answer this, we compute the local dimensions of the oceanic and atmospheric components separately. This implies that we build two separate state vectors, one containing only oceanic modes (* _{o}*) and one containing only atmospheric modes (

*). The results are shown in Fig. 5 for different averaging periods. Except for the*

_{a}*C*= 0.007 daily and yearly cases (Fig. 5a,d), the atmospheric modes alone return almost the same value of the total dimension as the joint calculation. The analysis of the ocean variables instead gives a lower dimension, especially for short averaging periods. This likely reflects the fact that, although the ocean variables are coupled to the atmosphere, they only retain part of the complex structure of the system, in particular for the low values of

*C*. In our view, this equates to the dynamics in the ocean being only ‘weakly’ driven by the chaotic variability present in the atmosphere for small values of

*C*, due to the former’s large inertia that integrates the atmospheric forcing on long time scales. For large values of

*C*, LFV develops and this effect is considerably weakened; variables from both components then provide comparable results, especially for long averaging periods. The difference in behaviour between

*C*= 0.007 and

*C*= 0.008 can be ascribed to the fact that the first still gives a chaotic attractor, whereas the second is characterised by a quasi-periodic flow.

## Implications for ocean-atmosphere coupling and conclusions

In the present study, we have investigated the effects of time averaging on the ocean-atmosphere system as represented by a conceptual coupled model. The impact of averaging is quantified in terms of changes in the attractor properties of the system. When the averaging period is increased, the local dimension initially shows a non-monotonic behaviour, but ultimately decreases for windows longer than 1 year. For these averaging windows, the pdf of the local dimension becomes closer to Gaussian and the variability decreases. This corresponds to a progressive smoothing of the attractor. Time-averaging therefore has profound and sometimes counter-intuitive implications for the dynamical characteristics of climate data. Our results also suggest that, on longer time scales, the climate dynamics is smoother and closer to that of homogeneous, hyperbolic systems.

It is however necessary to verify whether the results from the idealised model presented above find a match in real-world data. Hence, we repeat our analysis for El Niño-Southern Oscillation Nino3 and North Atlantic Oscillation (NAO) indices. As a caveat, we note that this analysis has an important difference from that of the full coupled model. Indeed, the NAO and Nino3 indices do not represent the full climate attractor but can be interpreted as a projection (a special Poincaré section) of the full dynamics. In this sense, the analysis can still inform us on numerous aspects of the system (see for example Faranda et al. (**2017c**) for a similar argument on the Von Karman turbulent swirling flow). A separate problem to consider is the length of the time series, as our method for computing the local dimensions is dependent on processing a sufficiently long series. The shortest time-series we analyse are the yearly ones, for which we only dispose of 947 years; monthly and daily timeseries provide more data points. We therefore perform two different computations of the dimension: (1) for each dataset and temporal resolution, we use all available data points; (2) for each dataset and temporal resolution, we only use 947 data points. This provides some indication of the robustness of our conclusions. The results are reported in Fig. 6. The top panels show the box-plot of the local dimension pdfs when all the data are considered, whereas the lower panel presents the average dimension for the two cases described above. The box-plots suggest that the medians of *d* are relatively stable, while the extremes change with the averaging period considered. For the yearly time series, we obtain values of *d* up to 10. This may seem non-physical since we are only analysing two time series but, again following Faranda et al. (**2017c**), can be understood by considering the role of small-scale turbulence in increasing the effective dimension of the attractor. Sampling issues may be discarded because both the full and reduced datasets show comparable relative changes in extremes between the different temporal resolutions (not shown). This is slightly different from what seen in the idealised model, where for runs with weak atmosphere-ocean coupling both the extremes and the medians of *d* increased with increasing averaging period (cf. Table 1, Fig. 3a,b). Finally, the slightly non-monotonic behaviour of the average dimension for the climate indices follows the one found in the coupled model for weak coupling cases (cf. Figs. 2 and 6d).

We therefore conclude that the inferences drawn from the conceptual model, while presenting differences from the analysis of climate indices in both the interpretation of the data itself and the results of the analysis, can nonetheless provide some valuable insights into the behaviour of real-world climate data. The results discussed here should be taken into account when performing dynamical analyses of data with low temporal resolutions. The same approach implemented here could be fruitfully applied to more sophisticated climate models, in order to clarify the increase in variability of the local dimension with increasing averaging time found in Fig. 6. For this application, very long (control) runs should be considered in order to have enough data for the extreme value analysis, under the assumption that climate models can correctly reproduce the internal variability of the climate system on multiple timescales.