## Introduction

The nature of low frequency (periods longer than a year or two) global ocean physical variability remains somewhat obscure with much of the difficulty lying with the observational data sets. Quasi-global coverage by altimetry is now almost 30 years long, but it depicts the surface elevation field – and whose interpretation reflecting motions at depth, including some troubling aliases, can be complex (Wang et al., **2013**; LaCasce, **2017**). The Argo float upper ocean (1–2 km) profile temperature field becomes nearly global about 15 years ago. Theoretical estimates of oceanic adjustment and variability time-scales (e.g. Wunsch, **2015**, p. 338) cover a continuum at least out to 10,000 years and interpretations of the paleoclimate record support an hypothesis of change out to the entire age of the ocean (Cronin, **2010**). The 20–30 year quasi-global record that does exist already raises some interesting questions about how to interpret it, and its implications for such common assumptions as the quantitative existence of a meaningful ‘climatological ocean’.

Common physical experience suggests that as the time scales of numerous natural phenomena increase, the spatial scale also increases. Implicit in many oceanographic studies is an assumption that as the high frequency ‘noise’ is averaged out over *τ _{avg}*, that the physics will simplify into the large-scale elements of the steady-state theories, perhaps being slowly time-dependent.

One difficulty is the implicit ambiguity in much oceanographic dynamical theory. Almost all useful and interesting theoretical circulation constructs (ranging from the thermal wind, through Sverdrup balance, Stommel-Arons flow, and the various thermocline theories) are written for a steady-state ocean in a laminar-like state (textbooks by Pedlosky (**1996**), Huang (**2010**), Vallis (**2017**), Klinger and Haine (**2019**), and others). The observed ocean is known to be filled with an intense, comparatively short-time-scale, turbulence-like motion having very strong spatial dependences in structure and time-scale. Apart from an extensive literature on the role of geostrophically balanced eddies (McWilliams, **2008**), particularly as elements in parameterizations for mixing-like processes, little seems simple about the relationship between the physics of a hypothetical steady-state ocean, and one whose physical elements (velocity, temperature, pressure, etc.) have been averaged over a time-interval ${\tau}_{\mathit{avg}}>1$ yr. No spatial spectral-gap has ever been identified and no long-period cutoff is known, e.g. for time scales of the balanced (mesoscale) eddy field. All spatial and temporal scale interactions probably affect thermal variability patterns out to the longest accessible time intervals.

As global oceanic data sets and global general circulation models with quantifiable skill have emerged over the past few decades, the conclusion that the ocean has a very strong regional and temporal complexity has become inescapable. For example, Sonnewald et al. (**2019**), using a vertically integrated vorticity balance, divided the ocean laterally into 6+ distinct dynamical-type regions, of greatly varying area. Similarly, the altimetric wave number power-law results of Xu and Fu, **2012**) seem to imply a minimum of about 14 dynamical regimes; cf. Hughes and Williams (**2010**). Sverdrup et al. (**1942**) identified 45 distinct topographic districts based upon the bathymetry known at that time.

An increasing complexity with averaging time, *τ _{avg}*, is readily rationalized: as

*τ*grows, the high-frequency, high-wavenumber variability that dominates the flow fields is further suppressed, permitting the zero-frequency structures of continental boundaries, bottom topography, and large-scale long-lived spatial inhomogeneities in air-sea exchanges, to emerge. Disturbances such as those owing to the mid-ocean ridges do not diminish with time, but can be manifested as strong spatial structures in both the mean and low-frequency variability over

_{avg}*τ*.

_{avg}Upper ocean variability over decades is comparatively well-observed, including the ENSO cycle, and sea surface temperatures, although problems persist (e.g. Chan et al., **2019**). For the heat uptake estimation context, see the reviews by Abraham et al. (**2013**) or Meyssignac et al. (**2019**). In contrast, the goal here is to discuss the nature of full water-column low-frequency variability and its seemingly intricate spatial structure. Fields known to satisfy basic conservation equations as well as a very wide-variety of data, both meteorological and oceanographic are used (see Zanna et al., **2019**) for discussion of the full-water column over a much longer time interval, and Wunsch, **2016**). Much of what follows is descriptive, and is intended as an experiment in the depiction of decadal global variability using much-reduced volumes of values – and such that the result is comprehensible. The focus here is on the apparent spatial structures consistent with dynamics known from a self-consistent general circulation model and the diverse data, avoiding the issues forcefully raised by Bengtsson et al. (**2004**).

## The estimated state

As a provisional estimate of oceanic variability and its average, we use the ECCO version 4 release 4 (v4r4) ‘state-estimate’ as documented by Fukumori et al. (**2017**, **2018**, **2019**) in the 26-year period 1992–2017. This depiction comes from a nonlinear weighted least-squares fit of the MITgcm (Marshall et al., **1997**) and its evolutionary ECCO successors, to the diverse near-global data sets and meteorological forcing estimates that became available during and after the World Ocean Circulation Experiment. The single most important feature of this representation is that the model is *free*-running, but with its numerous control parameters having been previously adjusted so that the model trajectory takes it through all of the data points within (mostly) estimates of their uncertainties. The state estimate is represented on a geographical grid of $6.4\times {10}^{6}$ positions, including 50 different vertical layers; 26-yearly averages raises that number to $1.7\times {10}^{8}$ grid values. As the references make clear, the system includes a variety of parameterizations including those for the effects of the unresolved eddy field, and the special physics of the surface layer. Although the reliability of these parameterizations has been and continues to be much debated, the use of 26 years of data leads to adjustments in the parameters so that the state vector reproduces within estimated errors (a difficult subject in itself), the great bulk of the global data sets.

The free-running configuration is important because the results then satisfy basic *global conservation* of mass, energy, vorticity, etc. up to numerical accuracy – a feature not generally true of estimates commonly labelled ‘reanalyses’ (Bengtsson et al., **2004**). Adjusted control parameters include the initial state, the meteorological forcing/coupling fields, and interior values of parameterized mixing rates. In terms of volume of data, the result is dominated by altimetry, Argo float profiles, instrumented elephant seals, and 6-hourly meteorological reanalysis products, all weighted by estimated errors. In contrast with earlier decades, by 1992 the observation system had become qualitatively more homogeneous and dense – reducing the tendency for changing observation methods and patterns to produce artifacts in the state. Direct observation of the deep ocean (below about 2000 m) remains very thin, and the values calculated here rely heavily on the known physics in the model connecting the upper and lower oceans while still accounting for whatever deep CTD temperature and salinity data that are available. References should be consulted for a full discussion. Only the temperature field is described here – because it has serious climate-change implications, has the longest records albeit at scattered points, and is observationally relatively intuitive.

The state estimate includes the full global ocean including ice-covered regions, and spans the entire 26 years of the estimate. (Comparisons – not shown – with an earlier 20-year climatology, omitting first and last years, and using only latitudes equatorward of 60° Fukumori et al. (**2018**) showed only small changes in the results.) Annual averages, which suppress much high frequency variability, including the spatially complicated annual cycle of temperature, are used as a first-step in reducing the volume of numbers required to describe the ocean – a significant reduction compared to a 1-hour or smaller model time-step. A second major data volume reduction is obtained by working with the column average annual temperatures – an unorthodox variable, but nonetheless one describing a major variability aspect.

Figure 1 shows the model topography, whose numerous deep sub-basins are readily apparent, each having a physics determined by details such as proximity to large-scale boundaries, topographic slopes, presence of bottom roughness, latitude, structure of meteorological exchanges, etc. and which can be expected to generate full water column disturbances, not necessarily localized, as averaging periods increase.

Fourier spectra in space and/or time have proved in many ways to be the most productive representation of geophysical variability. An approach to the frequency-wavenumber spectrum was described by Wortham and Wunsch (**2014**). On the other hand, such descriptors are most powerful when fields are statistically stationary (in the ‘weak-’ or ‘wide’-sense), that is homogeneous in space and time. In the present situation, the spatial inhomogeneity is so strong that the estimated Fourier results are not discussed. Temporal non-stationarity is less obvious, but the trend-like behavior seen below in some elements would suggest that it is present as well.

The nature of the temperature field can be seen from the 1994 potential temperature average shown in Fig. 2 for 477 m and its temporal standard deviation in Fig. 3. That particular depth was chosen as a compromise generally lying below the mixed-layer, but above the main thermocline. As has been realized since oceanographic antiquity, little or no structure arising from the huge mid-ocean ridges and other topographic features can be perceived in the upper ocean levels. Apart from the continental margins, inferring topography from such a picture would fail. Inferences about large-scale wind structures are, however, possible. Determining an accurate spatial mean temperature from in situ measurements requires a large number of values, appropriately distributed, even were there no underlying temporal changes. Notice that at this depth, no particular excess variance is seen in the equatorial Pacific Ocean.

## Vertical average temperature

If $\varphi ,\lambda ,z,t$ are latitude, longitude, depth and time respectively, the underlying field is potential temperature, $T({\varphi}_{p},{\lambda}_{q},{z}_{r},{t}_{m}),$ on the model space-time grid where the subscripts are integers. The time mean vertical average temperature, ${\overline{T}}_{G}({\varphi}_{p},{\lambda}_{q}),$ is depicted in Fig. 4, the overbar denoting the 26-year time average. Some visual covariance with the bottom topography appears, as expected, with shallow regions necessarily being warmer – absent the deep cold water (the point pattern-correlation is 0.25). Most of the rest of this paper explores the temporal structure of the vertically averaged temperature as an ocean descriptor – with the understanding that ultimately every point (varying with latitude, longitude, depth, and time) is likely quantitatively unique. The resulting standard deviation of annual values with depth in the global spatial average can be seen in Fig. 5, with its near-surface domination.

Water-column average temperature anomaly averaged also by year, ${T}_{G}({\varphi}_{p},{\lambda}_{q},{t}_{m}),$ where *t _{m}* is the year, is the present focus.

*T*is interesting as being directly related to annual oceanic heat uptake, a subject not explored here. From the above references, the results below are expected to be dominated by the upper ocean, as indicated by the global average and time-depth variance of temperature (Fig. 5), noting however, that the upper ocean variability is partially compensated in the vertical average by the much-thicker model deep layers. (These particular values are not corrected for horizontal area changes in the state estimate model grid.) Vertical averaging also reduces the lateral inhomogeneity of the temporal variations.

_{G}A qualitative sense of the spatial variability in latitude and longitude of the anomaly over the full water column through time can be seen in Figs. 6–8. Some effects of basin-scale topography are visible, particularly in the Atlantic, southeast Pacific, and western Indian Ocean. Particularly striking is the anomaly for 1998, the midst of an El Niño episode, with its intense equatorial features that are not apparent in the two other years shown. Space-time sampling requirements are visibly daunting. Some elements of these figures are associated with the abyssal topography, but any such effect is subtle, and the underlying dynamics is dependent not just upon the water depth, but also its two-dimensional gradients, at least.

## Singular vectors, principal component analysis, EOFs, etc

Description and interpretation of any such field becomes an exercise in various forms of matrix decompositions. A brief discussion is provided below to obtain a notation and to lay out the necessary assumptions. Two of the best-known methods of reducing space-time patterns to the simpler representations are via principal oscillation patterns (POPs), and singular vectors (SVs), the latter often known as empirical orthogonal functions (EOFs) or principal component analysis (PCA). Both methods are fully described, e.g. by von Storch and Zwiers (**2001**). In practice here, only the second method proves to be effective at dimension reduction, and the POP discussion is consigned to the Appendix.

The three objects in the title of this section are basically the same. A singular value decomposition terminology will be used, as it appears to be the most nearly generic.^{1} Textbook discussions can be found in Jolliffe (**2002**), Lawson and Hanson (**1995**), von Storch and Zwiers (**2001**), Wunsch (**2006**) and in Jolliffe and Cadima (**2016**), amongst others.

Consider any two dimensional matrix **M** where each column represents variations over space coordinates, and with time varying between columns, e.g. ${\mathbf{M}}_{r}(\mathbf{r},t).$ The subscript *r* is used to show that the position, ${r}_{j},$ is an element of a vector, rather than being, e.g. latitude, and longitude separately as in **M**. Using the Eckart-Young-Mirsky theorem,

*K*, is less than or equal to the smaller of the number of rows and columns in ${\mathbf{M}}_{r}$ (in this paper, maximum dimension (rank) is limited by the number of years used,

*K*= 26). Each column, ${\mathbf{u}}_{rj}$ of the

*K*×

*K*orthogonal matrix ${\mathbf{U}}_{r}$ varies with the geographical location.

^{2}Columns, ${\mathbf{u}}_{rj},$ are readily remapped back to ordinary geographical space. Each column, ${\mathbf{v}}_{rj},$ of the $K\times K$ orthogonal matrix ${\mathbf{V}}_{r}$ is the time-history of the particular spatial pattern in ${\mathbf{u}}_{rj}.$ All vectors have a norm of 1 and the units of ${\mathbf{M}}_{r}$ are in ${\mathbf{\Lambda}}_{r}.$ (Because the SVD is used in several different ways here, Table 1 lists the differing labels.)

Useful extensions of singular vector analyses are possible, including application in the frequency domain. For example, Wills et al. (**2018**) discuss separating atmospheric patterns by forcing decoherence of large-scale variability. They did however, employ a 114-year long record, something far beyond what is oceanographically available.

## Vertical structures

Before describing the results for the vertically integrated field, it is useful for the reader to recognize some of the vertical structures that are being suppressed by integration and subsequently ignored. At each lateral position ${r}_{j},$ the potential temperature *anomaly* is a two-dimensional matrix of depth and year, $T({r}_{j},{z}_{k},{t}_{p}),$ with a singular value decomposition,

*K*= 25 because the vanishing temporal mean reduces the degrees of freedom by 1 from the 26 years available. Here, position

*r*is only parametric in the SVD with the independent variables being depth and time. Anomalies are calculated relative to the time-mean at each point and depth. Note the similarity to the results of Pauthenet et al. (

_{j}**2019**), but with their analysis is confined to the upper ocean and including salinity changes.

The SVD is here applied to the layer-depth-weighted potential temperature anomalies, a weighting that much reduces the vertical variance changes (Fig. 5) without removing it altogether. SVD elements are labelled ${\mathbf{u}}_{dj},$ etc. Because temperature structures in the near-surface boundary layer are dynamically distinct, the depth range in Eq. (2) is applied only to the region below the 10th layer, 105 m and greater. Fig. 5 shows the global average singular values as well as the fraction of the global average variance in each ${\lambda}_{j}^{2}/{\sum}_{q=1}^{25}{\lambda}_{q}^{2}.$ In the global median, the first three singular vectors describe more than 95% of the vertical variance – although the nature of those singular vectors varies widely over the ocean. Their global means, denoted by a bracket, $\u3008\xb7\u3009,$ are shown in Fig. 9. A quasi-linear trend is visible in $\u3008{\mathbf{v}}_{1}({t}_{p})\u3009$ representing an overall trend (net warming) in this mode. The corresponding $\u3008{\mathbf{u}}_{d1}(z)\u3009,$ truncated in the figure, reaches a maximum at a depth of 300 m. But note that even so, a non-negligible cooling exists at depth both in this and other modes. A sign convention has been imposed such that a positive near-surface (105 m) value in ${\mathbf{u}}_{dj}({z}_{p})$ corresponds to a warming when the corresponding ${\mathbf{v}}_{dj}({t}_{p})$ is increasing.

To show the variety of lowest modes, Fig. 10 depicts a sampling along 180° W in the Pacific Ocean. Vertical structures vary from having no zero crossings to having two or more.

Let *n _{z}* denote the number of zero-crossings in ${\mathbf{u}}_{d1}$ below 100 m. If ${n}_{z}=0,$ temperature change is of one sign with depth, and might be associated with a simple vertical heave or with the first linear baroclinic mode, although such an association is a very loose one. Similarly, ${n}_{z}=1,$ might be associated with the 2nd linear baroclinic mode, with one sign change with depth.

Fig. 11 displays the histograms of the number of zero crossings in all of the singular vectors, and of the number of zero-crossings in ${\mathbf{u}}_{d1}$ alone. The most common mode is that with no zero-crossings, and it constitutes approximately 40% of the lowest mode. That is, about 40% of the ocean area is described (insofar as the corresponding ${\lambda}_{d1}$ dominates) by a pure vertical heave of the isotherms (see the discussion in Bindoff and Mcdougall, **1994**). But globally, the structure is vertically very complicated in these terms, likely because of the large variety of topographic interaction physics. Fig. 12 shows the horizontal distribution of the variance accounted for by the lowest SVD pair, ${\lambda}_{d1}{\mathbf{u}}_{d1}{\mathbf{v}}_{d1}^{T},$ with the caveat that the structure of ${\mathbf{u}}_{d1}$ varies spatially. Apart from the eastern North Pacific, the western North Atlantic and the southern Indian Ocean, the lowest pair, of differing structures, accounts for 50% or more of the variability. An explanation of why those regions appear exceptional is not attempted here. Even there at least 40% of the variability is accounted for.

## SVD of vertical average temperature

Now mapping ${\varphi}_{i},{\lambda}_{k}$ onto a two-dimensional vector position, *r _{j}*, the result, ${\overline{T}}_{G}$${}_{r}({r}_{j},{t}_{p}),$ is a matrix with averages over a time-interval of 1 year. Water-column average potential temperatures anomalies relative to the overall 26-year time mean are computed at each grid point with each matrix column defining those temperatures at all lateral locations at one time,

*t*. Fig. 13 displays the ordered squared singular values and their normalized sum squared from the water depth average anomalies. Evidently, reproducing 90% of the space-time variance of ${\overline{T}}_{G}$

_{p}*requires of order 13-orthogonal patterns, with moderate domination of the lowest vectors. As many references remind readers (e.g. von Storch and Zwiers,*

_{r}**2001**, p. 294), the singular vectors do not necessarily have any dynamical mode significance, whereas the POPs (Appendix), can.

Patterns of the first 3 ${\mathbf{u}}_{Gi}$ are displayed in Figs. 14–16. Here the strongest pattern, accounting for about 25% of the 26-year variance, has a time dependence, ${\mathbf{v}}_{G1}(t)$ that is dominated by an overall visual near-straight line trend. It corresponds to the similar behavior of ${\mathbf{v}}_{1}(t)$ The same convention is made here that positive regions of ${\mathbf{u}}_{Gi}(\varphi ,\lambda )$ increase (warm) with increasing ${\mathbf{v}}_{Gi}(t)$ and corresponding negative regions decrease. The North Atlantic produces the most complex response, with strong cooling adjacent to strong warming including the Gulf of Maine region (e.g. Piecuch et al., **2017**). Evidently, the global pattern of full-water column temperature change, while quasi-linear in time, is made up of the difference between large-area cooling and warming regions. ${\mathbf{u}}_{G2}(\varphi ,\lambda ){\mathbf{v}}_{G2}{(t)}^{T}$ has a maximum following the 1997–1998 El Niño event, with an equatorial structure strongest in the western Pacific Warm Pool, but one also connected to a global pattern of change. That and the 3rd singular vector (Fig. 16) also contain major elements in the northern North Atlantic, supporting an hypothesis that the intricate changes there are the result of several different physical mechanisms and consistent with the complexity of the interannual changes in Figs. 6–8.

That a net warming of the water column average temperature has occurred over 20+ years is no surprise (Meyssignac et al., **2019**; and numerous others). Overall warming of the South Atlantic in ${\mathbf{u}}_{G1}$ is striking. Note the inference (e.g. Gebbie and Huybers, **2019**) that considerable parts of the deep ocean are still cooling from long-ago meteorological changes, superimposed here on fluctuations from ordinary internal variability at all depths with mixed signs.

The use of a vertical average temperature anomaly mutes the known strong variability at fixed depths. ${\mathbf{u}}_{G3}$ (Fig. 16) does contain a long zonal feature near the equator in the Pacific Ocean – likely a small residual of the strong ENSO cycle manifest there in the upper ocean. On the other hand, ${\mathbf{v}}_{G2},$ the temporal coefficient of ${\mathbf{u}}_{G2},$ shows a clear maximum value during the ENSO episode of 1998. It too has an equatorial Pacific feature, although not conspicuously stronger there than elsewhere.

### Uncertainties of the SVD

For singular vectors corresponding to nearly equal singular values, it is well-known that an instability exists of the detailed structures contained in the two sets of vectors. Thus, Jolliffe (1986, p. 39), the asymptotic properties of the singular values *λ _{i}* and ${\mathbf{u}}_{Gi}$ can be obtained if subject to a comparatively simple hypothesis about the statistical distribution of the field covariance, and then

*λ*, ${\mathbf{u}}_{i}$ are normally distributed. When the results here were compared to those from a shorter 20-year interval (ECCOv4r3), changes were only slight.

_{i}## Summary and discussion

An unorthodox use is made of the column-average vertical potential temperatures in the ocean to reduce the numerical volume, and thus describe the bulk ocean potential temperature variability over 26 years. These structures are computed from an energy, mass, etc. conserving state estimate. The state estimate used here is not truth: like any statistical best-estimate it has intrinsic stochastic and systematic uncertainties. Using singular vectors (SVDs or EOFs) requires about 13 globally uncorrelated vector pairs in the horizontal domain for the water-column vertical average temperature. The spatial structure vectors, ${\mathbf{u}}_{Gi},$ (notation defined in Table 1) markedly display the complex of time scales and structures of the northern North Atlantic – emphasizing the links between purely local and global-scale changes.

In the vertical domain, with a separately computed decomposition, a smaller number of uncorrelated vector pairs is required. Approximately 40% of the dominant (lowest) mode of vertical structure can be described as analogous to the zeroth dynamical mode (vertical heave), but overall vertical structures are complicated, in large-part probably because of the great variety of potential topographic interactions. An analysis of principal oscillation patterns, POPS, of the vertical mean potential temperatures (in the Appendix) is not successful in measurably reducing the number of required structures.

With extended duration, the geometry of oceanic boundaries, including the sea floor topography, and regional long-lived meteorological elements (e.g. regions of excess curl), begin to emerge from the background noise, giving rise to kinematically and dynamically distinct provinces. Analytic understanding of low-frequency thermal variability takes on a strongly regional character, while remaining embedded in, and subject to, larger basin and global scale changes. Results here for the northern North Atlantic Ocean, consistent with previous regional analyses, are a particularly clear example of geographical variation. Any dynamical assumption, implicit or otherwise, that the time-average ocean has a simplified large-scale behavior, needs to be justified observationally. The averaging time, *τ _{avg}*, required for that to be rigorously true, is likely much longer than 26 years, but whether it is 100 or 1000 years, or if it is valid at all, remains obscure and awaits either much longer observational records, or analysis of demonstrably skillful high resolution circulation models. Wunsch (

**2018**) and Meyssignac et al. (

**2019**) address some of the sampling issues associated with persistent spatial patterns in thermal fields. These consequences are not described here.