## Background

The topic of the large scale flow dynamics has attracted the attention of atmospheric and climate scientists since the middle of last century, for example, Rossby (**1940**), and has led to the emergence of, yet undergoing debate, two main lines of thought, namely that extratropical tropospheric planetary wave behaviour is essentially linear or that it is fundamentally nonlinear. Linearity involves a parametrisation of the effect of small scales and nonlinearity in terms of noise mostly represented by a white noise forcing (Penland and Sardeshmukh, **1995**; Toth, **1991**; Nitsche et al., **1994**). At the heart of nonlinearity lies essentially the theory of multiple equilibria, which was first mentioned by Rossby (**1940**) and given prominence in a seminal paper by Charney and Devore (**1979**) and Wiin-Nielsen (1979), see for example Sura and Hannachi (**2015**) for details. In probabilistic terms, this paradigm may suggest ideally a multimodal probability density function (PDF) of the prominent pattern amplitudes.

Though unrealistic, simple conceptual models, such as the low-order chaotic Lorenz (**1963**) model, can help shed light on some key features of the atmospheric dynamics (Palmer, **1993**). Figure 1a shows the model trajectory in the (*x*, *z*) plane with its two butterfly wings around the metastable fixed points. The kernel PDF of a long simulation of the model clearly reveals the bimodal structure of the attractor where the modes are located at or near the metastable fixed points. The neighborhood of the modes represents regions of small tendencies when the trajectory slows down and yields hence an increase of the occurrence frequency as shown in Fig. 1b.

While linear models have been used successfully to explain teleconnection patterns the concept of multiple equilibria has a number of attractive features. It can explain the recurrence phenomenon of weather patterns, also referred to as weather regimes, in addition to mirroring low-order chaotic models that have been used as a simple analog of the midlatitude atmosphere (Ambaum, **2008**). It has also been put forward that nonlinear models can be used to extend the limits of atmospheric predictability (Koo et al., **2002**).

These weather regime states are meant to mirror metastable fixed points of the full nonlinear equations hence mimicking the behavior of tropospheric large scale variability. This would mean that these metastable points will necessarily attract intermittently the system trajectory hence triggering the recurrent behavior and accounting for the longer residence time in their vicinity compared to neighboring flows, within the large-scale state space, for example Legras and Ghil (**1985**), Haines and Hannachi (**1995**), Hannachi (**1997**a), Hannachi (**1997**b), see also Hannachi et al. (**2017**) for a detailed review. One further consequence of this is that it should also yield higher probability of occurrence of the system trajectory in the vicinity of those metastable or quasi-stationary states in a similar manner to the simple low-order Lorenz model (Fig. 1). We note here that we do not promote the Lorenz model as a real paradigm for atmospheric regimes. We only used it as a conceptual model to show that low tendency (or persistence) is generally associated with high PDF (high occurrence frequency), an argument that is generally true irrespective of the existence of multimodality (this is true even with a unimodal distribution). It is true, however, to say that this issue is still debatable, and there is no consensus on the existence of multiple equilibria, see for example Hannachi et al. (**2017**), Stephenson et al. (**2004**) and Ambaum (**2008**).

An appropriate way to identify and quantify signatures of nonlinearity is by studying the system trajectory projected onto a low-dimensional state space. One way to achieve this is by considering the local mean phase space tendencies, which can be compared to those resulting from linear stochastic dynamics as in Hannachi (**1997**a,b) and Branstator and Berner (**2005**). The mean tendencies of a long simulation of the National Center of Atmospheric Research (NCAR) general circulation model (GCM) performed by Branstator and Berner (**2005**) and also Berner and Branstator (**2007**) revealed clear signatures of nonlinearity with two fixed points within the leading principal components (PCs). The structure of the PDF, however, did not show any signature of bimodality; a result that looks inconsistent with the theory of conceptual models outlined above. This is because PCs provide directions maximizing variance but are not particularly suited to study nonlinear features, for example, PDF, of the data.

In this article, we follow the above lines by analysing the trajectory and flow tendencies of a long simulation of a 3-level quasi-geostrophic (3LQG) model on the sphere. To overcome the weakness of PCs in identifying the PDF, and in consistency with the nonlinear character of the dynamics, we use kernel empirical orthogonal functions (EOFs) as a low-order state space to investigate the multimodality of the system and provide a consistent picture of the 3LQG model. An application to reanalysis data is then presented. The strategy followed in this paper is based on validating the association between quasi-stationarity and high-frequency of occurrence with the 3LQG. The objective of this strategy is twofold; (1) confirm the association between quasi-stationarity and occurrence frequency, and (2) show that the model produces quite reasonable flow regimes or multi-equilibria. This is made possible via the kernel PC (KPC) analysis as explained in Section 2. Basically, the above exercise shows that we can use the flow tendency (if we have a large sample size) or kernel PCs for the above purpose, and both give consistent results, and that is what we did with the reanalysis, i.e. we applied KPCs. And not surprisingly, we obtain results that are entirely consistent with the 3LQG model. Note that this model is not severely truncated nor has unrealistic forcing. The article is organised as follows. Section 2 describes the data and methods. Section 3 presents the results. A summary and discussion are provided in the last section.

## Data and methodology

### A three-level quasi-geostrophic model

The model considered here is the familiar 3LQG potential vorticity equations on the sphere. The model is similar to that developed by Marshall and Molteni (**1993**) and describes the evolution of the potential vorticity at three vertical levels; 200-, 500- and 800-hPa surfaces. It is a spectral model with a triangular truncation T21 resolution, that is, 32 × 64 lat × lon grid resolution. The model is symmetrical with respect to the equator, leading to slightly more than 2000 grid points or 693 (231 × 3) spherical harmonics or spectral degrees of freedom (d.o.f). It has been shown that the model reproduces faithfully midlatitude dynamical processes and is competitive to full GCMs in this regard, but with substantially smaller number of d.o.f, for example, Marshall and Molteni (**1993**); Gritsun (**2013**); Peters and Kravtsov (**2012**). The model variables are adimensionalised by scaling the spatial and temporal dimensions by the Earth’s radius and the inverse of Earth’s angular velocity as in Gritsun (**2013**) and Vannitsem (2017). The model is integrated using Galerkin method for spatial differentiation. It is integrated forward in time using a leap-frog numerical scheme with 36 minutes time-step for a little more than 1 million days.

A number of studies have investigated the regime and quasi-stationarity question in the 3LQG model. The results are disparate. D’Andrea and Vautard (**2001**) identified five regimes using a clustering procedure from simulations of the full T21 model as well as a low-dimensional truncation of it. D’Andrea and Vautard (2001) computed the quasi-stationary states of the low-order truncation of the full model, and reported three states. These three states compare with some of the clusters of the full model, and were identified respectively as the Arctic High, the North Atlantic Oscillation/Pacific North America (NAO/PNA) and its opposite phase. Molteni (**1996a**) associated cluster centroids of the 3LQG model with generalised neutral vectors, which correspond closely to quasi-stationary states. Five cluster-regimes were identified (Molteni, **1996a**). In a subsequent paper Molteni (1996b) considered a highly truncated versions of the model, for which the PDF showed bimodality. The procedure followed to obtain bimodality is quite reminiscent of the wave amplitude index (WAI) of Hansen and Sutera (**1986**), and could be an artifact of aliasing (Ambaum, **2008**). Kondrashov et al. (**2004**) run the model for 54000 winter days and considered the low-dimensional state space spanned by the leading three EOFs. They identified four clusters of the 500-hPa streamfunction using mixture modeling and k-means clustering. The obtained regimes represent both phases of the NAO and the Arctic Oscillation (AO).

### Reanalysis data

The reanalysis data considered here consist of the Japanese reanalyses, JRA-55 (Kobayashi et al., **2015**; Harada et al., **2016**). The original JRA-55 spans the period 1958–2012, but for comparison with a similar reanalysis data from the European Centre for Medium Range Weather Forecast (Hannachi and Iqbal, **2019**), we restrict the analysis over the period 1958–2001. Daily sea level pressure (SLP) and 500-hPa geopotential height anomalies are used. The anomalies are obtained by removing a climatological daily seasonal cycle. We focus mostly on the northern hemispheric flow, poleward of 20°N, to identify any signature of nonlinearity in the data. No filtering is applied to the daily data to avoid any signal distortion that can be artificially added (Proistosescu et al., **2016**).

### Mean tendencies and low-dimensional PDFs

Within the low m-dimensional state space of the daily time series ${\mathbf{x}}_{t}=({x}_{1}(t),\dots ,{x}_{m}(t)),t=1,\dots n,$ the local total tendencies ${\mathbf{x}}_{t+1}-{\mathbf{x}}_{t}$ are computed and conditionally averaged over regular grid boxes. To account for the linear contribution a multivariate first-order Markov model:

is fitted to the data. In Eq. (1), $\mathbf{A}={\mathbf{C}}_{1}{\mathbf{C}}_{0}^{-1},$ where ${\mathbf{C}}_{1}$ is the lagged-1 autocovariance matrix, and ${\mathbf{C}}_{0}$ is the covariance matrix of the time series **x**_{t}, $t=1,\dots n.$ The term ${\epsilon}_{t}$ represents a white noise process. The associated flow tendencies obtained from Eq. (1) are then computed by averaging over many realisations.

The AR-fitting is made within the reduced state space, that is, either PC or KPC space. Various dimensions have been explored ranging from two-dimensional (2D) to 10-dimensional (10D) spaces, which are found to be consistent with one another, and we report the case obtained from the 2D case. The averaged tendencies are calculated locally (conditional averaging) within each grid box of the reduced 2D state space. The contribution from the nonlinear tendencies are obtained by subtracting the linear tendencies, estimated using the AR model, from the model tendencies within the reduced state space.

To analyse the frequency of occurrence of planetary wave amplitudes the kernel method of Silverman (**1986**) has been used to estimate the PDF. The kernel estimate is obtained using:

*n*is the sample size, is used in the above kernel PDF estimate.

### Kernel EOFs

EOFs provide directions maximizing variance irrespective of the underlying dynamics generating the data or the manifold containing the trajectory (e.g., Jolliffe, **2002**; Hannachi et al., **2007**), and references therein). Given, however, the complex and nonlinear nature encompassing weather and climate one may expect the presence of nonlinear structures that may not be easily captured and revealed using linear methods.

A proper way to disentangle these complex structures, such as clusters, is to embed the data into another space referred to as *feature space* where structures become discernable, that is, ‘linearly’ in some way, and where complex relationships, for example, clusters, become easier to discriminate. For illustration, one can imagine a 2D space (*x*_{1}, *x*_{2}) and a wavy-like boundary separating two clusters. One can then envisage a nonlinear transformation $({y}_{1}={\varphi}_{1}({x}_{1},{x}_{2}),{y}_{2}={\varphi}_{2}({x}_{1},{x}_{2}))$ where the boundary separating the two clusters in this feature space (*y*_{1}, *y*_{2}) becomes simpler, for example line or parabola (Fig. 2). Consider for instance the following simple example of a 2D input space (*x*_{1}, *x*_{2}) containing quadratic nonlinearities. Then the five-dimensional space obtained by considering all monomials of degree smaller than 3, that is, $({x}_{1},{x}_{2},{x}_{1}^{2},{x}_{2}^{2},{x}_{1}{x}_{2}),$ would decimate/dismantle the initial complex relationships. These kind of transformations are possible in principle and are provided by the kernel EOFs (or kernel PCs).

Kernel PCs (Schölkopf et al., **1998**) are obtained as the EOFs, within the feature space, of the nonlinearly transformed data ${\zeta}_{t}=\varphi ({\mathbf{x}}_{t}),t=1,\dots n.$ They are derived by solving the eigenvalue problem:

**S**is given by the following scalar product in the feature space:

The mapping $\varphi ()$ lies in general in a very high-dimensional space, and a direct computation of the matrix **S** is prohibitively expensive, and in many cases lies in an infinite dimensional space. An elegant way to overcome this problem is to consider the so-called *kernel trick* (e.g., Boser et al., **1992**; Schölkopf et al., **1998**), which consists of choosing the right transformation through a kernel representation of the scalar product in Eq. (4):

In other words, we first construct the kernel, which can be used to learn implicitly the nonlinear transformation.

The kernels used in kernel PC analysis cannot be arbitrary. They have to satisfy a property known as reproducibility. A number of kernels exist such as the squared scalar product $S(\mathbf{x},\mathbf{y})={\left({\mathbf{x}}^{T}\mathbf{y}\right)}^{2},$ or the tangent hyperpolic kernel $S(\mathbf{x},\mathbf{y})=\mathrm{tanh}(a{\mathbf{x}}^{T}\mathbf{y}+b)$ frequently used in neural networks. The former expression provides an example of polynomial kernel with finite dimensional feature space. For example, in two dimensions it is easy to see that it corresponds to $\varphi ({x}_{1},{x}_{2})={({x}_{1}^{2},{x}_{2}^{2},\sqrt{2}{x}_{1}{x}_{2})}^{T}.$ Whereas the latter example corresponds to an infinite dimensional feature space. It is important to note though that in the polynomial case the dimension of the feature space increases as a power law of the degree of the polynomial. In the infinite case the transformation $\varphi (.)$ can be obtained through the spectral decomposition theorem of the operator (or kernel) $S$ in terms of the eigenfunctions of the integral operator $L\left(f\right)=\int S(\mathbf{x},\mathbf{y})f\left(\mathbf{y}\right)d\mathbf{y}$ over the space of integrable functions in ${\mathbb{R}}^{m}$ (e.g., Mikhlin, **1964**). What matters here is the local character of the smoothing. This is analogous to the moving average, in linear filtering, where the averaging window is replaced by a localised function, that is, the Gaussian kernel.

The spectral decomposition theorem yields the expansion:

*L*(.). In this article, we use a Gaussian kernel:

*σ*representing a smoothing parameter, which will be discussed in the application section. Using the data sample

**x**

_{t}, $t=1,\dots n,$ the kernel PCs are then obtained as the leading eigenvectors of the matrix

**S**given by $\left({S}_{\mathit{ij}}\right)=\left(S\right({\mathbf{x}}_{i},{\mathbf{x}}_{j}\left)\right).$

The Gaussian kernel is chosen because of its advantage compared to other kernels. One main weakness of polynomial kernels is the lack of a prior knowledge of the dimension of the feature space. And trying different dimensions is not practical and useful as the dimension increases as a power law of the degree of the polynomial. Another weakness is that polynomial kernels are not localised, which can be problematic given the importance of locality and neighborhood in nonlinear dynamics. These weaknesses are overcome via the use of kernels with localised structures such as kernels with compact support. A typical localised kernel is the Gaussian kernel. Despite being not truly compact support, the Gaussian kernel is well localised, because of the exponential decay, a nice and very useful feature. Furthermore, compared to other kernels (e.g. tangent hyperbolic used in neural networks), the Gaussian kernel has only one single parameter *σ*. Lastly, we should add that polynomial kernels are normally obtained by constructing polynomials of scalar products ${\mathbf{x}}^{T}\mathbf{y},$ which is also the case for the tangent hyperbolic and similar functions like cumulative distribution functions used in neural networks. The other advantage of the Gaussian kernel over the above kernels is that it uses distances $\left|\right|\mathbf{x}-\mathbf{y}|{|}^{2}$ (see Eq. 6), which translates into (local) tendencies when applied to multivariate time series as is the case here. The Gaussian kernel is used nearly everywhere such as in PDF estimation, neural network, and spectral clustering, etc. Note also that there are nonlinear structures such as the example of concentric clusters (not shown) that cannot be resolved using, for example, polynomial kernels.

## Results

### Application to the three-level QG model

#### Conventional PC space

We first explore the PDF of the trajectory of the model simulation within the space spanned by the prominent PCs. Figure 3a shows an example of the mean tendencies of the mid-level (500-hPa) streamfunction projected onto the 2D PC1/PC4 state space. The figure clearly reveals nonlinear signatures similar to those obtained by Branstator and Berner (**2005**) and Berner and Branstator (**2007**). The nonlinear signature can be identified by examining both the tendencies and their amplitudes. For example, linear dynamics yields antisymmetric tendencies with respect to the origin. In addition, the tendency amplitudes should have elliptical shapes prescribed by the structure of the eigenvalues of the linear operator of the system, as shown in Fig. 3b. Figure 3a clearly shows that both of these features depart markedly from linearity. Subtracting the linear tendencies (Fig. 3b) from the total tendencies (Fig. 3a) yields the nonlinear contribution (Fig. 3c). This figure clearly reveals two singular (or fixed) points representing (quasi-) stationary states, which are discussed later.

Figure 3d shows the PDF of the system trajectory, which is clearly unimodal (though non Gaussian). Unimodality of the PDF within the low-dimensional PC space is not surprising. Christiansen (**2009**), for example, used a projection pursuit to analyse the structure of the hemispheric large scale flow in the troposphere. He suggested that only evidence of non-Gaussianity can be found but no evidence for bi- or multi-modality. It has been argued that by construction the (linear) PCs provide a measure of the variability in the data, and therefore may not be suitable to provide the desired information about the system PDF as they are prone to ‘Gaussianity’ (e.g., Ambaum, **2008**), although this last point is debated in Christiansen (**2009**).

#### Kernel PC space

We now discuss the results when kernel EOFs are considered instead. The Gaussian kernel, Eq. (6), is used here because of the nice properties of the Gaussian function, for example, locality and smoothness. Sensitivity tests have been carried out to find the appropriate smoothing parameter. For example, for large values of *σ* the kernel matrix becomes close to singular, and for small values it becomes close to the identity matrix. The method has been applied to a number of examples of various complexity. The application to data generated from multivariate normal distribution as well as data generated by mixture of multivariate Gaussians reveals that kernel PC analysis is successful to identify the clusters. A more complex test clusters generated from data distributed on concentric spheres was also used, and the method was successful in identifying the clusters. Precisely, we generated three clusters distributed on three concentric spheres. Conventional PC analysis and polynomial-kernel PC analysis cannot separate the clusters. Kernel PC analysis with a Gaussian kernel, however, was able to separate the clusters (not shown). These examples are described in detail in Hannachi and Iqbal (**2019**).

In this study we found that a wide range of values of ${\sigma}^{2}$ centred around ${\sigma}_{0}^{2}=2\mathrm{min}(||{\mathbf{x}}_{i}-{\mathbf{x}}_{j}|{|}^{2}),$ provide robust results. The value of the smoothing parameter, however, is not universal, but is data dependent, making the kernel EOF analysis data-adaptive. The results presented below are discussed with reference to the smoothing parameter ${\sigma}_{0}^{2}.$

The leading 10 kernel PCs are explored here to study the flow tendency and PDF. But in general a rich structure is observed in 2D kernel PCs, where the leading kernel PC is included. Note that the kernel PCs are uncorrelated as they are Eigenfunctions of the symmetric (and semi-definite positive) kernel matrix, see Eq. (4). As in Fig. 3c, Fig. 4a shows the departure of the averaged total flow tendencies (not shown) from the linear component (not shown) and reveals again two fixed points.

Figure 4b shows the 2D kernel PDF within the KPC1/KPC4 state space, and reveals clearly strong bimodality where the modes are also shown. This is consistent with the fact that kernel PC analysis enables some kind of disentanglement of possible complex structures as pointed out in Section 2.4 above. Figure 4a,b clearly reveals an outstanding feature, namely the correspondence between the PDF modes and the fixed points provided by the flow tendencies. There is a slight departure between the right mode and the corresponding singular (or fixed) point, which we believe is due to sampling and also instability of the quasi-stationary state. This is consistent with conceptual models, for example, Fig. 1a,b.

Similar pictures are also obtained with KPC1/KPC3, KPC1/KPC5, KPC1/KPC6, etc. In KPC1/KPC2 plane with two (left and right) singular points the PDF is strongly nonlinear having a curved structure with increasing values as we move from the left (less stable) to the right (more stable) singular point. We note here that these leading KPCs also have substantial amount of variance. The KPCs of the 3LQG model are highly correlated with the leading PCs (not shown). The regression of each KPC on the leading 10 PCs (not shown) shows strong R-square values reflecting the large-scale nature of these KPCs. Interestingly, KPC2 has small R-square value compared to the others.

Figure 4 displays the maps corresponding to the anomalies (middle) and the total (bottom) flows of the two modes of the PDF. It is obtained by compositing over all states that fall within a neighbourhood of the two modes. It is clear that the anomalous stationary states are mostly localised over the North Pacific, the North Atlantic and North America and stretch to the Mediterranean region. The first stationary state shows a low over the North Pacific associated with a dipole over the North Atlantic reflecting the negative NAO phase (Woollings et al., **2010a**). The second anomalous stationary solution represents approximately the opposite phase, with a high pressure over the North Pacific associated with an approximate positive NAO phase. In both cases the anomalies over the North Atlantic are shifted slightly poleward compared to the conventional NAO counterparts.

The total flow of the first stationary solution shows a ridge over the western coast of North America associated with a diffluent flow over the North Atlantic with a strong ridge over the eastern North Atlantic. This latter represents a blocked flow over the North Atlantic. Note the stronger North Atlantic ridge compared to that of the western North American continent. The second stationary state shows a clear zonal flow over both basins. These solutions are reminiscent of the zonal and blocked flows of Charney and Devore (**1979**).

### Application to hemispheric reanalysis

Based on the results obtained from the 3-level QG model we use the kernel PDF to investigate tropospheric northern hemisphere (NH) nonlinearity. Averaged tendencies cannot be used here because the available sample size from reanalysis is too short to get a sensible picture. Very large samples, as for the 3-level QG model, are indeed required. We consider the winter (DJF) daily SLP anomalies over the NH. The kernel PCs are computed as outlined above. To check the sensitivity of nonlinearity with respect to the NH domain we analyse the SLP anomalies over the whole NH domain as well as poleward of 20°N and 27°N. The optimal value of the smoothing parameter ${\sigma}^{2}$ is found for the different regions to be similar and around ${\sigma}_{0}^{2}.$ The results, however, are quite robust for values of ${\sigma}^{2}$ between about $0.5{\sigma}_{0}^{2}$ and $5{\sigma}_{0}^{2}.$

The first 10 KPCs are considered as in the previous section and 2D PDFs are explored. It is found that a rich structure exists in most 2D state spaces involving specifically the leading KPC1. In particular, strong bimodality is found in many 2D KPCs. The bimodality is particularly stronger for the last domain (poleward of 20°N), compared particularly to the whole NH. This is consistent with the fact that nonlinearity is more prominent in the extratropical winter compared to the subtropics.

Figure 5a shows the daily PDF of SLP anomalies over the NH poleward of 20°N using KPC1/KPC7. Similar pictures are obtained using other KPCs but the strongest signal is obtained with KPC7. Strong bimodality stands out from this PDF, and the modes are significant at 10% level (Fig. 6a) and also at 5% level (not shown). To characterise the flow corresponding to the two modes a composite analysis is performed by compositing over the points within the neighborhood of the two modes A and B. The left mode (Fig. 5c) shows a polar high stretching south over Greenland accompanied with a low pressure system over the midlatitude North Atlantic stretching from eastern North America to most of Europe and the Mediterranean. This circulation regime projects strongly onto the –NAO. The second mode (Fig. 5d) shows a polar low with high pressure over midlatitudes North Atlantic, with a small high pressure over the northern north-west Pacific. The regimes are not exactly symmetrical of each other; regime A is stronger than regime B. We have also examined the hemispheric 500-hPa geopotential height. 2D PDFs (not shown) reveal again strong bimodality associated, respectively, with polar low and polar high. The mode associated with polar high is stronger however.

The leading two KPCs (not shown) also have strong bimodality, and the associated flow regimes also represent both phases of the AO, which are very similar to those of Fig. 5c,d. We simply chose KPC1/KPC7 because the figure looks a bit smoother than that of KPC1/KPC2. Bimodality is also obtained in KPC1/KPC3, KPC1/KPC4 and KPC1/KPC6, but it is weaker than that of KPC1/KPC2 and KPC1/KPC7. These KPCs project on large scale flow and explain non-negligible variance. In fact, Table 1 shows the correlations between the leading 10 KPCs and 10 PCs, along with the R-square obtained from multiple regression between each KPC and the leading 10 PCs. It is clear that these KPCs are large scale and also explain substantial amount of variance. Note, in particular, that the R-square is mostly explained by (large scale) PCs explaining substantial amount of variance.

Besides nonlinearity of hemispheric atmosphere another issue that has been discussed in the literature is the climate change signal of those large scale circulation regimes, see for example, Corti et al. (**1999**), Hannachi (**2007**), Christiansen (**2005**), Monahan et al. (**2000**). For example, Christiansen (**2005**) used the WAI to conclude that the last decade was dominated by a strong (or disturbed) regime whereas a weak (i.e. zonal) regime dominated earlier. Hannachi (**2007**) applied a mixture model to 500-hPa geopotential height anomalies, and suggested a regime shift projecting respectively onto + PNA and + NAO, that is, a deepening of the Aleutian and Icelandic lows. We investigate here any changes in the PDF. Figure 5b shows the change in the PDF between the first and last halves of the data. A significant regime shift is observed with a large decrease (increase) of the frequency of the polar high (low) between the two periods (Fig. 6). This figure shows the 10% significance test of the PDFs of SLP anomalies for the whole (6a), the first (6b) and second (6c) halves of the record and is based on a Monte-Carlo procedure. One hundred surrogates of SLP anomalies are generated based on a multivariate AR(1) model. The surrogates have the same leading EOFs and associated explained variances as the SLP anomalies, where the leading 20 EOFs are used. The PDF of the SLP anomalies are then compared to those obtained from the surrogates. Details of the test can be found in Hannachi and Iqbal (**2019**). It is observed, in particular, that in the first half of the record the left mode is significantly dominant whereas in the second half the right mode has become significantly dominant.

The change in the PDF modes projects onto the + NAO (and + AO) and is consistent with an increase in the zonal wind speed. This result seems to be inconsistent with the proposed hypothesis of polar amplification, which implies in theory a reduction of the meridional temperature gradient yielding a reduction of zonal wind speed. This issue needs further investigation by including, for example a longer record or by examining outputs from climate models, which is left for future research.

## Summary and discussion

In this article, we have discussed the question related to nonlinearity and regime behaviour of hemispheric winter atmosphere using an intermediate complexity model for the extratropics and reanalysis data. A three-level quasi-geostrophic model on the sphere was used to analyse the structure of hemispheric flow. The rationale behind is to check whether northern hemispheric flow mimics the dynamics of low-order systems whereby small tendencies are associated with high frequency (or probability) of occurrence of planetary wave amplitudes, and support hence the thesis of large-scale hemispheric stationary circulation regimes.

A very long simulation of the model (a little more than 10^{6} winter days) is first obtained. The local averaged tendencies of the mid-level streamfunction are computed within the leading EOFs and shows existence of two singular points within the large scale PC state space. The kernel PDF, however, is unimodal though slightly non-Gaussian. This result is inconsistent with conceptual models, but is not surprising as EOFs precisely find directions maximizing variance irrespective of the dynamics generating the data. To overcome this weakness we considered kernel EOFs (or kernel PCs), which operate in the feature space and attempt precisely to decimate possible complexity, for example, clusters. The computed tendencies within the 2D KPCs reveal nonlinear structures with two singular points, as for EOFs, but with strong bimodal structure of the PDFs. The bimodality is associated precisely with the regions of low flow tendencies; a picture that is entirely consistent with the rational outlined above. The flow associated with the singular points corresponds well with the modes of the PDF. These flow regimes represent respectively a zonal flow and a ‘wavy’ flow with ridges (or blocks) over western continental boundaries around the western coast of North America and eastern North Atlantic.

In the last part of the article we have attempted to identify nonlinearity using the kernel PDF and kernel PCs applied to daily SLP and 500-hPa geopotential height anomalies over the northern hemisphere winter using the Japanese reanalysis JRA-55. Previous analyses on nonlinearity and bi- or multi-modality of northern winter atmospheric variability were disparate and inconclusive. One main reason for this, we believe, is the lack of a proper method to disentangle the complex nature of the flow. In one of his critical studies, Christiansen (**2009**) suggests that in the troposphere no evidence of bimodality exists, although the distribution may be non-Gaussian. In addition, most of previous methods reduce the system dimensionality by using EOFs, which do not capture the fundamental feature of the system.

The 2D kernel PDFs using the leading kernel PCs of SLP anomalies reveal strong bimodality. This bimodality is slightly stronger for the extratropics (north of 20°N and 27°N) compared to the whole NH pointing to the stronger nonlinearity in the extratropics compared to when the subtropics are included. The first mode of SLP anomalies represents a polar high stretching south to Greenland associated with a low pressure centre over the North Atlantic midlatitudes. The second mode represents a polar low with high pressure over the North Atlantic and a small anomaly over the North Pacific. Apart from the polar anomalies the modes overlap with the NAO. It is clear from the analysis that the Arctic Oscillation plays a non-negligible role in controlling the nonlinearity (or bimodality) in the extratropical troposphere.

This analysis clearly suggests bimodality of large scale tropospheric flow. Another closely related issue is the mechanism. Two mechanisms are suggested. The first one consists of multiple equilibria as suggested in the present study. The second mechanism consists of synchronisation between sectorial regimes, which are linked to the latitudinal position of the eddy-driven jet stream (e.g. Woollings et al., **2010**b; Hannachi et al., **2012**). Evidence of synchronisation between the North Atlantic and the North Pacific was presented in Hannachi (**2010**). Stationary Rossby waves play an important role in the existence and maintenance of sectorial regimes, and this may provide the reason behind the synchronisation between both basins.

The SLP anomaly PDFs have also been examined for any change between the first and second halves of the data. There is a strong shift in regime frequency where the second (first) mode has increased (decreased) frequency of occurrence. This recent climate change signal is associated with an increase of the zonal flow and a decrease of blocked flow frequency particularly over the North Atlantic. This result is in apparent contradiction with the polar amplification hypothesis. In fact, this hypothesis would lead to a weakening of zonal flow and increase of blocking flow in the midlatitudes. This conclusion is obtained based on the thermal wind balance and is based on a linear argument. Note also that the thermal wind balance is a diagnostic equation of balanced flow. The regime paradigm discussed here presents a nonlinear perspective of the system dynamics. Considering now that the overall outcome of the system can be obtained as the sum of the linear response and the nonlinear response, the possibility is that the latter response somehow balances the former response. So the suggestion here is that the obtained increase in zonal flow and decrease of blocking frequency is the nonlinear response of the climate change signal whereas the reduction in the zonal wind speed is the linear response of the climate change signal. A similar (nonlinear) climate change signal was also obtained independently by the (leading) author using NCEP/NCAR reanalysis (Hannachi, **2007**) and using an entirely different approach.