1.

## Introduction

Mid-latitudinal weather features such as blocks and cyclones exhibit large interannual and decadal variability (e.g. Barnes et al., 2014; Rimbu et al., 2014; Welker and Martius, 2014; Hanna et al., 2015; 2016). Both blocks and cyclones can cause temperature and precipitation extremes (e.g. Buehler et al., 2011; Pfahl and Wernli, 2012a, 2012b; Martius et al., 2013; Pfahl, 2014; Grams et al. 2014; Messmer et al. 2015, 2017). Similarly, extreme events such as temperature extremes, storms, floods, or hail are connected with circulation types (CTs), which represent the most important recurring weather patterns over a specific region (e.g. Donat et al., 2010; Prudhomme and Genevier, 2011; Riediger and Gratzki, 2014; Nisi et al., 2016).

Decadal changes in CTs and weather systems are not well studied; however, connections between circulation features and decadal-scale variability have the potential to improve predictability. Decadal variability in climate has often been linked to decadal modes of sea surface temperature (SST) variability: the Atlantic multidecadal oscillation (AMO; e.g. Schlesinger and Ramankutty, 1994; Kerr, 2000; sometimes also referred to as Atlantic multidecadal variability, AMV; e.g. Semenov et al., 2010; Ba et al., 2014) and the Pacific decadal oscillation (PDO; e.g. Mantua et al., 1997; hereafter M97; Mantua and Hare, 2002; Newman et al., 2016). It has been suggested that these modes influence the surface temperature at continental to global scales (e.g. Enfield et al., 2001; Knight et al., 2006; Sutton and Dong, 2012; Chen and Tung, 2014; Trenberth, 2015), as well as the hydroclimate in Asia (Shen et al., 2006; Qian and Zhou, 2014; Malik and Brönnimann, 2018), North America (McCabe et al., 2004), the Caribbean (Winter et al., 2011) and Europe (e.g. Sutton and Hodson, 2005; Sutton and Dong, 2012). Furthermore, changes of the Intertropical Convergence Zone were related to the AMO (e.g. Chiang and Vimont, 2004) and decadal variability in the tropical tropopause temperature was linked to the PDO (Wang et al., 2016a).

Weather systems play an important role in linking climatic anomalies to SST modes, but the processes are not comprehensively understood. For instance, the PDO was shown to be partly driven by the Aleutian low while El Niño and variations of Kuroshio currents also play a role (e.g. Schneider and Cornuelle, 2005; Newman et al., 2016). The AMO is seen by some as an active driver of atmospheric circulation variability (e.g. Zhang and Delworth, 2006; Woollings et al., 2012; Peings and Magnusdottir, 2014; Veres and Hu, 2013, 2015; O’Reilly et al., 2017), while others see atmospheric circulation features as the primary driver of the AMO (e.g. Häkkinen et al., 2011; Mecking et al., 2014; McCarthy et al., 2015; Delworth et al., 2017). Sun et al. (2015) propose a delayed oscillator model with a two-way interaction between the North Atlantic Oscillation (NAO) and the AMO.

Independent of the question of which component of the climate system is the active driver, it is undisputed that decadal ocean variability and mid-latitudinal atmospheric circulation interact with each other. Häkkinen et al. (2011) proposed that decadal changes in atmospheric blocking are responsible for the AMO. Woollings et al. (2012, 2015) found that the North Atlantic storm track strengthens and penetrates deeper into Europe with a weakening of the Atlantic Meridional Overturning Circulation (AMOC), which is often seen as an important driver of the AMO (e.g. Ba et al., 2014; Delworth and Mann, 2000). Caesar et al. (2018) found that the AMOC leads variations of the NAO by 3 years. This is consistent with Gastineau et al. (2015) who showed that the positive phase of the AMO leads to the negative phase of the NAO in winter.

Ding et al. (2014) found that anomalous Rossby wave-train activity originating in the tropical North Pacific contributed to the negative trend in NAO during the last three decades, which leads to an enhanced warming over the Arctic and colder winters in Europe (e.g. Trigo et al., 2002). Hence, the North Pacific Ocean and thus the PDO may partly be responsible for decadal variability over the Euro-Atlantic sector.

The short observational record relative to the time scales of AMO and PDO and the sparseness of SST proxy data back in time render it difficult to find statistically significant relationships between atmospheric weather systems and decadal oceanic modes of variability. Centennial reanalyses that exclusively use surface observations (i.e. surface-input reanalyses; cf. Fujiwara et al., 2017) have become available only recently (Compo et al., 2011; Poli et al., 2016; Laloyaux et al., 2018). A previous study (Rohrer et al., 2018) compared the representation of mid-latitudinal atmospheric features in surface-input reanalyses with modern full-input reanalyses between 1980 and 2005 and found good agreement not only at the surface but also at the 500 hPa level. Therefore, these surface-input reanalyses are potentially a valuable tool for assessing the connection between AMO and PDO and weather patterns during the entire 20th century.

In this study, we use all four available centennial reanalyses (20CR, 20CRv2c, ERA-20C and CERA-20C, see also Table 1) and intercompare their representation of blocks, storm track activity (STA) and Alpine circulation types. The use of several reanalyses is important because reanalyses are subject to uncertainties (see Thorne and Vose, 2010), mainly due to changes in data availability and quality (e.g. Bengtsson et al., 2004) and spatio-temporal variations in the assimilation scheme such as the inflation of the covariance matrix (Compo et al., 2011). Additionally, to better understand whether the signal is forced by the ocean, we examine the connections between the AMO/PDO and mid-latitudinal circulation in the atmosphere-only ERA-20CM model ensemble (Hersbach et al., 2015) and compare the results with reanalyses. The ERA-PreSAT data set (Hersbach et al., 2017) is also used to examine whether the inclusion of early radiosonde data influences the representation of extratropical circulation in the mid-troposphere.

This paper is organised as follows: In Section 2, the methods and the datasets used in this study are briefly introduced. Section 3 compares the four centennial reanalyses with each other and examines the connection between AMO and PDO and weather patterns. Section 4 lists and discusses the main findings.

2.

## Data and methods

2.1.

### Datasets

Four reanalysis datasets covering the 20th century are used to assess fluctuations and trends in the 20th century. Following the naming convention of Fujiwara et al. (2017), all of these reanalyses are referred to as surface-input reanalyses, i.e. they only assimilate surface observations. A short summary is given in Table 1. Throughout the study, we use 6-hourly data for cyclones and blocks and daily-averaged data for CTs.

20CR and 20CRv2c originate from the National Oceanic and Atmospheric Administration (NOAA; Compo et al., 2011). We make use of their 56-member approach and analyse all members to obtain an uncertainty measure for the datasets. 20CR encompasses the years 1871–2010. The updated 20CRv2c extends the dataset to the period 1851–2014 and corrects issues concerning sea ice concentration. Both datasets assimilate surface pressure and sea level pressure measurements into an Ensemble Kalman Filter. Instead of the HadISST1.1 used in 20CR, 20CRv2c uses the 18 members of the Simple Ocean Data Assimilation with the Sparse Input version 2 dataset (SODAsi.2, Giese et al. 2016) to constrain ocean temperatures. While 20CR uses data from the International Surface Pressure Databank version 2 (ISPD, Cram et al., 2015), 20CRv2c takes advantage of the newer version 3.2.9. The Global Forecast System (GFS) 2008x NWP model is used with a triangular truncation of 62 and 28 vertical levels in both reanalyses.

Two additional reanalyses covering the 20th century (1900 and 1901, respectively, to 2010) are provided by the ECMWF. ERA-20C (Poli et al., 2016) uses the Integrated Forecast System (IFS) NWP cycle38r1 model with a triangular truncation of 159 and 91 vertical levels (T159L91). It uses the HadISST2.1.0.0 dataset as the SST and sea ice boundary condition (Titchner and Rayner, 2014) and assimilates observations from the ISPD version 3.2.6 database and surface pressure and marine surface wind observations from the International Comprehensive Ocean–Atmosphere Data Set version 2.5.1 (ICOADS; Woodruff et al., 2011) into a 4-dimensional variational analysis (4D-Var) scheme.

The successor of ERA-20C is the only centennial reanalysis that couples an atmospheric model with an ocean and sea ice model (CERA-20C, Laloyaux et al., 2018). It additionally incorporates ocean temperature and salinity observations that are relaxed towards different realizations of HadISST2.1.0.0. The resolution is identical to ERA-20C (T159L91).

CERA-20C uses the IFS NWP cycle41r2 model and provides a 10-member ensemble. We use the individual members to derive our metrics and estimate the uncertainty of the dataset from the different CERA-20C realizations.

Additionally, we investigate the representation of mid-latitudinal phenomena in the ERA-PreSAT reanalysis (Hersbach et al., 2017), an experimental reanalysis encompassing the years from 1939 to 1967. It uses the same assimilation system as ERA-20C but additionally assimilates upper air data to assess the value of early upper air measurements.

In order to evaluate the role of SSTs and of the assimilation of observations, the ERA-20CM (Hersbach et al., 2015) model ensemble is examined. The model setup is identical to ERA-20C with prescribed SSTs, but no observations are assimilated into the model. Thus, the 10-member ensemble provides an estimate of the atmospheric response of the IFS NWP cycle 38r1 model to the SST forcing during the 20th century.

2.2.

### Mid-latitudinal circulation metrics

The methods for defining blocks and CTs investigated in this study were used in Rohrer et al. (2018) and are only briefly described here.

2.2.1.

#### Blocks

Blocks are defined as quasi-stationary (overlap between two time steps larger than 0.7) and persistent (longer than 5 days) reversals of the 500-hPa geopotential height (Z500) gradient. These reversals are defined according to the two-dimensional extension of the Tibaldi and Molteni (1990) index by Scherrer et al. (2006):

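$$\mathit{GHGS}=\frac{Z500(\varphi )-Z500(\varphi -\Delta \varphi )}{\Delta \varphi }>0\qquad (1)$$

$$\mathit{GHGN}=\frac{Z500(\varphi +\Delta \varphi )-Z500(\varphi )}{\Delta \varphi }<-10\ \mathrm{m}/{}^{\circ }\mathrm{lat}\qquad (2)$$

Here, GHGS and GHGN denote the Z500 gradients towards the equator and towards the pole, respectively, and Δφ is the meridional offset over which they are evaluated.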

All datasets are bi-linearly remapped to a 2°×2° resolution and the latitude φ varies from 36° to 76° in 2° intervals.

Note that this definition includes the subtropical high-pressure belt, which is important for assessing possible links with the tropical belt.
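As an illustration of the gradient-reversal criterion described above, the following minimal sketch flags instantaneous reversals on a regular latitude grid. It assumes the standard thresholds of the Tibaldi and Molteni index (southern gradient above zero, northern gradient below −10 m per degree of latitude); the function name `gradient_reversal` and the offset `dphi=14.0` are hypothetical choices for this example and are not taken from the study. The persistence and overlap criteria are not implemented here.

```python
import numpy as np

def gradient_reversal(z500, lats, dphi=14.0):
    """Flag grid points with a reversed meridional Z500 gradient.

    z500 : array (nlat, nlon) in metres, latitudes ascending (degrees north)
    lats : 1-D array of regularly spaced latitudes
    dphi : latitude offset in degrees (assumed value for illustration)
    """
    step = lats[1] - lats[0]
    k = int(round(dphi / step))               # offset expressed in grid rows
    nlat = z500.shape[0]
    flagged = np.zeros(z500.shape, dtype=bool)
    for j in range(k, nlat - k):
        ghgs = (z500[j] - z500[j - k]) / dphi   # gradient towards the equator
        ghgn = (z500[j + k] - z500[j]) / dphi   # gradient towards the pole
        flagged[j] = (ghgs > 0.0) & (ghgn < -10.0)
    return flagged
```

A full block detection would additionally track the flagged areas in time and keep only those that overlap sufficiently between time steps and persist for longer than 5 days.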

2.2.2.

#### Storm tracks

Following Blackmon (1976), we employ an Eulerian method to detect the location of the mid-latitude storm track as a measure of cyclone activity. The Z500 field is band-pass filtered such that it retains periods between 2.5 and 6 days; thereafter, the seasonal standard deviation is computed (henceforth called storm track activity, STA). Monthly anomalies are computed by removing the long-term annual cycle. No spatial interpolation was applied before the band-pass filtering, so all datasets are used in their original resolution.
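The STA computation can be sketched as follows. This is a minimal example that uses an idealised FFT mask as a stand-in for the band-pass filter; the filter actually used in the study is not specified here, and the function name `storm_track_activity` and its parameters are assumptions made for illustration.

```python
import numpy as np

def storm_track_activity(z500, dt_days=0.25, pmin=2.5, pmax=6.0):
    """Standard deviation of band-pass filtered Z500 along time (axis 0).

    z500    : array with time as first axis (e.g. one season of 6-hourly data)
    dt_days : sampling interval in days (0.25 for 6-hourly data)
    pmin, pmax : retained period range in days
    """
    n = z500.shape[0]
    freqs = np.fft.rfftfreq(n, d=dt_days)             # cycles per day
    spec = np.fft.rfft(z500 - z500.mean(axis=0), axis=0)
    keep = (freqs >= 1.0 / pmax) & (freqs <= 1.0 / pmin)
    spec[~keep] = 0.0                                  # remove all other periods
    filtered = np.fft.irfft(spec, n=n, axis=0)
    return filtered.std(axis=0)
```

An FFT mask has a sharp cut-off and treats the season as periodic; a tapered digital filter applied to the continuous time series would behave differently near the season boundaries.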

2.2.3.

#### Circulation types

The two CT classifications GrossWetterTypes (GWT) and the Cluster Analysis of Principal components (CAP) are used. The COST733class software (Philipp et al., 2010; 2016) is employed to compute the circulation type classifications (CTC). As input, we use daily-averaged data remapped to a 1° × 1° resolution over the Alpine domain (41°–52° N, 3°–20° E; domain 6 in Philipp et al., 2010). Here, we examine GWT with 10 types on sea level pressure (SLP; GWT10SLP) and on Z500 (GWT10Z500) as well as CAP with nine types on SLP (CAP9). Table 2 provides a short synoptic description of the CTs used in this study.

In this study, we focus on the CAP classification. To ensure comparability between the different datasets, we used the CAP9 centroids from Weusthoff (2011) and assigned every synoptic situation to the closest centroid in terms of Euclidean distance (Rohrer et al., 2018). The centroids are shown in Supplementary Fig. S1.
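The assignment of daily fields to fixed centroids can be illustrated with a short sketch. It assumes that each daily field and each centroid has already been flattened to a vector on the same grid and in the same units; any standardisation or projection onto principal components applied in the original classification is not reproduced, and the name `assign_to_centroids` is hypothetical.

```python
import numpy as np

def assign_to_centroids(fields, centroids):
    """Return, for every day, the index of the closest centroid.

    fields    : array (ndays, npoints), flattened daily fields
    centroids : array (ntypes, npoints), flattened class centroids
    """
    diff = fields[:, None, :] - centroids[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))    # Euclidean distance per day/type
    return dist.argmin(axis=1)
```

Seasonal or annual CT frequencies then follow from counting the assigned indices, for example with `np.bincount(types, minlength=9)` for nine types.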

Second, the GWT classification is used to assess whether the choice of the classification influences the results. GWT uses a correlation-based approach, where a specific synoptic situation is assigned to a class depending on its correlation with a zonal, a meridional, and a cyclonic pattern.

2.3.

### Oceanic variability modes

The PDO is derived according to the definition of M97. First, the monthly mean global SSTs are subtracted to remove the climate change pattern. Then, we compute empirical orthogonal functions (EOFs; Dawson, 2016) of the monthly SST anomalies in the Pacific Ocean north of 20°N at grid points without sea ice in the period 1901–1990 and project the EOF pattern on the remaining years.
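The EOF step can be sketched with a plain singular value decomposition. This is a simplified example, not the implementation used in the study: it assumes the anomalies are already restricted to ice-free North Pacific grid points and flattened to two dimensions, it omits area weighting, and the function name `leading_eof_index` is hypothetical.

```python
import numpy as np

def leading_eof_index(anom, base):
    """Leading EOF of a base period and the projection of all time steps onto it.

    anom : array (ntime, npoints) of SST anomalies
    base : boolean array (ntime,), True for time steps in the base period
    """
    base_mean = anom[base].mean(axis=0)
    ref = anom[base] - base_mean
    # Rows of vt are the spatial patterns, ordered by explained variance
    _, _, vt = np.linalg.svd(ref, full_matrices=False)
    pattern = vt[0]
    index = (anom - base_mean) @ pattern
    # Scale to unit standard deviation over the base period
    return pattern, index / index[base].std()
```

The sign of an EOF is arbitrary, so in practice the pattern and index are flipped together to match the conventional PDO polarity.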

The AMO index is based on a definition similar to that of Trenberth and Shea (2006, henceforth TS06). We calculate the monthly mean global SSTs between 60°S and 60°N and subtract them from the monthly SST time series in order to remove the seasonal cycle. Thereafter, the AMO time series is defined as the mean SST anomaly over the North Atlantic Ocean between 0° and 60°N.
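A minimal sketch of such an index is given below. It computes the cosine-latitude weighted North Atlantic mean minus the 60°S–60°N mean for every time step. The longitude bounds of the North Atlantic box (`lon_bounds`) are an assumption for illustration, since only the latitude range is stated above, and the removal of a monthly climatology is left out; the function names are hypothetical.

```python
import numpy as np

def area_mean(sst, lats, region):
    """Cosine-latitude weighted mean over a boolean (nlat, nlon) region.

    sst : array (ntime, nlat, nlon); NaN marks land or sea ice
    """
    w = np.cos(np.deg2rad(lats))[:, None] * region
    w = np.where(np.isnan(sst), 0.0, w[None, :, :])   # ignore missing points
    return (np.nan_to_num(sst) * w).sum(axis=(1, 2)) / w.sum(axis=(1, 2))

def amo_index(sst, lats, lons, lon_bounds=(-80.0, 0.0)):
    """North Atlantic (0-60N) mean SST minus the 60S-60N mean SST."""
    lat2d, lon2d = np.meshgrid(lats, lons, indexing="ij")
    glob = np.abs(lat2d) <= 60.0
    natl = ((lat2d >= 0.0) & (lat2d <= 60.0)
            & (lon2d >= lon_bounds[0]) & (lon2d <= lon_bounds[1]))
    return area_mean(sst, lats, natl) - area_mean(sst, lats, glob)
```

Longitudes are assumed to run from −180° to 180° in this example; a 0°–360° grid would need the bounds adapted accordingly.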

We do not low-pass filter the AMO and PDO time series to avoid spurious correlation that may result from the reduction of the degrees of freedom (Trenary and DelSole, 2016). Seasonal indices are derived from the monthly time series.

The AMO and PDO indices are derived separately in every dataset to ensure their robustness. Both indices correlate well across all reanalysis datasets (0.93–1.0 for PDO, 0.93–0.99 for AMO), as well as between reanalyses and the corresponding original time series (M97 for PDO, correlations 0.92–0.94; unsmoothed TS06 for AMO, correlations 0.93–1.00).

2.4.

### Statistical analyses

We examine the relation of circulation features to the AMO and PDO by separating their time series into terciles corresponding to positive, neutral, and negative phases. Table 3 summarizes the different regions used in this study and Fig. 1 depicts the locations of these regions. Although terciles may be a rather subjective way to separate the groups, using higher percentiles to define the negative and positive phase of the AMO and PDO did not significantly change the results.
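The tercile separation and the resulting composites can be sketched as follows. The function names are hypothetical, and the example assumes one index value and one field per season.

```python
import numpy as np

def tercile_phases(index):
    """Label each value as -1 (negative), 0 (neutral) or 1 (positive phase)."""
    lower, upper = np.quantile(index, [1.0 / 3.0, 2.0 / 3.0])
    return np.where(index < lower, -1, np.where(index > upper, 1, 0))

def composite(field, phases, phase):
    """Mean of field (time as first axis) over all time steps in one phase."""
    return field[phases == phase].mean(axis=0)
```

A composite anomaly is then the difference between `composite(field, phases, phase)` and the mean over all time steps.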

Fig. 1.

Annual mean distribution of the storm track (given by the mean storm track activity [m]) in the early 20th century (1901–1930, a) and present (1981–2010, b). Panels c and d show the annual mean distribution of blocks (given by the blocking frequency [%/year]) in the early 20th century and present, respectively. The filled grey-scale contour denotes the climatological fields in 20CR. The coloured contours denote the 6% blocking frequency contour and the 50 m band-pass filtered 500 hPa geopotential height variability contour. The rectangles depict the subdomains used in this study. For blocks: Greenland (20°–60°W, 60°–75°N), Scandinavia (15°W–40°E, 50°–70°N), NPA (130°–230°E, 60°–75°N). For storm tracks: NAE (70°W–10°E, 40°–60°N), NPA (150°–230°E, 40°–60°N).

For composite maps, we used bootstrapping with 20,000 iterations, randomly sampling one third of the data without replacement (e.g. Wilks, 2006), to determine the significance of a signal. We employ the false discovery rate (FDR) method by Wilks (2016) to address the problem of field significance. According to the recommendation for spatially correlated data, we use a significance level of ${\alpha }_{\mathit{FDR}}=0.1$, which corresponds to a global test level of ${\alpha }_{\mathit{glob}}=0.05.$ This results in a local test level (i.e. of an individual grid point) of ${p}_{\mathit{FDR}}^{*}=\max_{i=1,\dots ,N}\left[{p}_{(i)}:{p}_{(i)}\le (i/N)\,{\alpha }_{\mathit{FDR}}\right]$, where ${p}_{(i)}$ are the sorted local p-values of the $N$ grid points. We only consider the mid-latitudes (i.e. 35°–75° N and 35°–75° S).
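The two steps of this significance test can be sketched as follows. The example is a simplified illustration under stated assumptions: the two-sided bootstrap p-value, the function names, the default number of iterations and the control level `alpha_fdr=0.10` (twice the global test level, following the recommendation in Wilks, 2016, for spatially correlated data) are choices made for this sketch and may differ from the study's implementation.

```python
import numpy as np

def bootstrap_pvalues(field, selected, n_boot=2000, seed=0):
    """Local p-values for a composite anomaly from subsampling.

    field    : array (ntime, npoints)
    selected : boolean array (ntime,), time steps forming the composite
    """
    rng = np.random.default_rng(seed)
    n = field.shape[0]
    k = int(selected.sum())
    clim = field.mean(axis=0)
    observed = np.abs(field[selected].mean(axis=0) - clim)
    exceed = np.zeros(field.shape[1:])
    for _ in range(n_boot):
        idx = rng.choice(n, size=k, replace=False)   # sample without replacement
        exceed += np.abs(field[idx].mean(axis=0) - clim) >= observed
    return (exceed + 1.0) / (n_boot + 1.0)

def fdr_threshold(pvals, alpha_fdr=0.10):
    """Largest sorted p-value p_(i) with p_(i) <= (i / N) * alpha_fdr, else 0."""
    p = np.sort(np.ravel(pvals))
    n = p.size
    ok = p <= alpha_fdr * np.arange(1, n + 1) / n
    return p[ok].max() if ok.any() else 0.0
```

Grid points whose local p-value does not exceed the returned threshold are considered significant; if the threshold is zero, no grid point passes the test.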

For CTs, significance at the 95% confidence level is tested by an augmented Welch’s t-test that accounts for autocorrelation (i.e. the one-year lagged autocorrelation AR1) of seasonal CT frequencies by reducing the effective sample size from n to $n(1-{\mathit{AR}}_{1})/(1+{\mathit{AR}}_{1})$ if ${\mathit{AR}}_{1}>0$ (Zwiers and von Storch, 1995). The significance of trends in time series is computed on the basis of a Student t-test applied to a least-squares linear regression.
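A minimal sketch of a Welch test with an autocorrelation-adjusted sample size is shown below. It assumes that the lag-1 autocorrelation is estimated separately for each of the two samples, which may differ from how it was estimated in the study; the function names are hypothetical, and only the test statistic and the degrees of freedom are returned.

```python
import numpy as np

def lag1_autocorr(x):
    """Lag-1 autocorrelation of a 1-D series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float((x[:-1] * x[1:]).sum() / (x * x).sum())

def effective_n(x):
    """Sample size reduced to n (1 - r1) / (1 + r1) if r1 > 0."""
    r1 = lag1_autocorr(x)
    n = len(x)
    return n * (1.0 - r1) / (1.0 + r1) if r1 > 0 else float(n)

def welch_t_adjusted(a, b):
    """Welch t statistic and degrees of freedom using effective sample sizes."""
    na, nb = effective_n(a), effective_n(b)
    va = np.var(a, ddof=1) / na
    vb = np.var(b, ddof=1) / nb
    t = (np.mean(a) - np.mean(b)) / np.sqrt(va + vb)
    dof = (va + vb) ** 2 / (va ** 2 / (na - 1.0) + vb ** 2 / (nb - 1.0))
    return t, dof
```

The p-value follows from comparing `t` with a Student t distribution with `dof` degrees of freedom. For very short or strongly autocorrelated series the effective sample size can approach one, in which case the test is no longer meaningful.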

We provide estimates for the similarity of anomaly patterns using map correlations that are obtained by calculating the correlation of latitude-weighted grid points within a domain.
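Such a map correlation can be sketched as a weighted Pearson correlation. The example assumes cosine-latitude weights and a centred correlation (domain means removed); both are assumptions for illustration, as is the function name `pattern_correlation`.

```python
import numpy as np

def pattern_correlation(a, b, lats):
    """Cosine-latitude weighted correlation of two (nlat, nlon) anomaly maps."""
    w = np.cos(np.deg2rad(lats))[:, None] * np.ones(a.shape)
    w = w / w.sum()
    a_mean = (w * a).sum()
    b_mean = (w * b).sum()
    cov = (w * (a - a_mean) * (b - b_mean)).sum()
    var_a = (w * (a - a_mean) ** 2).sum()
    var_b = (w * (b - b_mean) ** 2).sum()
    return cov / np.sqrt(var_a * var_b)
```

The weighting prevents the densely spaced grid points at high latitudes from dominating the correlation.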

3.

## Results

3.1.

### Climatology and trends during the 20th century

3.1.1.

#### Blocks

A comparison of early 20th century blocking climatology (1901–1930, Fig. 1a) shows that all datasets capture the main blocking regions over the North Pacific, Greenland and Scandinavia. 20CR and 20CRv2c contain more blocks over Greenland and northern Canada than ERA-20C and CERA-20C. Supplementary Fig. S2 provides a more detailed view on the differences in the early 20th century. These differences vanish when a more recent period (1981–2010) is investigated (Fig. 1b). Here, differences are comparably small. Regional time series for the Greenland region (Fig. 2), the Scandinavian region (Supplementary Fig. S3), the high latitude North Pacific region (NPA; Supplementary Fig. S4) and the South Pacific region (SPA; Supplementary Fig. S5) confirm this finding. CERA-20C generally shows fewer blocks, but the difference is greatest in the early 20th century, particularly in the SPA region (see also Supplementary Fig. S2).

Fig. 2.

Seasonal blocking frequencies [fraction of blocked time steps in %] over the Greenland domain (60°–75°N, 20°–60°W) between 1851 and 2014. A 5-year running mean is applied for better readability. For multi-member datasets, the mean and the 10th and 90th percentiles are shown.

The model ensemble ERA-20CM generally agrees with the constrained reanalyses in terms of blocking locations and frequencies (Fig. 1a and 1b). Differences between ERA-20C and ERA-20CM cannot be attributed to the oceanic forcing, but must originate from the assimilation of surface observations in ERA-20C as opposed to the unconstrained ERA-20CM run. Northern hemispheric winter (DJF) and summer (JJA) blocks tend to be underestimated during the late 20th and early 21st century compared to centennial reanalyses, while spring (MAM) and autumn (SON) are modelled adequately (Figs. 1a, 1b, and 2, Supplementary Figs. S3 and S4).

The experimental ERA-PreSAT reanalysis shows an interannual variability similar to that of surface-input reanalyses. In summer, ERA-PreSAT contains 10% (20CR) to 30% (CERA-20C) more blocks than surface-input reanalyses over Greenland and Scandinavia, as shown in Fig. 2 and Supplementary Fig. S3. In winter, the blocking frequency is comparable to CERA-20C and ERA-20C but 7% lower than in 20CR and 20CRv2c.

Trend analyses reveal differing trends for Greenland summer blocks between 1901 and 2010 with a significant increase (decrease) in ERA-20C and CERA-20C (20CR and 20CRv2c, not shown). Between 1950 and 2010, Greenland summer blocks significantly increase in 20CR and 20CRv2c (Table 3). This underlines the effect of the high blocking frequency over Greenland prior to the 1950s. Over Scandinavia, we report a significant increase of summer blocks in all reanalyses except 20CR for the period 1901–2010. In contrast, no significant trend is found for the period 1950–2010 (not shown). Over NPA, a consistent increasing trend in JJA blocks is found between 1950 and 2010. Over SPA, trends depend strongly on the reanalysis. The sign of (significant) trends is found to differ among reanalyses for all seasons.

Correlation coefficients of seasonal blocking frequencies between centennial reanalyses for the period 1901–2010 are generally highest in winter (DJF; Fig. 3a). The median seasonal blocking frequency among members is used in correlations for multi-member reanalyses. Correlation coefficients for the first half of the 20th century (not shown) are not much lower (mostly 0.1–0.3 lower) than for more recent times, showing that surface-input reanalyses are able to capture interannual variability of blocks in the early 20th century.

Fig. 3.

Correlation coefficients of the seasonal blocking frequency between two datasets over the Greenland domain (a) and the seasonal storm track activity over the NAE domain (b) between 1901 and 2010. The four numbers denote the correlation coefficient in winter (DJF, upper left), spring (MAM, upper right), summer (JJA, lower left) and autumn (SON, lower right).

The spread among the ensemble members indicates the level of constraint of the reanalysis obtained by the data assimilation for a particular region. The ensemble spread for blocking frequency almost uniformly increases prior to 1950 (denoted by the Q010–Q090 range in Fig. 2, although a 5-year running mean was used for better readability) in CERA-20C, 20CR and 20CRv2c over the Scandinavian domain, and is always lower than the ensemble spread in ERA-20CM. Between 1871 and 1920, the ensemble spread of both 20CR reanalyses is larger than in ERA-20CM. The ensemble spread decreases by a factor of roughly two before 1870 compared to 1871–1920 for 20CRv2c over NPA (Supplementary Fig. S4), a value that is comparable to the ensemble spread of the model simulation (ERA-20CM in the 20th century). The timing of the sudden changes in the ensemble spread coincides with changes in the blocking frequency over Greenland (Fig. 2; most pronounced in summer) and with changes in the temporal variation of the covariance inflation parameter in the data assimilation of 20CR (see Table 1 in Compo et al., 2011), indicating that these adaptions alter the mid-tropospheric circulation.

Uncertainties are lowest over Scandinavia, consistent with the fact that more observations are available for this region. This is also true in the early 20th and late 19th century. The ensemble spread is generally larger over SPA (Supplementary Fig. S5) and interannual variability is not captured as well as in the Northern Hemisphere.

3.1.2.

#### Storm tracks

The climatology of STA in the early 20th century (1901–1930, Fig. 1c) shows two particularly active areas over the North Atlantic and the North Pacific. This is in line with many other studies (e.g. Chang et al., 2002; Hoskins and Hodges, 2002; Neu et al., 2013). 20CR and 20CRv2c show a higher STA in the high latitudes than ERA-20C and CERA-20C. In 20CR and 20CRv2c, STA is up to twice as high in 1901–1930 over the Arctic compared to 1981–2010, while the main storm track regions show a comparably small difference (North Atlantic ∼0%, North Pacific up to 15%, Fig. 1d). CERA-20C has an up to 10% (3%) higher STA over the North Atlantic (North Pacific) than ERA-20C.

For CERA-20C and ERA-20C, the STA is up to 60% higher between 1981 and 2010 than in the period 1901–1930 over the main storm track regions (Fig. 1d). This explains why the areas with a band-pass filtered 500 hPa geopotential height variability larger than 50 m are smaller in Fig. 1c than in Fig. 1d.

Trends in STA between 1901 and 2010 depend strongly on the reanalysis. While both centennial ECMWF reanalyses show significant positive trends in STA over NAE in all seasons, 20CR (20CRv2c) shows a significant negative trend in summer and winter (only summer). Trends between 1950 and 2010 agree well across all reanalyses and show an increase of STA in winter over NAE, but disagree on the sign of the trend in summer (Table 3). Differences between reanalyses are larger over NPA and SPA and particularly large during summer months (Table 3, also Supplementary Figs. S6 and S7). A large increase in STA is visible over SPA after 1950, particularly in austral summer (DJF) for 20CR and 20CRv2c.

All four reanalyses agree on the interannual-to-decadal variations of STA over NAE (Fig. 4). Correlations of seasonal STA between reanalyses for the period 1901–2010 are positive, except in summer. They are generally very high for reanalyses from the same institution (>0.9) and otherwise between 0.40 and 0.75 (Fig. 3b). In summer, only weak correlations (−0.3 to 0) between NOAA/CIRES and ECMWF reanalyses are found; however, these correlations are obscured by the large step-like increase in STA in the 1940s in CERA-20C and ERA-20C, evident in Fig. 4. If only the period 1901–1940 is examined, high correlations (>0.54) between reanalyses for summer STA are evident. This indicates that the interannual variability of STA is captured in all seasons over the entire 20th century, with the exception of the large increase in STA in ECMWF reanalyses in the 1940s. This step-like increase in STA is also present in the unfiltered Z500 variability (not shown).

Fig. 4.

Storm track activity (500 hPa geopotential height band-pass filtered variability [m]) over the North Atlantic/European domain (40°–60°N, 70°W–10°E) between 1851 and 2014. For multi-member datasets, the multi-member mean and the 10th and 90th percentiles are shown. Note that the y-axis is different for each panel in order to present results in greater detail.

The model simulation ERA-20CM generally shows an STA comparable to that of the reanalyses in terms of magnitude and location, although winter (summer) STA is about 7–11% higher (4–5% lower) in the North Atlantic compared to surface-input reanalyses between 1981 and 2010. It is worth noting that the step-like increase during the 1940s is not detectable in the ERA-20CM model ensemble, which suggests that the data assimilation is responsible for this increase.

ERA-PreSAT closely follows the two centennial ECMWF reanalyses, including the rapid increase in STA in summer in the 1940s. Except in winter (DJF), the absolute STA is generally higher in ERA-PreSAT than it is in CERA-20C and ERA-20C, particularly after 1950. For reference, full-input reanalyses such as ERA-interim (Dee et al., 2011) agree well with surface-input reanalyses over their common time periods (usually after 1980), but indicate a slightly (on the order of a few percent) higher STA in all seasons (not shown).

NPA shows large disagreement in terms of absolute STA before 1950, especially in summer, where STA can be up to twice as high in 20CR and 20CRv2c as in ERA-20C and CERA-20C. Interannual STA variability between surface-input reanalyses after 1950 is highly correlated, while only weak correlation is found before 1950 in winter, spring, and autumn (Supplementary Fig. S6). Over SPA, we report good agreement between surface-input reanalyses from the same institution, but substantial variation between ECMWF and NOAA reanalyses (Supplementary Fig. S7).

This disagreement among reanalyses in the representation of STA before the 1950s corresponds well to the disagreement found for blockings, e.g. the increased blocking frequency over northern Canada matches the high STA over high latitudes. The increased variability in the mid-troposphere in the high latitudes indicates a northward shift of weather systems in NOAA/CIRES reanalyses before 1950.

3.1.3.

#### Circulation types

Figure 5 shows the annual frequency of CAP9 CTs over the Alpine region. Both 20CR reanalyses are characterized by a higher frequency of easterlies for the entire period in line with van den Besselaar et al. (2011) and Rohrer et al. (2018). The interannual variability is very similar to the two centennial ECMWF reanalyses. Prior to 1960, agreement of CTs among reanalyses decreases. Most notably, the representation of high pressure situations over the Alps (A Alps) and flat cyclonic west-southwesterlies (WSWcf) substantially differs in the late 1910s. Nonetheless, agreement among surface-input reanalyses is better for CTs than for storm tracks during the first half of the century.

Fig. 5.

Annual circulation type frequency over the Alpine domain (41°–52°N, 3°–20°E) between 1851 and 2014 for CAP9 (Cluster Analysis of Principal components with 9 types). A 5-year running mean is applied for better readability. For multi-member datasets, the mean and the 10th and 90th percentiles are shown.

The ERA-20CM model ensemble captures the frequency of CTs over the Alpine region. There is a tendency to simulate the type ‘westerlies over southern Europe’ (W SEUc) too often at the expense of the type ‘flat cyclonic west-southwesterlies’ (WSWcf). The two other CTCs applied, GWT10SLP and GWT10Z500, confirm the finding that ERA-20CM captures the frequency of CTs very well, despite a slight overestimation of northerlies at the surface in GWT10SLP (not shown).

3.2.

### Links between low-frequency oceanic modes and mid-latitudinal flow and weather systems

Because of the large discrepancies between reanalyses before the 1940s, we focus on results for the period 1950–2010. An example shown later illustrates how the inclusion of the years prior to 1950 affects the results.

Figure 6 shows the Z500 anomalies for positive (>67% quantile), negative (<33% quantile), and neutral (in between) phases of the AMO and PDO in winter (DJF). Reanalyses agree on the Z500 anomaly pattern during different phases of the AMO (Fig. 6a). The positive phase of the AMO is characterized by positive Z500 anomalies over Eastern Europe and Greenland and negative Z500 anomalies over Scandinavia. During the negative phase of the AMO, the opposite is true, with negative Z500 anomalies over Eastern Europe and Greenland and a positive Z500 anomaly over Scandinavia. No significant links between Z500 and the AMO are found after using the FDR method. For ERA-20CM we find virtually no changes of Z500 depending on the AMO phase over the North Atlantic and Europe. Calculating pattern correlations between the Z500 anomalies of different datasets over the North Atlantic (40°–75°N, 70°W–10°E) reveals a very high spatial correlation between the different reanalyses for each phase of the AMO (>0.975). ERA-20CM shows positive spatial correlation coefficients for the positive AMO phase (0.37 to 0.49), while the neutral phase is anti-correlated (−0.53 to −0.56) with reanalyses. The negative AMO phase shows virtually no correlation between reanalyses and ERA-20CM.

Fig. 6.

The winter (DJF) 500 hPa geopotential height anomalies during the negative (left), neutral (middle), and positive (right) terciles of the AMO (upper half) and PDO (lower half) between 1950 and 2010. Four surface-input reanalyses (20CR, 20CRv2c, ERA-20C and CERA-20C) and the ERA-20CM model simulation are shown. In case of multi-member datasets, the multi-member mean is shown. No significant anomalies (at the 95% confidence level) were found in any of the panels using the FDR method (Wilks, 2016) and 20,000 bootstrap iterations.

For a negative PDO, a positive Z500 anomaly is found over the North Pacific, and a negative anomaly ranging from the Rocky Mountains over the Bering Strait to East Siberia (Fig. 6b). The positive anomaly shifts poleward during the neutral phase of the PDO, and the negative anomaly is then centred over Canada, with a second one over Northern Central Siberia. During the positive PDO phase, we find a negative Z500 anomaly over the North Pacific and positive Z500 anomalies over Siberia and Canada. ERA-20CM agrees well with these patterns, although the magnitude of the response is smaller and more symmetric. Consequently, pattern correlations of ERA-20CM with reanalyses are high (>0.5) for the positive and negative PDO phase over the North Pacific (40°–75°N, 150°–240°E) but not over the North Atlantic. Similar to the AMO, the anomaly patterns of the different reanalyses over the North Atlantic and North Pacific are highly correlated (r > 0.95).

The agreement between reanalyses is also high in summer (JJA, not shown). A quasi-hemispheric increase (decrease) is found for the positive (negative) phase of the AMO for ERA-20CM. Reanalyses agree on this, although only ERA-20C and CERA-20C show significant signals over Greenland, the European mainland and the Indian Ocean.

3.2.1.

#### Blockings

We now address how the Z500 anomalies translate to blocking anomalies. Figure 7 shows that the signal is again independent of the reanalysis used. No significant anomalies are found for any AMO or PDO phase. We find fewer winter blocks in an arc ranging from Western Europe over Scandinavia and Siberia to the Bering Strait and over Northern Greenland and more blocks over Southern Greenland and more winter blocks over Scandinavia during the positive AMO phase. During the negative AMO phase, the opposite can be observed.

Fig. 7.

The winter (DJF) blocking anomalies during the negative (left), neutral (middle), and positive (right) terciles of the AMO (upper half) and PDO (lower half) between 1950 and 2010. In case of multi-member datasets, the multi-member mean is shown. No significant anomalies (determined by the 95% CI) were found in any of the panels using the FDR (Wilks, 2016) and 20 000 bootstrap iterations.
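The tercile composites and the bootstrap confidence interval mentioned in the captions can be sketched as follows. This is a minimal illustration under stated assumptions: years are split into three rank-based groups of the index, and the 95% interval comes from composites of randomly resampled years (with replacement). The exact resampling scheme used for the figures is not described here, and all names, the synthetic data and the reduced number of iterations (2000 instead of 20 000) are hypothetical.

```python
import numpy as np

def tercile_composites(index, field):
    """Composite anomalies of `field` (year, ...) for the three terciles of `index`."""
    index = np.asarray(index, dtype=float)
    field = np.asarray(field, dtype=float)
    anom = field - field.mean(axis=0)          # anomalies w.r.t. the full-period mean
    ranks = np.argsort(np.argsort(index))      # 0 = lowest index value
    group = ranks * 3 // index.size            # 0, 1, 2 = lower, middle, upper tercile
    names = ("negative", "neutral", "positive")
    comps = {name: anom[group == g].mean(axis=0) for g, name in enumerate(names)}
    sizes = {name: int((group == g).sum()) for g, name in enumerate(names)}
    return anom, comps, sizes

def bootstrap_interval(anom, n_sample, n_boot=2000, alpha=0.05, seed=0):
    """Percentile interval of composites built from randomly drawn years."""
    rng = np.random.default_rng(seed)
    picks = rng.integers(0, anom.shape[0], size=(n_boot, n_sample))
    means = anom[picks].mean(axis=1)           # one resampled composite per iteration
    return np.quantile(means, [alpha / 2, 1 - alpha / 2], axis=0)

# Synthetic example: 61 winters, 5 grid points, field linearly tied to the index
rng = np.random.default_rng(1)
sst_index = rng.standard_normal(61)
z500 = 2.0 * sst_index[:, None] + rng.standard_normal((61, 5))

anom, comps, sizes = tercile_composites(sst_index, z500)
lower, upper = bootstrap_interval(anom, sizes["positive"])
outside = (comps["positive"] < lower) | (comps["positive"] > upper)
```

Grid points flagged in `outside` would be candidates for significance; the resulting local tests would still need a multiple-testing correction such as the FDR before any stippling is drawn.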

Unlike the AMO, the imprint of the PDO on blocking is not symmetric. For example, blocks are in general less frequent over continental Eurasia during the neutral phase of the PDO and more frequent during the negative and positive phases. In contrast, there is an increase of Greenland blocks during the neutral phase and a decrease of Greenland blocks during the negative and positive PDO phases. However, we report no significant changes for the period 1950–2010. ERA-20CM shows no changes in blocks related to the PDO.

Pattern correlations of block anomalies over the North Atlantic related to the AMO/PDO are again high for reanalyses, although not as high as for Z500 anomalies (>0.74). The model simulation ERA-20CM does not capture the patterns obtained from centennial reanalyses (−0.47 to 0.49) over the North Atlantic and North Pacific.

Consequently, Z500 anomalies do not necessarily translate to block anomalies. In the ERA-20CM model ensemble, the different phases of the AMO and PDO do not have any influence on the frequency of blocks. This also holds true in summer (not shown), when no significant changes are detected. We conclude that, although our sample of 61 years is small in relation to the time scale of the AMO and PDO, a linear response of the circulation to either phenomenon (AMO or PDO) may not be presumed.

3.2.2.

#### Storm tracks

For STA, we do not find any significant changes depending on the phase of the AMO between 1950 and 2010, but the signal is very similar in all reanalyses (Fig. 8). Over the North Atlantic (Greenland) an increase (a decrease) in STA can be seen during the negative AMO phase, indicating a strengthening of the North Atlantic storm track. During the positive AMO phase the central North Atlantic experiences a STA decrease, while a belt ranging from Greenland over southern Scandinavia to western Central Europe tends towards a higher STA. An increase of STA can be noted over the Iberian Peninsula during the neutral phase, whereas STA decreases over Greenland and Scandinavia. ERA-20CM does not simulate any large change in response to the AMO.

Fig. 8.

The winter (DJF) storm track anomalies during the negative (left), neutral (middle), and positive (right) terciles of the AMO (upper half) and PDO (lower half) between 1950 and 2010. In case of multi-member datasets, the multi-member mean is shown. The stippling denotes areas with significant anomalies (determined by the 95% CI) after applying the FDR (Wilks, 2016) and 20 000 bootstrap iterations.

In case of the PDO, we detect a significant signal in STA over the North Pacific between 1950 and 2010. During the negative (positive) PDO phase there is a northward (southward) shift and extension (contraction) of the storm track (Fig. 8). This signal is also significant in the ERA-20CM model ensemble. Over the North Atlantic the negative and neutral PDO phases are associated with a more active storm track around 70°N (Greenland and Scandinavia) while the STA decreases over the mid-latitudes. For the positive PDO phase the opposite is true.
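The stippling in Fig. 8 relies on controlling the false discovery rate across all grid points. A minimal sketch of the underlying Benjamini-Hochberg step, as discussed by Wilks (2016), is given below. The function name, the control level `alpha_fdr` and the p-values are illustrative assumptions, not the settings used for the figures.

```python
import numpy as np

def fdr_mask(p_values, alpha_fdr=0.05):
    """Boolean mask of local tests that remain significant under FDR control."""
    p = np.asarray(p_values, dtype=float)
    flat = np.sort(p.ravel())
    n = flat.size
    # The i-th smallest p-value is compared with alpha_fdr * i / n
    thresholds = alpha_fdr * np.arange(1, n + 1) / n
    passed = flat <= thresholds
    if not passed.any():
        return np.zeros(p.shape, dtype=bool)
    p_star = flat[passed].max()      # largest p-value that meets its threshold
    return p <= p_star

# Hypothetical local p-values on a 2 x 5 grid
p_grid = np.array([[0.001, 0.008, 0.039, 0.041, 0.042],
                   [0.060, 0.074, 0.205, 0.212, 0.216]])
stipple = fdr_mask(p_grid, alpha_fdr=0.05)
naive = p_grid <= 0.05               # uncorrected local tests, for comparison
```

In this toy example the uncorrected test flags five grid points, whereas only two survive the FDR criterion, which illustrates why few anomalies remain significant once the correction is applied.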

Supplementary Fig. S8 shows how the inclusion of the first half of the 20th century drastically changes results. Here, large and significant differences between NOAA and ECMWF centennial reanalyses are discernible. They are an imprint of the decrease in STA in ECMWF reanalyses before the 1950s and of the increase in STA over the high-latitudes in NOAA reanalyses combined with the uneven distribution of these years into the different AMO/PDO phases, highlighting the fact that care must be taken when analysing decadal variability or long-term trends in long reanalysis data sets.

Pattern correlations over the North Atlantic and North Pacific are very high (>0.88) for reanalyses in all AMO/PDO phases. In addition, ERA-20CM reproduces the STA anomaly patterns during the positive (∼0.7) and negative (∼0.5 for the North Atlantic, ∼0.9 for the North Pacific) PDO phase.

The connection between the PDO and summer STA over the North Pacific is visible, but an order of magnitude smaller than in winter (not shown). All reanalyses and ERA-20CM show a similar pattern, and the pattern in ERA-20CM is partly significant. An increase (decrease) over higher latitudes is visible for the negative (positive) PDO phase.

3.2.3.

#### Circulation types

Finally, we discuss the influence of the AMO and PDO on the frequency of CAP9 CTs (Fig. 9), i.e. the atmospheric circulation over Central Europe. We find no significant differences in CT frequency between different AMO or PDO terciles that are found consistently throughout all reanalyses. Albeit not significant, reanalyses consistently show similar connections between CTs and AMO and PDO, respectively. Results for GWT are very similar (not shown).

Fig. 9.

Normalized circulation type frequencies for CAP9 (Cluster Analysis of Principal components with 9 types) during high (triangle facing up), neutral (circle), or low (triangle facing down) AMO (upper half) and PDO (lower half) indices between 1950 and 2010 in winter (DJF). Different reanalyses (20CR orange, 20CRv2c red, CERA-20C blue, and ERA-20C black) and the ERA-20CM model simulation (cyan) are depicted. For multi-member datasets, the multi-member mean is shown. The lower panel shows significant differences between the neutral and negative terciles (upper row), the positive and neutral terciles (middle row), and the positive and negative terciles (lower row). Significance is tested with a Welch’s t-test augmented to account for autocorrelation (e.g. Wilks, 2006). See Table 1 for a short description of the circulation types.
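One common way to account for autocorrelation in a Welch's t-test is to replace the sample sizes by effective sample sizes based on the lag-1 autocorrelation, n' = n(1 − r1)/(1 + r1). The sketch below follows that idea. It is an assumption that the test used for Fig. 9 takes this form, and treating the years of one tercile as a contiguous series when estimating r1 is a simplification. Function names and data are hypothetical.

```python
import numpy as np

def lag1_autocorr(x):
    """Lag-1 autocorrelation of a one-dimensional series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float((x[:-1] * x[1:]).sum() / (x * x).sum())

def effective_size(x):
    """Effective sample size n' = n (1 - r1) / (1 + r1)."""
    r1 = max(lag1_autocorr(x), 0.0)   # only shrink n for positive persistence
    return len(x) * (1.0 - r1) / (1.0 + r1)

def welch_autocorr(x, y):
    """Welch t statistic and degrees of freedom using effective sample sizes."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    nx, ny = effective_size(x), effective_size(y)
    vx = x.var(ddof=1) / nx           # variance of the mean of x
    vy = y.var(ddof=1) / ny           # variance of the mean of y
    t = (x.mean() - y.mean()) / np.sqrt(vx + vy)
    # Welch-Satterthwaite degrees of freedom with n replaced by n'
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df

# Hypothetical CT frequencies in two terciles (alternating values, so r1 < 0)
freq_low = np.array([1.0, -1.0] * 10)
freq_high = np.array([2.0, 0.0] * 10)
t_stat, dof = welch_autocorr(freq_low, freq_high)
n_eff_trend = effective_size(np.arange(20.0))   # strongly persistent series
```

A two-sided p-value follows from the Student t distribution with `dof` degrees of freedom, for example via `scipy.stats.t.sf`. For series without positive persistence the result reduces to the ordinary Welch test, while persistent series are effectively treated as much shorter samples.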

In the model simulation ERA-20CM only weak connections of the CTs and the oceanic modes of variability are found compared to the reanalyses. Although the 610 years (10 × 61 years) only capture a few full AMO and PDO cycles, this indicates that these two oceanic modes do not play a significant role in changing Alpine CTs or that their role is masked by interannual variability. Results do not change if the full 1901–2010 period is taken into account (not shown). This indicates that sea level pressure may be more reliably represented in centennial reanalyses than 500 hPa geopotential, which seems plausible considering that they only assimilate surface pressure data.

Summer results (not shown) contain few significant signals. Only the cyclonic NorthEasterlies CT is significantly more frequent during the neutral phase of the PDO compared to the positive and negative phases. Hence, we report an asymmetric response of this CT to the PDO, in agreement with blocking and STA results.

4.

## Discussion

Our analysis extended the work of Rohrer et al. (2018) and examined how well centennial surface-input reanalyses agree with each other in terms of mid-latitudinal flow features between 1901 and 2010.

Blocks, storm track activity (STA) over the North Atlantic/European (NAE) region and Alpine CTs are represented similarly in surface-input reanalyses in the 20th century in terms of interannual variability. Large differences between reanalyses in terms of blocking frequency and STA of up to 100% are detected prior to 1950, leading to differing trends in different reanalysis products. We conclude that currently available centennial reanalyses are generally not suitable for deriving linear trends, at least before 1950 and in the mid-troposphere. Over the North Pacific domain (NPA), the agreement after 1950 is comparable to that of NAE but declines before 1950, especially in boreal summer (JJA). The South Pacific region (SPA) shows contradicting results among reanalyses also after 1950, especially in austral summer (DJF).

Discrepancies among centennial reanalyses are not exclusive to mid-latitudinal and mid-tropospheric circulation features. Similar differences between reanalysis products are obtained for temperature and precipitation extremes by Donat et al. (2016) and for Eurasian snow depth by Wegmann et al. (2017a). Brands et al. (2017) mentioned large disagreement between 20CR and ERA-20C in the early 20th century for atmospheric rivers, particularly over the US West Coast. Chang and Yau (2016) and Wang et al. (2016b) provided an intercomparison of extra-tropical cyclones in 20CR and ERA-20C focusing on the late 20th century, when intercomparison with full-input reanalyses is possible. To our knowledge, only Befort et al. (2016) compare extra-tropical cyclones throughout the whole 20th century. They found substantial differences in the low-frequency variability between ERA-20C and 20CR especially prior to 1950, while the high-frequency variability shows better agreement. These findings are consistent with those of this study, which also incorporates the new CERA-20C reanalysis. Good agreement in the second half of the 20th century does not imply good agreement prior to the 1950s. Bloomfield et al. (2018) investigated the long-term changes in the wintertime surface Arctic Oscillation and Northern Hemisphere storminess in the ERA-20C reanalysis. They found a significant increase in wintertime storminess over the North Atlantic and North Pacific in ERA-20C due to a significant change in the Arctic Oscillation pattern. This is consistent with results presented in this study for ERA-20C and also its successor CERA-20C.

We suspect that inhomogeneities at least partially arise from temporal changes in the assimilation: In the case of 20CR some inhomogeneities (i.e. step changes) coincide with changes of the inflation of the covariance matrix (Compo et al., 2011; their Table 1). Large changes are particularly evident in 1870 for summer blocks over Greenland (Fig. 2c) and STA over the North Pacific in 20CRv2c (Supplementary Fig. S6c) or in 1920 for summer blocks and STA over the North Pacific (Supplementary Figs. S4c and S6c). In accordance, Brönnimann et al. (2012) noted that trend analyses on surface winds in 20CR are best performed after 1950, when the covariance inflation parameter remains globally constant. Wang et al. (2013) also point out that inhomogeneities in cyclone counts in 20CR are related to temporal changes in the density of observations and to changes in the covariance inflation matrix. They find, however, that storm track activity derived in 20CR is simulated in accordance with observations. On the other hand, Krueger et al. (2013) found differing trends in storminess in 20CR and in observations. We show that temporal changes in the covariance inflation parameter have a large impact on circulation metrics in the mid-troposphere, far away from the data assimilation at the surface. In the case of the ECMWF reanalyses, the temporally varying background error may lead to the reduced STA and blocking frequencies seen prior to 1950. A hint for this may be the large decrease in the background error of surface pressure between 1940 and 1960 shown in Poli et al. (2013) for ERA-20C.

In short, our results suggest that although synoptic time-scales might be represented adequately, the low-frequency variability differs substantially among centennial reanalyses, with every reanalysis containing artificial jumps in the temporal evolution of block frequency and STA. Due to these large discrepancies before 1950, we shortened the investigated period to 1950–2010 for our analysis of the connection between the AMO and PDO with mid-latitudinal features. Supplementary Fig. S8 showed how the inclusion of the years 1901–1949 drastically changes results in the case of STA.

We found only few significant connections between the two low-frequency oceanic modes and atmospheric circulation. Generally, centennial reanalyses show very similar anomalies (pattern correlations exceeding 0.74). Consistent with literature (e.g. Newman et al., 2016), a deepening (shallowing) of the Aleutian low is found during the positive (negative) phase of the PDO both in reanalyses and the ERA-20CM model ensemble, although it is not significant (Fig. 6). These PDO-related Z500 anomalies do not translate into a significant change of blocks (Fig. 7). A significant southward (northward) shift of the STA is discernible during the positive (negative) phase of the PDO (Fig. 8). The atmosphere-only model ensemble ERA-20CM displays the atmospheric response, which may point to the conclusion that the ocean drives the atmosphere in the Pacific. Studies such as Alexander et al. (2002) argued that the tropical Pacific interacts with the mid-latitudinal atmospheric circulation via the atmospheric bridge, which describes the atmospheric response to anomalous tropical Pacific SSTs.

Compared to e.g. Woollings et al. (2012) or Yamamoto and Palter (2016), we find weaker, non-significant relationships between AMO and mid-latitudinal features. The summer Z500 response is mainly thermodynamic with a thermal expansion (contraction) during the positive (negative) phase of the AMO, which is in line with O’Reilly et al. (2017). Häkkinen et al. (2011) found a link between blocks over the northern North Atlantic and the AMO. We do not find such a link. However, they looked at multi-decadally filtered time series while we used the unfiltered AMO time series, and they included March in their winter season. Furthermore, we only focused on the second half of the 20th century.

Establishing relations between oceanic variability modes and mid-latitude circulation features might be of interest for better interpreting climatic events such as the early 20th century warming (e.g. Thompson et al., 2015; Wegmann et al., 2017b; Hegerl et al., 2018). However, our assessment of the reliability of circulation features in reanalyses prior to 1950 precludes further analyses.

Despite their large differences, centennial reanalyses are important for climate science. By combining different data sources and using a state-of-the-art NWP model, they provide the best spatio-temporally complete estimate of the state of the atmosphere that we have. The ensemble approach of 20CR, 20CRv2c and CERA-20C, as well as the intercomparison of these datasets, provides a rough estimate of the uncertainties of a climate variable. For the period after 1950 surface-input reanalyses can provide an accurate state of the atmosphere at least up to the mid-troposphere when enough observations (of adequate quality) are available. This view is shared by e.g. Dell’Aquila et al. (2016), who intercompared mid-latitude wave activity in 20CR and ERA-20C.

Initiatives gathering and digitizing historical observations (Allan et al., 2011; Stickler et al., 2014; Buizza et al., 2018) are of great interest to potentially reduce the uncertainty of centennial reanalyses as well as enable an extension back in time. Likewise, the experimental product ERA-PreSAT shows the feasibility of the incorporation of early upper-air observations, although the sharp decline in STA present in CERA-20C and ERA-20C is still noticeable before 1950.

The novel CERA-20C reanalysis, the first centennial reanalysis to incorporate SSTs and salinity measurements, is very similar to ERA-20C both in terms of response to the AMO/PDO and in terms of climatology of the mid-latitudinal features.

The atmospheric ERA-20CM model ensemble is able to simulate the climatology of mid-latitudinal circulation features. The westerly wind bias over the North Atlantic/European region often found in model simulations (e.g. van Ulden and van Oldenborgh, 2006; Otero et al., 2018; Rohrer et al., 2018) is not found in ERA-20CM. The large PDO-related changes of the STA found over the North Pacific indicate that here the ocean forces the atmosphere. Over the North Atlantic, no significant signal is detectable in any season. Because SSTs are prescribed in ERA-20CM, this may imply that atmospheric circulation over the North Atlantic is not substantially driven by the SSTs. Kavvada et al. (2013) reported that GCMs struggle to correctly simulate the extratropical response of atmospheric circulation to the AMO.

In contrast to the reanalyses, ERA-20CM does not simulate trends in blockings, cyclones, or Alpine CTs. This indicates that either trends in centennial reanalyses prior to 1950 are unphysical or that the model simulation does not succeed in capturing the long-term trend.

5.

## Conclusion

We investigated (i) the representation of mid-latitudinal features such as blocks, storm tracks and CTs between 1901 and 2010 and (ii) their response behaviour with respect to the two most important multi-decadal oceanic modes, the AMO and the PDO, between 1950 and 2010. Our main conclusions are that:

• Centennial reanalyses are very similar to one another in the representation of blocks and storm tracks after 1950 in the Northern Hemisphere. Given enough observations, surface-only reanalyses succeed in capturing atmospheric circulation at least up to the mid-troposphere.
• Before 1950 discrepancies between reanalyses become larger, especially for storm track activity and blocks at high latitudes.
• These discrepancies can affect the outcome of studies depending on which reanalysis is employed. Here, we demonstrated this for storm track activity and its connection to the AMO and PDO. Studies focusing on year-to-year variability or case studies are more feasible than studies on trends before 1950.
• We find a significant connection between storm track activity and the PDO over the North Pacific and partially over the North Atlantic. Otherwise, no significant connections between the AMO or PDO and mid-latitudinal circulation are detected after applying the false discovery rate (FDR) method.
• The ERA-20CM model ensemble with prescribed SSTs adequately simulates the climatology of blocks, storm track activity and CTs, but in most cases spatial anomaly patterns related to the AMO do not correlate with reanalyses. This may be related to the missing feedback of the atmosphere to the ocean in ERA-20CM, which may be needed to model the impacts of the AMO, or it may indicate that no physical link exists between the AMO and the atmosphere. The strong and significant anomaly patterns of the storm track activity over the North Pacific related to the PDO are visible in centennial reanalyses and ERA-20CM, indicating that the (tropical) North Pacific may drive the atmosphere.

Based on these results, we note that centennial reanalyses provide a consistent estimate of the state of the atmosphere back to roughly 1950, including the low-frequency variability in storm tracks and blocks. Before 1950, inhomogeneities and substantial differences in the trend and low-frequency variability of storm tracks and blocks become apparent. Therefore, we recommend the use of different reanalysis products from different providers (i.e. ECMWF and NOAA for now) to study climate variability and to obtain an (incomplete) measure of uncertainty. If feasible, validation with independent sources of evidence (e.g. with homogenized observational data or indirect proxies) is recommended.