
We examine the perturbation update step of ensemble Kalman filters that rely on covariance localisation, and hence have the ability to assimilate non-local observations in geophysical models. We show that the updated perturbations of these ensemble filters should not be identified with the main empirical orthogonal functions of the analysis covariance matrix, in contrast with the updated perturbations of the local ensemble transform Kalman filter (LETKF). Building on this evidence, we propose a new scheme to update the perturbations of a local ensemble square root Kalman filter (LEnSRF) with the goal of minimising the discrepancy between the analysis covariances and the sample covariances regularised by covariance localisation. The scheme has the potential to be more consistent and to generate updated members closer to the model’s attractor (showing fewer imbalances). We show how to solve the corresponding optimisation problem and discuss its numerical complexity. The qualitative properties of the perturbations generated by this new scheme are illustrated using a simple one-dimensional covariance model. Moreover, we demonstrate on the discrete Lorenz–96 and continuous Kuramoto–Sivashinsky one-dimensional low-order models that the new scheme requires significantly less multiplicative inflation, and possibly none, to counteract imbalance, compared with the LETKF and the LEnSRF without the new scheme. Finally, we observe a gain in accuracy of the new LEnSRF as measured by the analysis and forecast root mean square errors, despite using well-tuned configurations where such a gain is very difficult to obtain.

The ensemble Kalman filter (EnKF) has been shown to be a successful data assimilation technique for filtering and forecasting in complex chaotic fluids (see Evensen,

Localisation assumes that correlations between spatially distant parts of the physical system decrease at a fast rate with the physical distance, e.g. exponentially. As a consequence, one can make the assimilation of observations local or, alternatively, artificially taper distant spurious correlations that emerge from sampling errors (Hamill et al.,

Domain localisation consists of a collection of local updates, e.g. centred on the grid points using nearby observations (Houtekamer and Mitchell,

The second type of localisation is covariance localisation, which is enforced through a direct tapering of all sample covariances. It is usually implemented as a Schur (element-wise) product of the sample covariance matrix with a correlation matrix whose entries decrease rapidly with distance. The output of the Schur product is mathematically guaranteed to be a covariance matrix and, with a properly chosen localisation correlation matrix, the regularised covariance matrix is likely to be full-rank.
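As a minimal illustration of this tapering (not the implementation used in the paper), the sketch below builds a rank-deficient sample covariance from a small ensemble and regularises it with a Schur product; a simple exponential taper on a periodic domain stands in for the usual Gaspari–Cohn function:

```python
import numpy as np

def taper_matrix(n, length):
    """Correlation matrix whose entries decay exponentially with grid distance."""
    idx = np.arange(n)
    dist = np.abs(idx[:, None] - idx[None, :])
    dist = np.minimum(dist, n - dist)   # periodic one-dimensional domain
    return np.exp(-dist / length)

def localise(B, rho):
    """Schur (element-wise) product of the sample covariance with the taper."""
    return rho * B

rng = np.random.default_rng(0)
n, m = 40, 10                          # state size, ensemble size
X = rng.standard_normal((n, m))        # illustrative anomalies
B = X @ X.T / (m - 1)                  # rank-deficient sample covariance
rho = taper_matrix(n, length=4.0)
B_loc = localise(B, rho)               # tapered, full-rank covariance
```

By the Schur product theorem, `B_loc` is again a covariance matrix; with a positive-definite taper it becomes full-rank even though `B` has rank at most the ensemble size.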

Even though they are based on the same diagnostic, the two types of localisation are distinct in their philosophy, as well as in their algorithmic and numerical implementation. Domain localisation does not allow assimilating non-local observations such as radiances without ad hoc approximations, but the scheme is embarrassingly parallel by nature. Covariance localisation is mathematically grounded in the tapering of the background covariance only and could hence be seen as a well-understood scheme, but its numerical implementation, which relies on a single global analysis, is much less simple, especially for deterministic EnKFs. In practice, the two schemes have been shown to coincide in the limit where the analysis is driven by the background statistics, i.e. weak assimilation (Sakov and Bertino,

Note that a third route for localisation is through the statistics technique known as

From a theoretical standpoint, the localisation schemes seem ad hoc in spite of their remarkable practical efficiency. There could be room for improvements based on theoretical considerations. For instance, localisation can be made multiscale (Buehner and Shlyaeva,

In this paper, we would like to revisit the perturbation update step of the EnKF when relying on covariance localisation. We especially focus on the local ensemble square root Kalman filter (LEnSRF). Traditional EnKF schemes offer a

In

In this study, the main focus is covariance localisation within deterministic EnKFs, and in particular the LEnSRF, defined as the ensemble square root Kalman filter with covariance localisation. Nonetheless, some of the results or remarks are likely to be valid for other variants of the EnKF.

The ensemble is denoted by the matrix

To fix this, covariance localisation uses a localisation (i.e. correlation) matrix

The mean analysis in the EnKF is then typically carried out using the Kalman gain matrix
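A minimal sketch of this mean update, assuming (for the example only) a linear observation operator `H` and observation error covariance `R`, with the gain built from the tapered background covariance:

```python
import numpy as np

def mean_update(xb, y, B_loc, H, R):
    """Update the ensemble mean with K = B_loc H^T (H B_loc H^T + R)^{-1}.

    B_loc is the Schur-tapered background covariance; H is a linear
    observation operator and R the observation error covariance (sketch).
    """
    S = H @ B_loc @ H.T + R               # innovation covariance
    K = np.linalg.solve(S, H @ B_loc).T   # gain, using the symmetry of S and B_loc
    return xb + K @ (y - H @ xb)

# Toy check: with B_loc = R = I and H = I, the analysis is the
# average of the background and the observations.
n = 3
xb = np.zeros(n)
y = np.ones(n)
xa = mean_update(xb, y, np.eye(n), np.eye(n), np.eye(n))
```

The `solve` call avoids forming the inverse of the innovation covariance explicitly, which is the usual numerically sound choice.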

Note that an alternative way to implement the mean update is to use the

With the local stochastic EnKF (Houtekamer and Mitchell,

The perturbation update with a local deterministic EnKF is not as straightforward, since localisation must be enforced in the square root update scheme in addition to the mean update. However, there are deterministic EnKFs where this operation is actually simple. In the DEnKF (Sakov and Oke,

In the absence of localisation, the perturbation update of a deterministic EnKF is rigorously implemented by a transformation on the right of the prior perturbation matrix (Bishop et al.,
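In this unlocalised setting, the right transform can be sketched as follows, assuming the convention (adopted here for the example) that the anomalies are scaled so that the forecast covariance is `Xf @ Xf.T / (m - 1)`, and using the unique symmetric square root:

```python
import numpy as np

def etkf_transform(Xf, H, R):
    """Right-transform update Xa = Xf T with T = (I + Y^T R^{-1} Y / (m-1))^{-1/2}.

    Xf: n x m matrix of centred forecast anomalies; H: linear observation
    operator; R: observation error covariance (sketch of the standard
    symmetric square root transform).
    """
    m = Xf.shape[1]
    Y = H @ Xf
    A = np.eye(m) + Y.T @ np.linalg.solve(R, Y) / (m - 1)
    w, V = np.linalg.eigh(A)
    T = V @ np.diag(w ** -0.5) @ V.T     # symmetric inverse square root
    return Xf @ T
```

One can check that `Xa @ Xa.T / (m - 1)` then coincides with the Kalman filter analysis covariance restricted to the ensemble subspace.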

The ^{1}

From now on, we shall omit the rotation matrices

The right-transform

An approximate update formula extrapolates

Similarly to

It is numerically challenging to apply

Independently from how it was obtained, this mode expansion can be written as

With such a mode expansion, the updated perturbation matrix reads:

This update was later rediscovered in Bishop et al. (

As proven in

Note that both

This type of update can make the LEnSRF numerically affordable, especially with parallelisation (Farchi and Bocquet,

In

The square root $\mathbf{X}_{a,r}$ of the regularised analysis error covariance matrix is the anomaly matrix of the $N_x$ updated perturbations. It is an exact (hence consistent by definition) representation of the uncertainty since it is readily checked that

Of course, this is only theoretical, since, in practice, we can only afford to generate and propagate

The reason behind this apparent paradox is that in a cycled LEnSRF experiment based on

In retrospect, this clarifies why

The perturbations of the serial LEnSRF, the DEnKF and the local stochastic EnKF follow the same paradigm. By contrast, the locally updated perturbations of the LETKF are meant to capture most of the uncertainty within each local domain. Hence, the anomalies of the forecast ensemble are representative of the main uncertainty modes, as opposed to those of the other EnKF schemes. However, even though the locally updated perturbations of the LETKF may offer better samples of the conditional pdf, this property could eventually fade away in the forecast because of their merely local validity.

Incidentally, this suggests that the LETKF could be better suited for short-term ensemble forecasting, which could be investigated in a future study. Numerical clues supporting this idea are nonetheless provided at the end of

We have just seen that the widespread view on the local EnKF perturbation update which assumes a low-rank extraction

For the LEnSRF update, we believe that it would be more consistent with how the perturbations are defined to look for a low-rank perturbation matrix

With the definition

This problem is similar to the ^{2}

Hence, we expect that our problem

The minimisation problem

Sequence of steps of a deterministic EnKF with covariance localisation, where the updated perturbations are obtained using the new scheme. Note that

With a view to efficiently minimising

This yields the matrix gradient
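The exact expressions were lost in extraction; as an illustrative toy (not the paper's parametrised scheme), one can minimise the squared Frobenius discrepancy $J(\mathbf{X}) = \|\mathbf{X}\mathbf{X}^\top - \mathbf{P}\|_F^2$, whose matrix gradient is $4(\mathbf{X}\mathbf{X}^\top - \mathbf{P})\mathbf{X}$, with a quasi-Newton minimiser acting on the flattened perturbation matrix:

```python
import numpy as np
from scipy.optimize import minimize

def fit_low_rank_sqrt(P, m, seed=0):
    """Seek an n x m matrix X minimising J(X) = ||X X^T - P||_F^2.

    Toy sketch of a gradient-based minimisation over the perturbation
    matrix; the gradient of J is 4 (X X^T - P) X.
    """
    n = P.shape[0]
    rng = np.random.default_rng(seed)

    def cost_and_grad(x):
        X = x.reshape(n, m)
        D = X @ X.T - P
        return np.sum(D * D), (4.0 * D @ X).ravel()

    x0 = 0.1 * rng.standard_normal(n * m)
    res = minimize(cost_and_grad, x0, jac=True, method="L-BFGS-B",
                   options={"maxiter": 5000, "ftol": 1e-15, "gtol": 1e-10})
    return res.x.reshape(n, m)
```

When the target $\mathbf{P}$ is positive semi-definite with rank at most $m$, the minimum of this toy cost is zero and the recovered $\mathbf{X}\mathbf{X}^\top$ matches $\mathbf{P}$.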

Instead of minimising

A parametrised minimisation can easily be implemented using the function

We use this parametrised minimisation in all the numerical experiments. However, the plain method using the unparametrised minimisation on

In

Finally, coming back to the definition of

Because we have offered a novel view on the posterior perturbations and how they are generated in the analysis, we now need to examine how the forecast step of the scheme is affected by this change of standpoint. Otherwise, there would be a risk of breaking the consistency of the forecast step of the cycle.

As previously explained at the end of

By contrast, with the new LEnSRF scheme, recognising that the posterior error covariance matrix is

With the assumption that the error dynamics are linear, which would only be valid on short time scales, Bocquet (_{k}

The tangent linear model _{k}

In the case where the dynamics can be approximated as hyperbolic, and in the limit where space is continuous, a closed-form equation can be obtained for

The key point is that in practice and for moderate forecast lead times, _{k}

This suggests that

In this section, we analyse the cost of computing the cost function and the gradient. Indeed, both would be required by a quasi-Newton minimiser and both involve

The cost function

As a consequence, the cost of evaluating

Thus, we need to consider the cost of evaluating both terms on the right-hand side. The normalising factor

Hence, for both the cost function and its gradient, we need to evaluate a first term in the form

It can be shown that

The numerical complexity of

If

If

Let us now consider the complexity of computing the second term. Assuming

If

If we assume that we have extracted

Note that

If the observations are assumed to be local, i.e. each one of them is only correlated to nearby model variables, then the main diagonals of

Hence, the evaluation of

Of course, one of the primary reasons for using covariance localisation is its ability to assimilate non-local observations. Hence, the assumption of locality made here defeats one of the key purposes of covariance localisation. Nonetheless, we shall see that even with local observations, the update scheme developed in

First, we are interested in comparing the shape of the updated perturbations of a standard scheme with those of the new scheme. We also wish to explore how much can

We compare the shape of

random draws

extracting the main

extracting
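These generation procedures can be sketched as follows (illustrative implementations only, with the target covariance $\mathbf{P}$ given; the optimisation-based extraction of the new scheme is omitted here):

```python
import numpy as np

def perturbations_random(P, m, rng):
    """m random draws with covariance P, via its symmetric square root."""
    w, V = np.linalg.eigh(P)
    L = V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T
    return L @ rng.standard_normal((P.shape[0], m))

def perturbations_modes(P, m):
    """Anomalies carrying the m leading eigenmodes (EOFs) of P."""
    w, V = np.linalg.eigh(P)
    order = np.argsort(w)[::-1][:m]
    return V[:, order] * np.sqrt(np.clip(w[order], 0.0, None))
```

The first routine samples the target covariance stochastically, while the second extracts its main empirical orthogonal functions, the two baselines against which the optimised perturbations are compared.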

Density plots of the covariance matrices discussed in the text, except for

For the same realisation,

Plot of the

It is clear from $10^{3}$ realisations. The results are reported in

Averaged Frobenius norm that measures the discrepancy between the target covariance matrix

For the sake of comparison note that, on average,

As seen in

The performance of the new scheme is tested in a mildly nonlinear configuration of the discrete 40-variable one-dimensional Lorenz–96 (L96) model (Lorenz and Emanuel,

We consider twin experiments where synthetic observations are generated from the true model trajectory every
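For reference, the L96 dynamics and the generation of a synthetic truth for such a twin experiment can be sketched as follows (the time step, spin-up length and unit observation noise are illustrative choices, not necessarily those of the experiments):

```python
import numpy as np

def l96_tendency(x, F=8.0):
    """Lorenz-96 tendencies dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4_step(x, dt, F=8.0):
    """One fourth-order Runge-Kutta step of the L96 model."""
    k1 = l96_tendency(x, F)
    k2 = l96_tendency(x + 0.5 * dt * k1, F)
    k3 = l96_tendency(x + 0.5 * dt * k2, F)
    k4 = l96_tendency(x + dt * k3, F)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

rng = np.random.default_rng(0)
n, dt = 40, 0.05
x = 3.0 + rng.standard_normal(n)       # perturbed initial state
for _ in range(500):                   # spin-up: discard the transient
    x = rk4_step(x, dt)
truth = x
y = truth + rng.standard_normal(n)     # synthetic obs with unit noise
```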

We test the following data assimilation schemes:

The standard LETKF as defined by Hunt et al. (

The LEnSRF as defined in

The new LEnSRF with the new updating scheme. The

When

For each configuration, 10 data assimilation experiments are run. Each run is

Comparison of the LETKF, the LEnSRF and the LEnSRF with the new update scheme, applied to the L96 model (left column) and to the KS model (right column). The RMSE, optimal localisation and optimal inflation are plotted as functions of the ensemble size

First, the LETKF and the LEnSRF show similar RMSEs and optimal inflation for all ensemble sizes, with the LETKF having the edge for both the RMSE and the inflation. The optimal localisation lengths of the three schemes are similar, in particular thanks to the approximate correspondence between the way the observation precision matrix is tapered in the LETKF and the way the background error covariance is tapered in the LEnSRF. Nonetheless, the localisation length of the traditional LEnSRF is smaller than that of the other two EnKFs, especially for larger ensemble sizes.

Second, the LEnSRF with the new update scheme shows lower RMSEs and significantly lower optimal inflation than the other two schemes. The improvement in the RMSE is in the range

Focusing on the multiplicative inflation requirement, we have computed the RMSE as a function of the inflation, with the localisation length optimally tuned so as to minimise the RMSE, for three ensemble sizes

Time-averaged RMSE as a function of the multiplicative inflation, the localisation length being tuned so as to minimise the RMSE. The L96 results are displayed on the left panels while the KS results are shown on the right panels, for

This shows that the new LEnSRF's requirement for inflation is actually very small. In the case

By construction,

We speculate that this lesser need for multiplicative inflation in the new LEnSRF may also be interpreted as reduced imbalance of the updated perturbations. If true, this implies that, for the L96 model in this standard setup, the residual inflation required by the LETKF and the LEnSRF does not originate so much from sampling errors as from the imbalance generated by localisation. This, however, can only be validated on physically more complex,

We have performed similar experiments with the Kuramoto–Sivashinsky (KS) model (Kuramoto and Tsuzuki,

For each configuration, 10 data assimilation experiments are run. Each run is

The results are very similar to those of the L96 model. The LEnSRF with the new update scheme outperforms the other two schemes, with a much lower need for inflation and optimal localisation lengths similar to those of the LEnSRF without the new update scheme. For this model, however, the optimal localisation length of the LETKF is larger than for both LEnSRFs.

The requirement for multiplicative inflation is further studied similarly to the L96 case. The right column of

Localisation schemes can behave differently in the presence of sparse and inhomogeneous observations. Moreover, we have conjectured that the new perturbation update scheme could generate an ensemble with less imbalance (closer to the attractor), which could be evidenced with longer forecasts in the EnKF than those considered so far. Hence, in this section, we consider:

A first set of experiments where the state vector entries are uniformly and randomly observed with a fixed density

A second set of experiments where the observations are spatially dense (

We choose to focus on the L96 model and an ensemble size of

For the first set of experiments, we plot in

Comparison of the LETKF, the LEnSRF and the LEnSRF with the new update scheme, applied to the L96 model, for a fixed ensemble size

The results are very similar to those obtained in the previous subsection: the new LEnSRF scheme yields a typical 5% improvement in the RMSE, while using a significantly lower multiplicative inflation. In the left panel of

For the second set of experiments, we plot in

In the right panel of

Comparison of the LETKF, the LEnSRF and the LEnSRF with the new update scheme, applied to the L96 model, for a fixed ensemble size

Time-averaged RMSE for the L96 model as a function of the multiplicative inflation, the localisation length being tuned so as to minimise the RMSE in the two configurations where the observations are sparser (

We have also computed the ratio of the analysis RMSE over the ensemble spread, as
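The exact formula was truncated in extraction; a common form of this diagnostic, assumed here, compares the RMSE of the ensemble mean with the square root of the mean ensemble variance:

```python
import numpy as np

def rmse_spread_ratio(ens, truth):
    """Ratio of the analysis RMSE of the ensemble mean over the ensemble
    spread (root of the mean ensemble variance); a sketch of a standard
    consistency diagnostic. ens is n x m (state size x ensemble size)."""
    mean = ens.mean(axis=1)
    rmse = np.sqrt(np.mean((mean - truth) ** 2))
    spread = np.sqrt(np.mean(ens.var(axis=1, ddof=1)))
    return rmse / spread

# A perfectly calibrated toy case: members [-1, 0, 1] at each of 4 grid
# points give spread 1, and a truth offset of 1 gives RMSE 1.
ens = np.tile(np.array([-1.0, 0.0, 1.0]), (4, 1))
ratio = rmse_spread_ratio(ens, np.ones(4))
```

A ratio close to one indicates that the ensemble spread is a reliable proxy for the actual error.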

Note that we have also considered time-averaged forecast RMSE and spread for a range of forecast lead times. They follow the same trend as the analysis RMSE and analysis spread but are progressively amplified with increasing lead time.

All of these experiments have also been conducted with the KS model. The results are qualitatively very similar and yield the same conclusions for both the sparse and infrequent observation experiments.

In this paper, we have revisited the perturbation update scheme of EnKFs based on covariance localisation. We have argued that the updated perturbations of such local EnKFs do not represent the main modes of the analysis error covariance matrix, in contrast to the updated perturbations of the LETKF. In particular, we have focused on the LEnSRF. We have explained why

Using these considerations, we have proposed a perturbation update scheme potentially more consistent in the sense that the perturbations

We have tested this idea and defined a new LEnSRF with the new perturbation update scheme. We have compared it numerically to the LETKF and to a vanilla LEnSRF based on an implementation of

We plan on testing this new scheme on two-dimensional models with more sophisticated physics. We also plan to study the potential benefit of such an update scheme in a hybrid setup (i.e. using hybrid covariances).

The authors are thankful to two anonymous reviewers for their insightful comments and suggestions. CEREA is a member of the Institut Pierre-Simon Laplace (IPSL).

There are actually two main definitions of a matrix square root. The main one in mathematics defines a square root of

It is actually known to be NP-hard.

It generalises the classical Cauchy integral formula using the Jordan decomposition of matrices. See, for instance, Eq. (2.7) in Kassam and Trefethen (2005).

In this appendix, we (i) give an alternative and rigorous derivation of the heuristic one proposed in Appendix B of Bocquet (

Let

Let ^{3}

In particular, let us apply

Choosing

Here, we have assumed that

This two-step update (update in observation space followed by a linear regression in state space) generalises the algorithm of Anderson (

In this appendix, we study the dependence of the new perturbation update on the choice of the matrix norm. A generic

The case
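For reference, the Schatten $p$-norm of a matrix is the $\ell^p$ norm of its vector of singular values; $p=2$ recovers the Frobenius norm and $p\to\infty$ the spectral norm. A minimal sketch:

```python
import numpy as np

def schatten_norm(A, p):
    """Schatten p-norm: the l^p norm of the singular values of A."""
    s = np.linalg.svd(A, compute_uv=False)
    if np.isinf(p):
        return s.max()                    # spectral norm
    return (s ** p).sum() ** (1.0 / p)

# Example: a diagonal matrix with singular values 3 and 4.
A = np.diag([3.0, 4.0])
```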

We generalise the perturbation update function

Once again, we have chosen to apply a logarithm function to the Schatten

It turns out that it is possible to analytically compute the gradient of

Using this lemma, we obtain the matrix gradient:

Note that in the limiting case of the spectral norm (

We have tested the choice of these Schatten norms in the range

Average analysis RMSE as a function of the norm