Data assimilation (DA) is the process by which observational data are incorporated into numerical models to improve knowledge of the trajectory, and uncertainties, of the state. In meteorological models, DA is used to generate an optimal initial state to be used in subsequent forecasts through the use of prior knowledge of the state (e.g. a previous forecast) and observational data obtained from satellites/weather stations, etc. This combination of prior information and observational data is commonly referred to as state estimation.
Models have parameters that are often unobservable quantities, such as latent heat flux, entrainment rate and surface albedo. Using observations to infer the values that these parameters take is known as parameter estimation. Parameter estimation is a more complex problem than state estimation, as even for simple, linear models, the parameter estimation problem becomes non-linear (Evensen et al., 1998). The majority of parameter estimation methods are based on some form of state augmentation, where the state vector is augmented, or extended, with a vector of parameter values. This allows the parameters to be estimated via the DA analysis equations in the same manner that any unobserved prognostic variables would be updated. In a sequential DA scheme, this would involve updating the parameters at every observation time during the model run.
State augmentation is an effective method of parameter estimation and is used to estimate optimal parameters with varying degrees of success (e.g. Annan et al., 2005; Kondrashov et al., 2008; Smith, 2010). However, the process of changing the parameters during a model run may lead to parameters being modified to a region of dynamical instability which may cause the model run/state estimation to fail (Yang and Delsole, 2009; Williamson et al., 2014). Another common problem with state augmentation is that the parameter ensemble tends to collapse before an optimal parameter value is found (Santitissadeekorn and Jones, 2014). This can be mitigated by adding a small perturbation to the parameter in order to stop the ensemble from collapsing before the optimal parameter is found. However, the method of generating the perturbation to add to the parameter value is not well understood, with suggestions for improving the parameter ensemble in Liu and West (2001). State augmentation methods, however, cannot be used to estimate systematic functional errors in the parameterisations.
Parameterisations are functions that represent simplified forms of processes in the models that are either sub-grid scale processes or are too complex (either due to lack of scientific understanding or not enough computational power) to represent explicitly. For example, in numerical weather prediction (NWP) models, parameterisations are used to approximate processes such as convective processes, cloud microphysics and radiative processes.
Parameterisation estimation is even more complex than parameter estimation, as it requires not only the estimate of correct parameters, but also the estimate of the functional form of the parameterisation itself. Furthermore, as parameters/parameterisations do not necessarily represent true physical qualities that occur within a system, rarely are they observed quantities.
Parameterisation estimation is often done in an ad-hoc way by running models using the current parameterisations and compensating for errors in the functional forms by adding new parameters or functions when they arise. However, these new additions may not improve a fundamental error with the parameterisation. This is because forecasts are often verified against reanalysis data every 1–6 hours, hence errors have time to accumulate making it difficult to infer the source of the errors.
In this study, DA is used to obtain information about errors in the numerical models, specifically errors that come from incorrectly specified parameterisations. A new method of parameterisation estimation is proposed that makes use of the model equations that will allow for an estimate of the structure in the model errors to be found. This new method uses the differences between a DA analysis and an analysis forecast to estimate the errors in the parameterisations each space-time point. In this study, this parameterisation estimation method is applied to an advection model with additive model error as it is a simple model where it is well known how changes in the functional form affect the dynamics of the system.
This paper shall describe the DA methods used in Section 2. Section 3 outlines the new parameterisation estimation method that is proposed by this study, followed by numerical results and conclusions in Sections 4 and 5, respectively.
In this section, the data assimilation method used in this study, the ensemble Kalman smoother (EnKS), is described, and the notation used throughout this paper is introduced.
The ensemble Kalman filter (EnKF) was developed by Evensen (1994) and has many different variations that are mainly split into two groups. There are the stochastic EnKFs, such as developed by Burgers et al. (1998) and Houtekamer and Mitchell (1998); and the Ensemble Square Root Filters [EnSRF, Tippett et al. (2003)], including the ensemble adjustment Kalman filter [EAKF, Anderson (2001)] and the ensemble transform Kalman filter [ETKF, Bishop et al. (2001) and Majumdar et al. (2002)]. This study shall use the stochastic EnKF [from Burgers et al. (1998)] but any of the other methods can be used.
The EnKF is a sequential DA method that uses Monte Carlo methods to approximate the error statistics of the state. The evolution of the state is given by the following equation:
The EnKF approximates the forecast error covariance matrix, ${\mathbf{P}}_{\mathbf{t}}^{\mathbf{f}}$, by using an ensemble of M distinct state vectors at time t, X(t), such that:
The k^{th} set of observations (occurring at timestep t_{k}) denoted y_{k} with an observation error ε_{k} is assumed to be Gaussian, with zero mean and a covariance matrix given by ${\mathbf{R}}_{\mathbf{k}}\in {\mathbb{R}}^{{p}_{k}\times {p}_{k}}$ (where p_{k} is the dimension of y_{k}). The observations are related to the state by:
The initial ensemble is created by sampling ensemble members from a (normal) distribution centred on the initial background state with covariance matrix, ${\mathbf{P}}_{\mathbf{0}}^{\mathbf{f}}$, generated from prior information. The ensemble is then propagated forward using the model equations to the observation time, at which point the state is updated by the formula:
The result is a new analysis ensemble of state vectors which are then propagated forward by the model. The process is repeated every time there is an observation.
When using the ensemble to estimate the forecast error covariance matrix, spurious correlations can arise in the ensemble covariance matrix that would not occur in the full forecast error covariance matrix. The method used to reduce the impact of these spurious correlations is localisation [see, for example Houtekamer and Mitchell (2001) and Janjic et al. (2011)]. To do this, a localisation function, $\mathrm{\mathcal{L}}$ is defined, typically a function of distance in the spatial domain. The localisation matrix, ${L}_{i,j}=\mathrm{\mathcal{L}}(i,j)$ is defined, where i and j are points in the domain and is applied to ${P}_{k}^{f}$ via the Schur, or Hadamard, product such that:
The EnKS was developed by Evensen and van Leeuwen (2000) and is an extension of the EnKF that aims to update the state vector between observations. This is done by considering the temporal cross-covariances between the model state at the analysis timestep and the observation timestep to update the state accordingly. The first step of the EnKS is to run the EnKF to the observation time, t_{k}, to obtain the analysed state vector, ${\mathbf{x}}_{\mathbf{m}}^{\mathbf{a}}\left({t}_{k}\right)$. The ensemble forecast cross-covariance matrices between timesteps t_{l} and t_{k} are then calculated as:
The m^{th} ensemble member of the state vector at t_{l} is then updated using the following equation:
In this section, we discuss a new algorithm for parameterisation estimation using a DA scheme. Traditionally, new parameterisations are chosen by making a large number of forecasts with different parameterisation versions in the model and testing which verifies the best. The method that we propose identifies errors in the functional form of the model (i.e. the model equations) based upon results obtained from a sequence of DA steps. The methodology is general and will apply to any numerical model of any dimension. Let the true state vector at timestep t be denoted by x(t). Hence, the evolution of this state over one timestep can be written as:
The method consists of two parts: (1) finding an estimate of the model error at every space-time point and every variable and (2) extract the structural part of the model error and generate a new parameterisation.
Part I: Estimation of model errors via state estimation
The most important ingredient of this new method is that concentrates on estimating parameterisations at the level of the model equations directly, and not after errors have accumulated after several timesteps. In the latter case, it will be very difficult to unravel the cause of the accumulated model error in non-linear models.
The first step of the parameterisation estimation method is to run a DA scheme with a prior estimate of the parameterisation, G^{b}, generating a time series of analysis state vectors, x^{a}(t) which represent the best estimate of the true state through time.
Using the time series of x^{a}(t) values generated, a new time series representing a forecast of the analysis state, $\tilde{x}\left(t\right)$, is generated such that:
The terms chosen in Step 3 can be arranged into a function of the form:
Using this form, the parameterisation estimation problem has been translated into a parameter estimation problem, also known as regression analysis. By using an optimisation scheme to calculate the optimum coefficients, $\stackrel{\u02c6}{\mathrm{\beta}}$, for the test function, a functional estimate for the errors in the parameterisation will be found. The choice of optimisation scheme is free; for simplicity, we have chosen linear regression, but non-linear regression or sparse regression techniques, etc. may be used.
This is done sequentially by using the optimisation scheme to initially estimate the coefficient for the first term in g, ${\beta}_{0}{f}_{0}(\mathbf{x})$. In other words, the first test function is defined as:
The test function is then updated to include the first and second terms of $g(\mathbf{x},\mathrm{\beta})$ and the optimisation scheme is used again to fit ${\mathbf{x}}^{\mathbf{a}}-\tilde{\mathbf{x}}$ to:
This is repeated until functional estimates are obtained for all g_{i}, $i=0,\dots ,n$.
To verify the quality of the functional estimates calculated in the above step, the BIC (Schwarz, 1978) is used. The Akaike Information Criterion AIC, Akaike (1974) can also be used and produces similar analysis. The BIC is given by:
When the BIC has been calculated for all g_{i} functions in Step 6, the terms whose addition to the test function coincide with the biggest decreases in BIC from g_{i–1} to g_{i} (where the decrease from g_{−1} to g_{0} is defined as 0) will contain the most information about the structure of ${\mathbf{x}}^{\mathbf{a}}\left(t\right)-\tilde{\mathbf{x}}\left(t\right)$.
In Step 8, the terms in g are then reordered in descending order of terms with the greatest decrease in BIC to create a new functional form for the model error, h(x). h(x) is then split into sub-functions in the same way as g(x) and the optimisation scheme is again run for each h_{i}(x) and the coefficients are calculated as before. As the terms are now ordered such that the terms with most information in them are first in the test functions, the BIC will decrease until it reaches the optimal function and then increase for all functional forms after that. Therefore, the minimum in the BIC calculated for each h_{i}(x) will correspond to the optimal functional form of the model error, within the set that is being considered.
By calculating the coefficients for all h_{i}(x) individually, we obtain a set of functions with various degrees of confidence based upon the BIC (i.e. the lower the BIC is for an estimate, the higher the confidence in that functional estimate).
To calculate the uncertainty in the coefficients obtained from the method, the uncertainties in the coefficients obtained from each ensemble member are averaged over the ensemble.
For example, for least-squares (used in the results in this paper), the error covariance matrix for the parameters is obtained as (Friedman et al., 2001):
In this section, experiments are performed on an imperfect 1D advection model described by the following continuous equation:
The advection model is applied to a sine wave advected for 100 timesteps over a periodic domain on the interval [0,1] with 101 gridpoints spread uniformly across it, such that Δs = 0.01.
The initial true state is defined as:
For subsequent timesteps, the model error, ${\xi}_{s}(t)$ is distributed by an uncorrelated Gaussian distribution with zero mean and error covariance, $\mathbf{Q}={0.01}^{2}{\mathbf{I}}_{\mathbf{1}01}$$\forall t$.
The initial state covariance of the ensemble is also given by a SOAR function, but with a smaller correlation length scale of 0.05. SOAR functions are used in order to introduce cross-covariances between gridpoints.
Observations at timestep, k, are taken of the true state directly such that:
The EnKS is used with 40 ensemble members and localisation applied to the P^{f} matrix in the space dimension, using the localisation function:
The initial test function for use in the parameterisation method is defined as:
The optimal functional difference, as calculated by our parameterisation estimation method for each experiment is presented with uncertainty specified on each coefficient calculated as the standard deviation of the parameter errors averaged over the ensemble, as described in Section 3.1.
In this section, the method outlined in Section 3 will be applied to a simple parameter estimation problem where the only structural/deterministic error in the model comes from an error in a scalar parameter.
A true state trajectory, x^{t} is generated using:
The background state/initial estimate of the true state trajectory, x^{b} is generated using the parameterisation of the form:
Using the method outlined in Section 3, the optimal functional form of the parameterisation, G, is given by:
In addition to the simple adjustment of parameter, it can be seen that the parameterisation estimation method also calculates the numerical diffusion that occurs in the system. Analytical calculations show that the expected difference in numerical diffusion produced by the upwind scheme between the true and background models (with no model error) is $0.0019\frac{{\partial}^{2}x}{\partial {s}^{2}}$, which is within the uncertainty estimated by the parameterisation estimation method.
In this experiment, the true state is no longer generated by the linear advection equation, instead it is generated with:
The background model shall continue to assume that the truth is generated by a linear advection model such that the background parameterisation is given by:
The initial conditions are the same as the previous experiments on the linear advection model in Section 4.1. The value of Δt is changed to fulfil the CFL criterion of the non-linear advection model and is now given by
The parameterisation estimation procedure gives an estimate of the true parameterisation in the form
In this section, the new method of parameterisation estimation is compared with an existing method of parameter estimation, state augmentation, when applied to the estimation of parameterisations. The true and background models and observation density are the same as in Section 4.2. It is unknown how to define a localisation matrix such that the augmented forecast error covariance matrix is positive definite. This is due to the bordering entries in the augmented localisation matrix, corresponding to the cross-covariances between the state and parameters; setting these to be one can destroy the definiteness of the matrix. This results in non-positive eigenvalues occurring in the ‘localised’ augmented forecast covariance matrix. Therefore, a 2000-member ensemble is used in the EnKF such that localisation is not required to account for spurious correlations in the forecast covariance matrix.To extend state augmentation to the estimation of parameterisations, we again assume that the true model error terms are unknown and add the full test function defined in (23) to the background model. So that the augmented background model for state augmentation becomes:
The aim of this section is to show the sensitivity the parameterisation estimation method to the quality of the state estimation. This is done by reducing the observation density in space and/or time resulting in a less accurate DA trajectory.
The experiments in this section, shown in Table 1, will have the same true and background parameterisations as in Section 4.2, such that the true state is generated from the non-linear advection eq. (33) and the background state is generated from the linear advection eq. (18). The observation error covariance matrix and stochastic model error covariance matrix and the localisation function and radius are the same as used in Section 4.1.
It can be seen in Table 1 that when the observation density decreases, the parameterisation estimation method produces less accurate estimations of the true parameter values. This is due to the observations not being able to fully constrain the DA analysis to the true state. When observations are taken every three gridpoints, the diffusion terms become more prominent. However, they are always within the error bounds specified by our methods. When observations are taken every gridpoint and every three timesteps, the parameterisation estimate falls outside the one-standard deviation error bound from the truth, but only slightly. In all other cases the estimates are consistent.
In this section, we study the sensitivity of the method by varying the relative magnitude of model error covariance matrix, Q, and observation error covariance matrix, R.
The true model and background models are both generated using eqs. (33) and (18), from Section 4.2, respectively and $\mathrm{\Delta}t=\frac{\mathrm{\Delta}x}{22.5}$ to ensure that the CFL criterion is conserved. The experiments are performed in the well-observed case where observations occur at all space-time points and the same localisation function is used as in Section 4.1, given by eq. (22) with localisation radius, ${R}_{loc}=0.1$.
Recall that in the previous experiments, Q=0.01^{2}I and R=0.05^{2}I (i.e. Q=4R) and that the experiments in this section should be compared directly with the experiment in Section 4.2.
Table 2 shows that the parameterisation terms become less accurate when R increases compared to Q, but estimated errors also increase in a consistent manner. Additionally, the diffusive terms become more prominent as R increases.
This is because as the observation error covariances increase relative to the model error covariances, the state estimation will have more confidence in the prior model with respect to the observations meaning that the analysis state will be closer to the prior model equations, hence the signal in the observations will be much weaker. As a result, the diffusion terms become more prominent in the analysis state. This is to be expected, as the observations do not contain enough information to get the functional estimate closer to the desired solution.
This study has presented a new method of parameterisation estimation that uses the model equations directly to estimate the underlying errors in the model. This is done by comparing the DA analysis with an analysis forecast at every space-time point for every model variable to retrieve errors in the model and to use this information to improve model parameterisations. It is important to realise that the method uses the errors at the level of the model equations directly, instead of accumulating errors over a number of model timesteps. In the latter case, it is difficult to find the cause of the errors in a non-linear model, as model equation errors will interfere with each other during the accumulation. As the method uses the DA analysis as the best estimate of the truth, this method is dependent on the quality of the DA analysis.
In this paper, experiments have been shown using the linear advection equation to estimate parameters. Existing methods of parameter estimation are mostly based on state augmentation, which is an online method where the parameters are updated along with the state. While this means that state augmentation is potentially more computationally efficient, there is a risk that the updated parameter values could cause the numerical model to become unstable (e.g. for the advection models, an advection velocity could be updated such that the CFL criterion is violated) in the middle of computation. For simple parameter estimation, state augmentation may give a better estimate of the true parameter values. However, our new method has been developed to find overall structural errors in the model where state augmentation cannot. There is no risk of numerical instability with our method, as it is an offline method. Model runs are only computed with background parameterisations.
In the numerical experiments conducted in this paper, when the DA analysis was poor, due to less accurate observations or due to a less dense observations network, poorer estimates of the model error were obtained. This meant that ${\mathbf{x}}^{\mathbf{a}}-\tilde{\mathbf{x}}$ was less representative of the true model error present in the system, hence the optimal equations obtained from the regression were less representative of the true model error. It is worth noting that the diffusion terms become more prominent when there are fewer observations in space, and that our method is also able to estimate the scale of numerical diffusion present in system. Our method does provide an error estimate on the new parameterisation which is related to the spread in the ensemble. So a less accurate DA result will lead to a larger error in the parameterisation estimate, leading to consistent estimates even if the DA is poor.
The method introduced in this study will only find parameterisations within the pre-determined functional form. Our method appeals to the BIC to select the most influential terms. The optimal equation for the true model error corresponds to the equation with the lowest value of the BIC.
Work still needs to be done to reduce the sensitivity of the method to the quality of the analysis state in order to produce better/more consistent estimates of the model error. However, the new method of parameterisation estimation presented in this study has been shown to estimate simple parameterisations and has the potential to estimate functional forms of model errors and hence parameterisations in more complex models. A natural application of this method would be to infer parameterisations for a low-resolution model when a higher resolution model is available. In this case, all variables can be observed which is favourable for this method. We are currently planning to test the method on NWP models.
This work is funded by the Natural Environment Research Council under NE/J005878/1 and via the National Centre for Earth Observation (PJvL).
AkaikeH. A new look at the statistical model identification. IEEE Trans. Automat. Contr. 1974; 19(6): 716–723.
AndersonJ. L. An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev. 2001; 129(12): 2884–2903.
AnnanJ. D., HargreavesJ. C., EdwardsN. R., MarshR. Parameter estimation in an intermediate complexity earth system model using an ensemble Kalman filter. Ocean Model. 2005; 8(12): 135–154.
BishopC. H., EthertonB. J., MajumdarS. J. Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects. Mon. Wea. Rev. 2001; 129(3): 420–436.
BurgersG., van LeeuwenP. J., EvensenG. Analysis scheme in the ensemble Kalman filter. Mon. Wea. Rev. 1998; 126(6): 1719–1724.
EvensenG. Sequential data assimilation with a nonlinear Quasi-Geostrophic model using Monte-Carlo methods to forecast error statistics. J. Geophys. Res. Oceans. 1994; 99(C5): 10143–10162.
EvensenG., DeeD. P., SchröterJ. ChassignetE. P., VerronJ. Parameter estimation in dynamical models. Ocean Modeling and Parameterization. 1998; Springer, Netherlands. 373–398.
EvensenG., van LeeuwenP. J. An ensemble Kalman smoother for nonlinear dynamics. Mon. Wea. Rev. 2000; 128(6): 1852–1867.
FriedmanJ., HastieT., TibshiraniR. The Elements of Statistical Learning. 2001. 2nd ed. Springer, Berlin.
HoutekamerP. L., MitchellH. L. Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev. 1998; 126(3): 796–811.
HoutekamerP. L., MitchellH. L. A sequential ensemble Kalman filter for atmospheric data assimilation. Mon. Wea. Rev. 2001; 129(1): 123–137.
JanjicT., NergerL., AlbertellaA., SchröterJ., SkachkoS. On domain localization in ensemble-based Kalman filter algorithms. Mon. Wea. Rev. 2011; 139(7): 2046–2060.
KondrashovD., SunC., GhilM. Data assimilation for a coupled ocean-atmosphere model. Part II: Parameter estimation. Mon. Wea. Rev. 2008; 136(12): 5062–5076.
LiuJ., WestM. DoucetA., de FreitasN., GordonN. Combined parameter and state estimation in simulation-based filtering. Sequential Monte Carlo Methods in Practice. 2001; 197–223. Statistics for Engineering and Information Science. Springer, New York.
MajumdarS. J., BishopC. H., EthertonB. J., TothZ. Adaptive sampling with the ensemble transform Kalman filter. Part II: Field program implementation. Mon. Wea. Rev. 2002; 130(5): 1356–1369.
SantitissadeekornN., JonesC. Two-state filtering for joint state-parameter estimation. arXiv preprint arXiv:1403.5989. 2014
SchwarzG. Estimating the dimension of a model. Ann. Stat. 1978; 6(2): 461–464.
SmithP. Joint state and Parameter Estimation using Data Assimilation with Application to Morphodynamic Modelling. 2010; University of Reading, UK: PhD Thesis.
TippettM. K., AndersonJ. L., BishopC. H., HamillT. M., WhitakerJ. S. Ensemble square root filters. Mon. Wea. Rev. 2003; 131(7): 1485–1490.
WilliamsonD., BlakerA. T., HamptonC., SalterJ. Identifying and removing structural biases in climate models with history matching. Clim. Dyn. 2014; 45(5–6): 1299–1324.
YangX., DelsoleT. Using the ensemble Kalman filter to estimate multiplicative model parameters. Tellus A. 2009; 61(5): 601–609.