Start Submission Become a Reviewer

Reading: An overview and illustration of sensitivity of forecasts with respect to initial conditions

Download

A- A+
Alt. Display

Original Research Papers

An overview and illustration of sensitivity of forecasts with respect to initial conditions

Author:

Claude Fischer

Météo-France/CNRS CNRM/GAME UMR 3589, 42 avenue Coriolis, 31057 Toulouse, FR
X close

Abstract

Sensitivity of forecasts with respect to initial conditions is an intermediate step in studies dealing with the impact of features in the initial conditions, or of observations in a data assimilation system, on the quality of the forecast. The sensitivity diagnostics however are obtained as approximations, and various formulations have been proposed over recent years. Since the sensitivity is assessed by computing a distance measure, the corresponding cost function can be estimated by a Taylor expansion. More recently, quadrature formulations have also been proposed. In this paper, these two approaches are re-assessed and attention is paid to specific assumptions made in those developments: the sometimes conflicting use of assumptions of linearity, the way non-linear aspects are taken into account and the relative formal merits of each technique. The presence, omission or absence of the second-order model derivative is also addressed. Eventually, a proposal for extending sensitivity diagnostics in a non-linear iterative context is made. The various diagnostics are illustrated and discussed in the framework of the Lorenz 96 toy model, for which both the first- and second-order model derivatives have been implemented. Several initial condition states are obtained by the means of a four-dimensional variational assimilation (4D-Var) algorithm. The sensitivity diagnostics are presented with respect to the observed dynamical model regime as verification time, beyond the assimilation window, is increased: respectively for a quasi-linear, a weakly non-linear and an early-phase, starting non-linear regime. The numerical results suggest that quadrature-based approximations of cost function reduction are more robust than Taylor-expansion-based ones. Furthermore, the quadrature-based diagnostics do not require the usual omission or simplification of high-order derivatives. A straightforward extension to an iterative context also is possible and illustrated.

How to Cite: Fischer, C., 2014. An overview and illustration of sensitivity of forecasts with respect to initial conditions. Tellus A: Dynamic Meteorology and Oceanography, 66(1), p.20496. DOI: http://doi.org/10.3402/tellusa.v66.20496
  Published on 01 Dec 2014
 Accepted on 2 Dec 2013            Submitted on 24 Jan 2013

1. Introduction

This note focuses on recently published material about observational sensitivity computations in a data assimilation system. The focus is however kept on the sensitivity to the initial conditions, and the steps for extending the sensitivity analysis towards observations are not taken here. What is kept is the framework of a data assimilation algorithm, for computing the variations of initial conditions by the means of analysis increments. The sensitivity of a forecast to the initial conditions is defined as a distance measure of the forecast state with respect to a verifying state:

(1 )
e(xf)=(xf-xt)TC(xf-xt)
where C is a symmetric, positive definite matrix and xf and xt stand for the test forecast and the verifying state, respectively. Various recent papers have introduced this definition or a similar one, like Langland and Baker (2004, hereafter LB04) and Errico (2007, hereafter E07). Both LB04 and E07 are now reference papers for applied sensitivity studies in real numerical weather prediction (NWP) and assimilation systems (Gelaro et al., 2010, or observation impact studies for the CONCORDIASI field campaign, Nathalie Saint-Ramond and colleagues, personal communication). The sensitivity analysis consists then in assessing which features in the initial condition, if modified, would bring a significant reduction in the function e. Or, put in different words, which features, if changed, would bring a benefit in the quality of the forecast as compared to the verification data. Thus, a key aspect in sensitivity analyses is to be able to compute the reduction δe, which is not in general available exactly. Daescu and Todling (2009, hereafter DT09) performed an overview of various methods of approximation of the forecast distance reduction, by separating them into two families based on either Taylor expansions or quadrature formulations. Their proposal of quadrature formulations is a more novel aspect for sensitivity computations in the NWP context, though DT09 demonstrate for instance that LB04's method actually is one case of a quadrature approximation.

In Section 2 of this note, the two families of approximation of the forecast distance reduction are studied. The sensitivity analysis performed by E07 is re-assessed with a focus on two critical steps, namely the assumption of a linear evolution of the perturbation and the substitution of the second-order model derivative by a difference of two first-order model derivatives. Furthermore, the quadrature-based sensitivity formulations introduced by DT09 are discussed. Finally, in Section 2, sensitivity diagnostics for iterative processes are proposed. These expressions seem consistent with iterative non-linear assimilation algorithms, and bear similarities with the proposals made by Trémolet (2008).

The various sensitivity diagnostics are illustrated and compared in the framework of the Lorenz 96 (Lorenz, 1996, hereafter L96) toy model (Section 3). A four-dimensional variational assimilation (4D-Var) is used as a tool for computing the variations of the initial conditions. The 4D-Var benefits from observations over a time window [t0, t1]. The forecast verification period is considered to start after this assimilation period (tvt1). Because the accuracy of the sensitivity formulas depends on the order of magnitude of errors and the importance of non-linear model features, the results are interpreted in terms of dynamical model regime for perturbation growth. A separation is made between quasi-linear, weakly non-linear and early-phase non-linear behaviour, as the verification forecast range tv is extended. The results are summarized in Section 4.

2. Considerations on sensitivity computations

2.1. A Taylor expansion-based case: Errico (2007)

As a first step, the sensitivity calculations based on Taylor expansions, as presented in Errico (2007, E07), are re-derived and discussed. E07 handles the forecast error measure e(xf) using the summation along individual components of the state vectors:

(2 )
e(xf)=i,j(xif-xit)cij(xjf-xjt)
where cij are the elements of matrix C. The sensitivity analysis in E07 consists in developing the Taylor expansion of e with respect to a small change in the initial condition x0. The forecast state xf is, in the general case, obtained by integrating a model M: xf=M(x0) where M is a full, non-linear forecast model. Eventually, we look for appropriate estimates for the variation δe(δx0)=e(M(x0+δx0))−e(M(x0)). The first-order term of the Taylor expansion of δe is obtained by introducing the resolvent model, denoted M. M is by definition the tangent linear model integrated in time. The tangent linear model is obtained by a linearization of the full model M along a trajectory defined as the forecast started from x0. The resolvent model can be represented by a matrix of coefficients mij, thus leading to the following expression, which corresponds to eq. (6) of E07:
(3 )
i,jmijδxj0=Mδx0
With these notations, the first-order derivative of e with respect to any coordinate of the initial perturbation is readily obtained:
(4 )
k,exk0=2i,jmikcij(xjf-xjt)
which corresponds to eq. (7) in E07. Equation (4) provides the vector 2MTC(xfxt) and thus:
(5 )
δe1=kδxk0exk0=2(δx0)TMTC(xf-xt)
is the first-order variation of measure e with respect to δx0. This relation corresponds to eq. (14) of E07.

The variation of e may be extended to order two, which then provides the term δe2 of E07. The second-order derivative of M can be obtained by a further derivation of eq. (4) with respect to a second component of the initial perturbation vector:

(6 )
n,k,2exn0xk0=2i,jmikxn0cij(xjf-xjt)+mikcijmjn

This is eq. (8) in E07. We now write for the sake of simplicity:

(7 )

The first term in the expression of δe2 simply is δe1. After some rearrangements using eq. (6), εn reads:

(8 )

The first term comes from the first-order variation [refer to eq. (4)] while the second and third terms above come from the second-order variation [see eq. (6)]. After rearrangements, εn also reads:

(9 )
εn=i,j(min+kmikxn0δxk0)cij(xjf-xjt)+i,jmincij(kmjkδxk0+xjf-xjt)

At this stage, E07 uses the following simplifying assumption [this is eq. (6) in E07]:

(10 )
δxf=M(x0+δx0)-M(x0)=Mδx0

To be more specific, it is assumed that the linear integration term kmjkδxk0, for any component j, is a fair estimate of the final forecast difference δxjf. This partial assumption of linearity of the forecast model is in principle questionable, since it is otherwise assumed that the forecast model M is non-linear, so that it should be worth to extend and analyse its Taylor expansion beyond order one. Moreover, this assumption is stated all along the model integration from t0, the initial time, through tv, the verifying time, since the substitution is performed at verifying time. We also note that this replacement is first-order accurate with respect to δx0 for the non-linear model M, since it also can be interpreted as a truncation of the Taylor expansion of M at order two and above. Therefore, after substitution, the expressions following below for δe2 will be accurate at order two, because of the extra dot product with δx0 [see eq. (7)]. In other words, the error bounds for δe should be of order three with respect to δx0. Continuing E07's derivation, we now introduce the analysis state with the superscript a and write: xjf-xjt+δxjf=xjaf-xjt. In terms of an assimilation system, xf is the forecast state obtained from an integration started with the background state xb while xaf is the forecast state started from the analysis xa.

Back to δe2, we now can further rearrange its expression:

(11 )
δe2=nδxn0[i,j(min+kmikxn0δxk0)cij(xjf-xjt)+i,jmincij(xjaf-xjt)]=i,j[nminδxn0+k(nmikxn0δxn0)δxk0]cij(xjf-xjt)+i,j(nminδxn0)cij(xjaf-xjt)

Noting that:

(12 )
nmikxn0δxn0=δmik
which is eq. (5) of E07, and reusing the partial assumption of linearity of the forecast model discussed previously:
(13 )
δxif=nminδxn0
we can further write:
(14 )
δe2=i,j(δxif+kδmikδxk0)cij(xjf-xjt)+i,jδxifcij(xjaf-xjt)=i,j[δxifcij(xjf-xjt+xjaf-xjt)+(kδmikδxk0)cij(xjf-xjt)]

Since δmik contains the derivatives of the resolvent with respect to the initial state components, we eventually obtain the following expression for δe2:

(15 )

This expression represents the second-order accurate variation of e in the lines of E07. It involves the second-order derivative of the forecast model, present in the bilinear term M2(δx0,δx0). This bilinear term also involves an integration along a trajectory, which is precisely the same as for the resolvent model, namely the non-linear forecast M(x0) started from x0. The prevalence of this second-order model derivative is a well-known feature of sensitivity computations [see for instance in Le Dimet et al., 2002, or in Trémolet, 2008, his section 3.3]. It is in general ignored, since its computation is out of range in NWP systems.

To proceed further, E07 has substituted in the following way:

(16 )

By this substitution, the second-order model derivative disappears and is replaced by a perturbed model resolvent M+δM which is interpreted as the resolvent matrix integrated along the trajectory started from the analysis xa, as opposed to the resolvent M which is integrated from the background initial condition xb (at verifying time: xf=M(xb) and xaf=M(xa)). Introducing the notations Mb=M and Ma=M+δM and substituting in the above relationships leads to:

(17 )

which is equivalent to eq. (17) of E07.

It is worth to stress here that the substitution of eq. (16) corresponds to a first-order approximation for the resolvent model. This assertion can be seen by the relationship: +, where M˜ is the resolvent along the trajectory started from x0+δx0. Indeed, M2(δx0,δx0) will by definition converge to the finite difference (˜-)(δx0) when δx0 becomes infinitesimal. Thus, the substitution of eq. (16) is a new case of partial assumption of linearity of the forecast system, but the substitution now consists in replacing the (bi-)linear term by a (non-linear) finite difference. Therefore, this substitution has the opposite effect compared to the substitution in eq. (10) where high-order terms are removed. In this respect, the two substitutions are of conflicting nature and an error is in principle introduced. The substitution in eq. (16) again is taken to be valid all along the model integration from t0 through tv. By construction, the error bound of this replacement in the non-linear forecast model formulation is of third order with respect to δx0 (i.e. its precision is of second order). Thus, the substitution actually leads to a fourth-order error bound in δx0 for δe2, as can be seen for instance with the left product by δx0 in eq. (17).

In E07, the replacement of the second-order derivative by the perturbed resolvent model is an essential step for introducing the effect of a varying trajectory of linearization. As we have illustrated above, this interpretation is correct, but its derivation in E07 makes an explicit use of assumptions of quasi-linear behaviour for several simplifications [based on eqs. (10) and (16)] that do have opposite impacts in terms of accounting for or ignoring high-order terms. In addition, these assumptions of quasi-linear model behaviours must hold all along the time integration of the models (from t0 through tv), because the simplifying substitutions are done at verifying times. This aspect will be illustrated in Section 3. As an intermediate conclusion, the sensitivity formulations of first- and second-order given in E07 have been re-derived, with an emphasis on cornerstones of E07's developments. The presence of the second-order model derivative has been explicitly formulated. Those steps where some type of quasi-linear assumption is made, with conflicting effects on accounting for or discarding high-order terms, have been highlighted. These effects should not become sensitive at the same time, however, since they apply on error truncations of different order: the substitution in eq. (10) affects δe at order three (accuracy is second order) while the substitution in eq. (16) affects δe at order five (fourth-order accuracy). The first-order variation δe1 and the second-order variation δe2 of E07 are further illustrated for the L96 model (Lorenz, 1996) in Section 3.

2.2. Quadrature-based estimates of a forecast distance difference: Daescu and Todling (2009)

An alternative approach for the calculation of forecast distance sensitivities has been proposed by Daescu and Todling (2009, DT09). Besides Taylor expansion calculations, DT09 introduce a quadrature-based estimation by considering the variation δe=e(M(xa))−e(M(xb)) as a parametric function along the segment [xb,xa]. The parameter is a single scalar s∈[0,1] such that any initial state x0 on this segment can be defined as x0=xb+s(xaxb). A new parametric distance function is then derived as: eˆ(s)=e(M(x0)) and the change of e between background and analysis forecasts becomes:

(18 )
eˆ(1)-eˆ(0)=01eˆ(s)ds

where the fundamental theorem of calculus is invoked. Several high-order approximations of the integral computation can be obtained, by using various quadrature estimates. Each estimate is based on a specific rule of approximation (Trapezoidal, midpoint, Simpson) and is associated with a specific formulation of its error bound. In particular, DT09 show that the term [(xfxt)TCMb+(xafxt)TCMa](xaxb) corresponds to a numerical quadrature approximation of δe following the Trapezoidal rule. Furthermore, DT09 identify this expression with the sensitivity formulation proposed by Langland and Baker (2004, LB04). It is also worth to notice that this expression is based on two computations of the gradient of e with respect to the initial condition variable x0, denoted g, respectively, at points xb and xa, and using as trajectory of linearization the forecast started from xb and xa, respectively. This assertion is demonstrated by the property: g(x0)T=2(M(x0)−xt)TCMx0 with Mx0 the resolvent model along the trajectory M(x0). This relationship is valid at any point x0 along the segment [xb,xa], and furthermore eˆ(s)=g(x0)T(xa-xb). Thus, an equivalent formulation for the Trapezoidal rule quadrature estimate, as first proposed by LB04, reads: δeLB=1/2(gb+ga)T(xaxb). Note also the close resemblance between the formulations of δe2 [eq. (17)] and δeLB, with only a difference through swapping Mb with Ma. They are very similar, but not equivalent. DT09 use fundamental properties of integral calculus for single-variable functions to prove that the estimate is second-order accurate (the residual can be expressed as a third-order term in xa–xb).

Several other alternative estimates of δe are proposed in DT09. The midpoint rule quadrature estimate is second-order accurate and reads:

(19 )
δemid=2(M(xmid)-xt)TCMmid(xa-xb)

In eq. (19), the point is that the gradient is evaluated at the midpoint xmid=(xa+xb)/2. The resolvent accordingly is linearized along the trajectory M(xmid). Simpson's rule quadrature estimate provides a fourth-order accurate formula:

(20 )

δeSimp actually is readily computed if δeLB and δemid are known, since it is a linear combination of both. An advantage of the quadrature-based estimates over Taylor-based formulations like in eq. (15) certainly is that no second-order model derivative M2 appears, since the accuracy of each formulation is obtained by computing and combining first-order gradients at prescribed positions. One possible drawback, conversely, is that the expressions and their error bounds in principle only hold for variations along the given segment [xb,xa]. We illustrate these various sensitivity formulas in the framework of the L96 (Lorenz, 1996) toy model in Section 3.

2.3. Varying trajectories of linearization as a result of an iterative process

Yet, other possibilities for addressing the dependency of sensitivity diagnostics with respect to the trajectory of linearization do exist. In the spirit of Taylor expansions, the trajectory is naturally provided by the integration of the non-linear model started from the reference initial condition. Successive small corrections of a given initial state provide then new Taylor expansions, and the associated sensitivity is obtained in iterative steps. Let us consider [x0, x1,…,xm] a set of such initial conditions, with i[0,m-1],δxi=xi+1-xi and each δxi is a small variation. We consider each distance measure ei as the difference of the respective forecasts with a verifying state xt: ei=e(M(xi))=(M(xi)-xt)TC(M(xi)-xt); δei=ei+1-ei and the total variation δe=em-e0.

Keeping only approximations at order one with respect to the small successive variations of the input δxi:

(21 )

In eq. (21), Mi−1 stands for the resolvent model linearized along the trajectory started from xi−1. Thus, in this iterative process, the dependency of the sensitivity with respect to successively changing trajectories of linearization becomes visible through the cumulative terms added at each new loop. Such a sensitivity diagnostic might apply in the context of multiple analyses, where x2 would be a re-analysis with respect to x1. Another possible application of this approach is in a multi-incremental assimilation, where each xi+1 is an iterative correction of the previous solution xi. Equation (21) bears similarities with the formulation of the first-order variation denoted I1 in Trémolet (2008), as F1m is obtained by iteratively adding first-order contributions computed as simple vector multiplications. Each vector gi is obtained by applying the adjoint model linearized along the trajectory started from xi−1 to the iteratively forecasted departure M(xi-1)-xt (weighted by C). A difference in the derivation for eq. (21), with respect to Trémolet (2008), is that the data assimilation system itself has not been made explicit here, while Trémolet was interested in making the operators of the incremental assimilation process visible.

A second-order evaluation of the variation of e can be obtained by furthermore considering the square products between all first-order terms of the type Mi-1δxi-1, since e is quadratic. In addition, one also should take into consideration the terms involving the second-order model derivative, which is done here formally although these terms would not be accessible in an NWP context (these terms are provided in the framework of our toy model, for illustration, in Section 3). Omitting these terms, as discussed for eq. (15), is for instance done in Trémolet (2008) in the context of an NWP data assimilation system. In its complete form, the second-order term of the variation δe reads:

(22 )

leading to the second-order approximation of the variation of e at step m:

(23 )
em-e0=F1m+Sm=F2m

Equation (23) bears similarities with the formulation of the second-order variation denoted I2 in Trémolet (2008), as F2m is obtained by iteratively adding a second-order variation of the functional e to a corresponding first-order contribution. In Trémolet (2008), however, the functional e actually represents the distance function to the background state of the assimilation Jb and an explicit formulation involving the operators of the assimilation process is searched. Note that for m=1, if δx0 is taken as the (single-step) analysis increment, then the RHS of eq. (23) is equivalent to eq. (29) in DT09. Note also that, at order one, F11 is equivalent to eq. (27) in DT09 and eq. (14) in E07.

3. Results with the Lorenz 96 model

3.1. Description of the model implementation

The diagnostics of Section 2 are illustrated using the L96 toy model (Lorenz, 1996) with K=40 degrees of freedom. The non-linear evolution of the state reads:

(24 )
j=1,K,dXjdt=(Xj+1-Xj-2)Xj-1-Xj+F

with the periodic boundary conditions: X-1=XK-1,X0=XK,XK+1=X1. The timestep scheme is a fourth-order Runge-Kutta scheme, with a timestep dt=0.01. The forcing term F=8 is a usual value for this model, and means that a highly chaotic (non-linear) behaviour is chosen. The resolvent model is obtained by integrating the tangent linear equation. For the L96 model, the tangent linear model can be described by its Jacobian matrix representation. The j-th line of the Jacobian f then reads:

(25 )

where xtraj stands for the state of the trajectory of linearization (or any of its individual components Xjtraj,j=1,K). The non-zero values above span from column j–2 to column j+1. These elements are similar to, for instance, eq. (7) of Lorenz and Emanuel (1998). Periodic boundary conditions are applied in accordance with the non-linear version. The time integration of the resolvent model is obtained by a fourth-order approximate expansion of the matrix exponential formal definition of the resolvent M(t0,t1)=t0t1exp(f.dt)=I+dt.f+dt22.f2+dt36.f3+dt424.f4 where an integration from t0 to t1 is assumed.

The second-order model derivative can be represented as a series of elementary K×K (Hessian) matrices, which eventually compose the tensor representation of this model. The j-th ranked elementary matrix has zeroes everywhere except on four elements:

(26 )
(fjXk)k=1,K=(.-1..-1..+1.....+1..)

The –1-values are located at the lines and columns (j−2, j−1) and (j−1, j−2); the 1-values are located at the lines and columns (j−1, j+1) and (j+1, j−1). Note that for the L96 model, because the non-linear terms are simple second-order cross products between individual elements of the state vector, the second-order derivative (tendency) does not explicitly depend on the trajectory xtraj. This dependency however does exist for the second-order model derivative, because its time integration requires combinations with the resolvent model (see below, chain rule). For each elementary matrix, periodic boundary conditions are applied in accordance with the tangent linear and non-linear models. The time integration of the second-order model derivative is based on the chain rule: (g2°g1)=(g2°g1)(g1)2+(g2°g1)g1.

The data assimilation system is a 4D-Var. The minimization is carried out by a conjugate gradient method.

The numerical results discussed in this note are obtained with the following experimental setup:

  1. The truth is run over 40 d, where 1 d is defined as 20dt and is often considered as one “day” in the context of the L96 model. The initial condition is j,Xj= F=8 plus a perturbation of F/1000 on X20. According to individual time series of various model points Xj, transient behaviour vanishes after about 13 d. In past studies, Lorenz and Emanuel (1998) used 90 d to ensure satisfactory convergence on the attractor; Yang and Delsole (2009) used about 375 d.
  2. A background state xb is defined at day 39, as the true state at this range plus a random perturbation of standard deviation σb applied on all elements of the state vector. This background state is the start of a background trajectory from day 39 through day 40. The value σb=0.2 is used in this study, which corresponds to a size of background errors about two orders of magnitude smaller than the variability of the L96 state (see for instance in Fig. 1).
  3. Observations are defined for every second point j=1,K,2 (every odd valued position of the state vector), and for every 5dt in the assimilation window from day 39 through day 40 (5dt=0.05 is considered as a pseudo-6 hour time interval, as suggested in previous studies with L96). The observed values are defined as randomly perturbed values of the true state, with a standard deviation σo=0.2. The choice σo=σb=σ is always kept. Thus, a total amount of 5*20=100 observations is assimilated.
  4. The background and observation error covariance matrices B and R are simple diagonal matrices with the constant values σb2 and σo2, respectively. A consequence of this choice is that the assimilation system is specified to be optimal.
  5. A total of 10 random occurrences of the above defined assimilation problem have been run successively,in order to assess the robustness of the results and interpretations in a more systematic manner. There is obviously a certain variability of the true distance function reduction δe over the set of realizations, but this variability is not of interest per se. When relevant, a mean percentage of difference between various sensitivity measures is displayed in the follow-on of this paper (see in Table 1).

Fig. 1  

Time series of L96-values at point j=40. Time t0=0dt corresponds to the start of the assimilation window; the end of the assimilation window is at t1=20dt; the final forecast range displayed is at tf=80dt. The time unit is in model iterations. The black, red and brown curves, respectively, represent the trajectories computed from the truth, the background and the analysis states for run 1 out of the 10 random realizations when σob=0.2. The green and blue curves show trajectories obtained during the non-linear minimization process for m=1,2, resp.

With these settings, the total cost function is fairly close to a quadratic function, and the conjugate gradient minimization has converged after m=10 iterations. The forecast models are assumed to be perfect, which means that the 4D-Var problem is of strong constraint type and model error is ignored.

3.2. Illustrations of sensitivity diagnostics

3.2.1. Aspects about various successive trajectory runs.

Figure 1 displays several L96 model integrations from time t0=0dt (day 39 of the truth run) through time tf=80dt=0.8 (day 43 of truth run) for point X40. The forecasts started from the analysis (brown curve), the background (red) and the true state (black) are shown, along with two intermediate trajectories obtained in the course of the minimization. It is readily seen on Fig. 1 that all five trajectories remain very close in the first day of integration, from t0 through t1=20dt=0.2. This period corresponds to the one where observations are assimilated. Beyond t1, the trajectories progressively diverge and after three more days of forecast, at time tf, the dispersion between all five runs is of about one L96 model unit. The period [t1,tf] is considered to be the verification period, where the sensitivity diagnostics are evaluated and discussed.

3.2.2. Interpretation in terms of the dynamical model regimes.

The dynamical regime of perturbation growth along the period [t0,tf] can be qualified for instance by checking the individual components that contribute to the Taylor expansion of the distance cost function reduction δe. Note that in this study, the norm C associated with e simply is taken as the canonical norm: xTCx=j=1K(Xj)2. The formulations of eqs. (21) and (22) are taken as the basis. These formulations provide one first-order term, F1m, and two second-order terms, S1m and S2m. In addition, one third-order and one fourth-order term are accessible since they are composed with the first- and second-order model derivatives only (third- or higher-order model derivatives have not been considered in this study):

(27 )

At time t1=20dt=0.2, the budget of δe is mostly given by F1m+S1m (here, our reference value for m is after all 10 iterations have been collected, so m=10). Thus, only the first-order term and the square term of second order involving the resolvent model M, are significant. Terms S2m, T3m and T4m are about 2–3 orders of magnitude smaller than F1m and S1m (among all 10 test runs). Because the higher than one-order model derivatives are negligible, the model behaviour at t=0.2 can be considered as quasi-linear.

At time t=40dt=0.4, the three extra non-linear terms S2m, T3m and T4m only are 1–2 orders of magnitude smaller than F1m and S1m. The approximation F1m+S1m provides an estimate of δe accurate to about [10–20%] only. The other three terms, involving the second-order model M2, are of about equal order of magnitude, and their sum is enough, in any of the 10 test runs, to obtain a total estimate of δe up to about [1–3%]. These values indicate that the second-order model derivative has a small but significant contribution, while higher (than two) order model derivatives are still negligible. Because of the relative magnitudes discussed above, this dynamical model behaviour can be considered as a weakly non-linear one.

The fact that the observed dynamical regime progressively shifts from quasi-linear to weakly non-linear between tv=0.2 (1 d forecast range) and tv=0.4 (2 d) is a consequence of the growth of the differences between the compared forecast states. The observed time range of this growth seems consistent with for instance the values of the leading Lyapunov exponent: Lorenz and Emanuel (1998) give 2.1 d, Carrassi et al. (2009) give about 3 d. Thus, differences of about 2.10−1 between various states at t0, as prescribed when σ=0.2, would result in differences of about [0.5–1] after about 2 d. These estimates are in accordance with Fig. 1 and indicate that indeed, non-linear aspects should progressively become visible.

At time t=80dt=0.8, the truncated expansion F1m+S1m+S2m+T3m+T4m (denoted (F2m)+) does not provide anymore a highly reliable estimate of δe. Model non-linearities of high-order start to become significant. This aspect is seen from the increasing relative role played by the terms S2m, T3m and T4m, whose relative ratios of absolute values with respect to F1m and S1m now are of about [0.2–0.4] in the test runs. Thus, t=0.8 can be considered as corresponding to the starting phase of the non-linear model regime.

3.2.3. Results in terms of sensitivity diagnostics.

As concerns the various diagnostic estimates of δe discussed in Section 2, a clear numerical distinction is seen in Table 1 between the first-order accurate approximations like δe1 (E07) and F1m, and the second (or higher)-order accurate formulas (δe2 of E07, δeLB of LB04, midpoint and Simpson's rules estimates of DT09, F2m and (F2m)+). The first-order terms are of opposite (negative) sign with respect to their (positive) second-order correction: compare δe1 with δe2δe1 or F1m with Sm=F2m-F1m in Table 1 for tv=0.4 and tv=0.8. In absolute values, the first-order diagnostic terms are about twice as large (or even more) compared with the values of their second-order corrections. This property was found to be even more pronounced in experiments with σ=0.01, where the various forecasts of interest would be much closer one to another.

This particular feature is due to the quasi-linear or weakly non-linear model regime combined with the fact that background and analysis are relatively close to the true state in these experiments. Therefore, using the linearized approximation M(xa)=M(xb)+Mb(xa-xb), one may write:

(1 )
e(M(xa))-e(M(xb))=2(M(xb)-xt)TCMb(xa-xb)+(Mb(xa-xb))TCMb(xa-xb).

Here, the second and higher order model derivatives are omitted and only the dynamically predominant resolvent model is considered. The first term can be approximated with another linear assumption, by introducing the analysis error εa so that:

xt=M(xa-εa)=M(xa)-Maεa.

By substitution and a further linearization:

M(xb)-xt=M(xb)-M(xa)+Maεa=-Mb(xa-xb)+Maεa.

Eventually, an approximate relation for e(M(xa))e(M(xb)) is obtained, involving only the dynamically predominant resolvent model:

-2(Mb(xa-xb)-Maεa)TCMb(xa-xb)+(Mb(xa-xb))TCMb(xa-xb).

The first scalar term is derived from the first-order variation of δe, and it is equal to minus two times the second-order-derived contribution (second scalar term), up to an additive correction depending on the orthogonality of Maεa with Mb(xa-xb). Trémolet (2007) showed that the equality is exact when the model is linear and the forecast M(xa) minimizes the distance cost function e. Indeed, under these assumptions, it can be shown that Maεaker(MbTC). Similar proportions in sensitivity results were furthermore obtained with the GEOS-5 system in Gelaro et al. (2007), with an atmospheric data assimilation system. As a numerical consequence, the first-order estimates of δe exhibit relative errors of about 100% or more at tv=0.4 and tv=0.8.

At verification time tv=0.4 (left side columns in Table 1), differences between the various diagnostic formulas are seen when the numerical values are checked in more detail. Thus, F2m provides an estimate which is of slightly poorer average quality than δe2 (E07), δeLB (LB04) or the midpoint/Simpson's rule quadrature estimates (DT09). A significantly better estimate is obtained when the two extra terms, that can be computed because M2 is available in this toy model, are added ((F2m)+=F2m+T3m+T4m in Table 1). This result is consistent with our definition of a weakly non-linear model regime: for the extended estimate (F2m)+, the extra high-order aspects that count in the Taylor expansion of δe are those that are derived using both the first-order (Jacobian/tangent linear, or adjoint) and second-order (Hessian) model derivatives.

For tv=0.8 (right side columns in Table 1), the incremental estimates are of clearly poorer quality than the simple ones. This deterioration is due to cumulated errors with iterations m, because of the missing higher than two-order terms (or higher than one-order terms for F1m). Remember that this forecast range is in the start phase where non-linearities become significantly sensitive. It is nevertheless worth noticing that the Simpson's rule estimate outperforms any other approximation for δe.

In the formula for δe2 of E07, the substitution of the bilinear integration by a finite difference of the resolvent model (which we called a ‘partial assumption of linearity’ in Section 2) implicitly allows us to take into account some weakly non-linear behaviour of higher order. In the cases of LB04 and the other quadrature estimates of DT09, the non-linear effects remain present since no truncated development in a Taylor expansion sense is performed. As introduced in DT09 and re-discussed in Section 2, diagnostics like δeLB (LB04), δemid (midpoint rule quadrature) and δeSimp (Simpson's rule quadrature) are obtained by recomputation of gradients and their combination. In this process, the trajectory of linearization has to be adapted and chosen as the forecast started from a given initial state (background, analysis, their midpoint, in these cases). The major ingredient for a fourth-order estimate in the Taylor expansion-based F2m diagnostic term is the inclusion of higher order model derivatives. In the context of the present toy model study, these higher order terms are those that are accessible through the availability of the Hessian of the dynamical system, leading to the extended formula of (F2m)+.

In quadrature-based diagnostics, the number of recomputations basically controls the accuracy of the estimate: δeLB combines two gradient informations computed using the background and the analysis, and is second-order accurate; δemid is based on one recomputation only at midpoint position xmid=(xa+xb)/2 and also is second-order accurate; δeSimp=13(δeLB+2δemid) is based on three gradients and is fourth-order accurate. The toy model results, for rather weakly non-linear conditions, show that the Simpson's rule quadrature indeed provides estimates which on average are more robust and accurate than the other tested diagnostics. Only as long as non-linearities remain significantly weak, do some other sensitivity formulas propose comparable alternatives, in particular δeLB, δemid and (F2m)+.

For verification times that would lie within the more settled non-linear regime (t>0.8 or other experiments not shown here), the Simpson's rule quadrature approximation provides the most acceptable values. Checking directly the form of the function eˆ(s) for various values of s (with a step of 0.1), it turns out that this cost function is monotonously decreasing along the interval [0,1] in most cases where the quadrature-based estimates still provide acceptable values (not shown). On the opposite, the piecewise values of eˆ(s) indicate that this function possesses multiple maxima/minima in those cases where for instance the Simpson's rule quadrature fails. This finding is consistent with increased values of higher order derivatives, leading to an inflation of the error bounds for the quadrature-based estimates. This practical result nevertheless suggests that quadrature estimates could be robust over a certain range of verification forecast times, for various dynamical and assimilation systems possibly of the complexity of atmospheric NWP systems. This finding is in line with the practical application of these diagnostics given in Section 4 of DT09 (with the GEOS-5 system of NCEP and NASA/GMAO).

3.2.4. Illustration of the sensitivity diagnostics with iteratively changing trajectories of linearization.

Finally, Fig. 2 displays values of F2m, (F2m)+ and δeSimp for (tv=0.8, σ=0.2), with respect to the iterates m of trajectory recomputations. For F2m and (F2m)+, the incremental formulation is embedded in their definition [see eqs. (21–23)]. For δeSimp, each successive couple of initial and final states [xi,xi+1] is considered to define one segment over which the quadrature approximation is evaluated. Then, like for F2m and (F2m)+, the successive estimates of δeSimp are summed. In this stepwise estimation of the quadrature approximation of δe, the precision of each individual approximation should lie within the theoretical error bound of the respective quadrature (for Simpson's rule, fourth order). Thus, the iterative sum is expected to provide an estimate of the total reduction that has a comparable error with the single-step estimate computed in one go over the interval [x0=xb,xm=xa], except if errors add up. This numerical behaviour is indeed observed, both for δemid and δeSimp, at tv=0.4 and tv=0.8 (see in Fig. 2 and in Table 1). However, for tv=0.8, the iterative estimates perform significantly poorer than their one-go counterparts (see in Table 1). For the term (F2m)+, the accumulation of error is related to the term T4m which is positive and tends to bring a too large contribution to the total. Here, extra non-linear terms that could counter-balance this contribution are missing when the model regime starts to leave the weakly non-linear phase.

Fig. 2  

Successive values of the cumulative variations F2m, (F2m)+ and DT09's Simpson rule quadrature estimate, for the 10 successive loops m=1,10 when σob=0.2 (blue, red, green, resp.). The curves are for run 1 of the 10 random occurrences of the assimilation problem. Verification time is tv=80dt=0.8.

Figure 2 nevertheless illustrates a practical possibility to evaluate the impact of a changing reference trajectory when computing a sensitivity diagnostic based on either a Taylor series (and truncation) or a quadrature estimate, in the context of an iterative correction process. In the case of Fig. 2, the first two iterations are those that drive the solution closest to the minimum, among all minimization steps. Accordingly, the largest amount of reduction for the forecast distance measure is obtained in the first two loops m=1,2. The next loops provide fairly negligible contributions. These results are consistent with the observed reduction of the total cost function J with each iteration in the non-linear minimization (not shown). Figure 2 illustrates that the iterative estimates provide a consistent picture of how the progressively adjusting trajectory allows one to capture, little by little, a fairly good estimate of the true reduction, even in a weakly non-linear model regime. In the case of the δeSimp-diagnostic, the panel also illustrates the difference that has to be made between the (three) different trajectories used to compute the diagnostic itself and the (ten) different trajectories obtained in the non-linear minimization process.

4. Conclusions

In this paper, we have readdressed sensitivity diagnostics recently proposed in Errico (2007, E07) and Daescu and Todling (2009, DT09). In particular, we have shown that the formula δe2 in E07 is obtained after two crucial steps of quasi-linear assumptions: (1) when considering that the non-linear forecast can be approximated by a linear one for some terms, and (2) when approximating the second-order model derivative by a finite difference of two model resolvents integrated along slightly differing trajectories. The first substitution is in contradiction with considering higher order variations for the sensitivity analysis in a non-linear system. It nevertheless allows one to continue the developments, with an overall second-order accuracy for the distance measure reduction δe. The second substitution has an opposite effect to the previous one, since it consists of reintroducing higher order dynamical behaviour in the approximate diagnostic. It is otherwise within the bounds of fourth-order precision for δe, thus not too problematic as long as the model regime is quasi-linear. Nevertheless, these two conflicting assumptions should be considered as a theoretical weakness of formulas like E07's δe2.

DT09 have proposed sensitivity formulations for δe that only require multiple trajectory and gradient recomputations. Quadrature formulations, considering the distance measure e as a single parameter functional on the segment [xb,xa], are then invoked to obtain second-, fourth-, or higher-order estimates. In contrast to Taylor expansion-based sensitivity approximations like E07's, where the second-order model derivative does quite naturally appear, the absence of this or higher order derivative terms in quadrature-based approximations is an advantage. Indeed, high-order model derivatives require extra model configurations to be developed (higher order individual derivatives) and they require extra computations or storage (of gradients for instance). Therefore, they definitely are out of the range of NWP developments. In this paper, though, the second-order model derivative has been displayed and used for assessing dynamical model regimes and for comparing Taylor expansion and quadrature-based sensitivity formulations, in the framework of the simple Lorenz (1996, L96) model. For completeness, we also recall that DT09's Trapezoidal rule estimate of δe is equivalent to the sensitivity diagnostic proposed by Langland and Baker (2004, LB04).

In addition to the above-described analyses of previously published sensitivity diagnostics, the proposal is made to assess the impact of changes in the trajectory by considering an iterative process. This iterative approach provides incremental formulations of sensitivity with respect to initial conditions and the trajectories of linearization, which formally can apply to problems such as reanalyses or a non-linear assimilation. These expressions are computable within the usual practical choices, like for instance to ignore the second-order model derivative when it is not accessible. They are furthermore close to proposals formulated by Trémolet (2008) in the context of multi-incremental outer loops in a data assimilation system [eqs. (21–23) of this note].

The various δe-estimates have been illustrated in the framework of the L96 model, with a separation of the results in terms of type of model regime for perturbation growth: quasi-linear, weakly non-linear and early non-linear. In the quasi-linear regime, first and second-order estimates clearly are distinguished, and the knowledge of the resolvent (or tangent linear) model is enough to provide accurate values of any of the second-order diagnostic formulas for δe. In the weakly non-linear regime, an extra accuracy can be gained by taking into account extra second- and higher-order terms for δe provided the non-linearities remain sufficiently small. This extra non-linear information is implicit in DT09's quadrature-based estimates, because gradients along differing trajectories are consistently combined. It is, on the opposite, explicitly computed for second-, third- and fourth-order terms that only require the knowledge of the second-order model derivative M2, in the extended Taylor expansion-based diagnostic (F2m)+. As non-linearities grow, the accuracy of the estimates decreases. In particular, rather systematic cumulative errors in high-order terms deteriorate the accuracy of the incremental estimates based on Taylor expansions. Conversely, the midpoint and Simpson's rule quadrature estimates seem to provide the most robust results, into forecast ranges where non-linear effects become non-negligible. This property however is not systematic, as other runs with the L96 model have shown, and the error bound certainly is expected to be significant in any case with prevailing non-linearities.

The iteratively processed sensitivity calculations, with a varying trajectory of linearization, are shown to basically mimic the behaviour of the underlying adaptive process of adjustment. In a non-linear minimization, the first iterations may provide the largest steps towards the minimizing solution, like in the L96 4D-Var case presented in this study. Here, the first two loops provide the largest reduction in the variational cost function. In the context of the sensitivity analysis, the largest successive reductions δe also are obtained for the forecasts run from the two initial states corresponding to the solutions of these first two loops. The next loops then only provide some more and more marginal convergence, and this numerical impact also is reflected in the iterative estimates of δe: F2m, (F2m)+ and the incremental evaluation of δeSimp. These findings suggest that incremental sensitivity diagnostics can be used to assess the benefit of trajectory recomputations in an incremental algorithm. It is furthermore consistent with the positive illustrations provided by Trémolet (2008) in an NWP data assimilation context, using Jb as distance measure. Trémolet indeed computed incremental first and second-order diagnostics of a rather similar aspect to F1m and F2m in this paper, though not for a forecast distance measure. However, it is worth stressing again that the incremental versions of quadrature-based sensitivity estimates seem more robust when non-linearities become significant (δemid or δeSimp), when compared to the truncated Taylor expansion sensitivity approximations (F2m or (F2m)+), which quickly break.

Acknowledgements

The author is grateful to Gérald Desroziers for several fruitful brainstorming discussions at an early stage of this work. The three anonymous reviewers helped improve the content of the paper. The codes have been developed under Scilab, free open source software for numerical computations. An original piece of code for the L96 non-linear and tangent linear models was provided by Olivier Pannekoucke.

References

  1. CarrassiA, VannitsemS, ZupanskiD, ZupanskiM. The maximum likelihood ensemble filter performances in chaotic systems. Tellus A. 2009; 61: 587–600.  

  2. DaescuD, TodlingR. Adjoint estimation of the variation in model functional output due to the assimilation of data. Mon. Wea. Rev. 2009; 137: 1705–1716, notes and correspondence.  

  3. ErricoR. Interpretation of an adjoint-derived observational impact measure. Tellus A. 2007; 59: 273–276.  

  4. GelaroR, LanglandR, PellerinS, TodlingR. The THORPEX observation impact intercomparison experiment. Mon. Wea. Rev. 2010; 138: 4009–4025.  

  5. GelaroR, ZhuY, ErricoR. Examination of various-order adjoint-based approximations of observation impact. Meteor. Zeit. 2007; 16: 685–692.  

  6. LanglandR, BakerN. Estimation of observation impact using the NRL atmospheric variational data assimilation adjoint system. Tellus. 2004; 56: 189–201.  

  7. Le DimetF.-X, NavonI, DaescuD. Second-order information in data assimilation. Mon. Wea. Rev. 2002; 130: 629–648.  

  8. LorenzE. Predictability: a problem partly solved. ECMWF Seminar on Predictability. 1996; Reading, UK: ECMWF. 1–18.  

  9. LorenzE, EmanuelK. Optimal sites for supplementary weather observations: simulations with a small model. J. Atmos. Sci. 1998; 55: 399–414.  

  10. TrémoletY. First-order and higher-order approximations of observation impact. Meteor. Zeit. 2007; 16: 693–694.  

  11. TrémoletY. Computation of observation sensitivity and observation impact in incremental variational data assimilation. Tellus A. 2008; 60: 964–978.  

  12. YangX, DelsoleT. Using the ensemble Kalman filter to estimate multiplicative model parameters. Tellus A. 2009; 61: 601–609.  

comments powered by Disqus