The choice of the time integration scheme is an important consideration in the design of an atmospheric model. Accuracy is an obvious requirement for any method, but this has to be balanced by stability and efficiency concerns. Particularly for numerical weather prediction (NWP) in an operational context, we would like to be able to use time-steps which are long enough to reduce computational expense, while maintaining sufficient accuracy.
The existence of vastly differing time-scales in atmospheric phenomena, from slow large-scale modes to very fast gravity and sound waves, poses a challenge for time integration techniques. In particular, purely explicit schemes are restricted by severe time-step limitations. A number of approaches, such as semi-implicit and split-explicit schemes, have been used throughout the history of NWP to circumvent these constraints. Details of the various methods may be found in Durran (1999).
An alternative strategy for solving systems of time-dependent differential equations, which has attracted attention in a number of fields, is the use of exponential integration methods. The extensive review of Minchev and Wright (2005) traces their use back to 1960 and provides the general definition of an exponential integrator as ‘a numerical method which involves an exponential function (or a related function) of the Jacobian or an approximation to it’. These methods are exact for linear systems of ordinary differential equations (ODEs) with constant coefficients and are thus trivially A-stable.
While exponential methods offer the potential for high accuracy, the appearance of matrix exponentials in the solution can be very computationally expensive. This is of particular concern in the solution of partial differential equations (PDEs), where the spatial discretisation results in very large, albeit sparse, matrices. This rendered the methods somewhat unattractive in the past. However, a number of researchers began to investigate the use of Krylov subspace methods as an efficient solution to this problem. These are commonly used in conjunction with implicit time integration methods, where we typically have to approximate expressions of the form ${(I-\tau M)}^{-1}v$, for a given matrix M and scalar $\tau >0$. Results in the literature have shown, in fact, that exponential methods have an advantage over the traditional implicit approach, due to faster Krylov convergence of exponential functions, compared with the rational functions arising from the implicit approach (Hochbruck and Lubich, 1997; Loffeld and Tokman, 2013). This has provided strong motivation for the further development of these methods; c.f. Hochbruck et al. (1998), Tokman (2006) and the numerous references therein.
In this paper, we investigate the potential use of exponential integrators in NWP models. A preliminary testing with a shallow water model was outlined in Bonaventura (2010). Here we provide a comprehensive examination of the performance of exponential schemes in the shallow water framework by comparing them to alternative time integrators, both explicit and semi-implicit.
In the next section, we provide a general introduction to the theory of exponential integration methods, along with some details of specific schemes and the Krylov methods commonly used. In particular, we discuss the adaptive Krylov subspace algorithm of Niesen and Wright (2012), which we will use for the efficient handling of matrix exponentials. In Section 3, we introduce the shallow water model and describe the implementation of the exponential methods. Numerical results are presented in Section 4 followed by discussion and conclusions in Section 5.
We consider the following system which may arise after the spatial discretisation of a system of PDEs:
where $U\in {\mathbf{R}}^{N}$ is the state vector and N denotes the number of degrees of freedom. There are two main approaches. Firstly, we assume that we can split the right-hand side forcing into linear and nonlinear parts; that is, we rewrite the above as:
We wish to step the solution from ${U}_{n}=U(n\mathrm{\Delta}t)$ to ${U}_{n+1}$. By multiplying through by an integrating factor ${e}^{-Lt}$ we arrive at
A number of exponential-based schemes using this idea of separating the linear terms were considered in Beylkin et al. (1998) and Cox and Matthews (2002).
An alternative to this separation is described in detail in Tokman (2006). The forcing in eq. (1) is expanded about time t_{n}=nΔt to give
where A_{n} is the Jacobi matrix $\frac{\partial F}{\partial {U}^{T}}$ evaluated at t_{n} and $R(U(t))$ represents the remainder terms:
After multiplying eq. (4) by the integrating factor ${e}^{-{A}_{n}t}$, we obtain
where I is the identity matrix.
Note that eqs. (3) and (6) represent exact forms of the solution of their respective ODE systems. The method of evaluation of the integrals determines a particular numerical scheme.
As will be seen from the details of specific schemes in the coming sections, a particular family of functions arise in the development of exponential integrators. These are known as the ϕ-functions and may be defined for a scalar z as
and the functions satisfy the recurrence relation
With matrix arguments, the first three ϕ-functions are given by ${\varphi}_{0}(A)={e}^{A}$, , and ${\varphi}_{2}(A)=\left({e}^{A}-I-A\right){\left({A}^{-1}\right)}^{2}$.
By using a polynomial approximation for the integral in eq. (3), we can recover multistep numerical schemes of arbitrary order. Alternatively, we can use a multistage Runge–Kutta type approach. Many examples may be found in the literature; for example, Beylkin et al. (1998), Cox and Matthews (2002), Minchev and Wright (2005).
In relation to atmospheric applications, we note in passing the ‘slave-frog’ scheme of Beylkin et al. (1998), which is derived in a similar manner from eq. (2) over three time-levels and may be written as:
In a shallow water model based on a normal mode expansion, Daley (1980) used this method for the integration of the fast modes of the system, pointing out that this exact treatment of the linear terms does not result in a slowing of high-frequency gravity waves, unlike a semi-implicit treatment. In this case, of course, the system is diagonalised through the normal mode decomposition and thus the use of exponentials is not a computational concern.
Exact treatment of the linear terms with the resulting phase accuracy may also be achieved using a Laplace transform approach, whereby the system is solved in transform space and the solution then inverted to physical space. Clancy and Lynch (2011) describe such an algorithm in which the inversion is carried out using numerical quadrature. The scheme in eq. (8) is an analogous form of this, where inversion is exact.
In our current work, we focus on methods which use the Jacobian of the system, rather than splitting the forcing into linear and nonlinear parts, that is, approach (6) instead of eq. (3). A number of such schemes are evaluated in Tokman (2006); we consider two of these in particular. The first is the second-order scheme denoted EPI2:
This method also commonly appears in the literature as the ‘exponentially fitted Euler’ method, for example, Hochbruck et al. (1998), Newman (2003).
The second scheme we choose is the third-order EPI3:
where ${R}_{n-1}={F}_{n-1}-{F}_{n}-{A}_{n}\left({U}_{n-1}-{U}_{n}\right)$.
The motivation for choosing these two particular schemes will be discussed in the coming sections.
For a given matrix and vector $b\in {\mathbf{R}}^{N}$, the m-dimensional Krylov subspace is given by
Krylov subspace methods are widely used in the numerical solution of linear systems; see, for example, Chapters 6 and 7 of Saad (2003) for more details. As mentioned in the introduction, they have also been used to great effect with exponential integrators. All of these discretisations involve functions of matrix exponentials, for example, eq. (10). We do not need these explicitly, rather their action on a given vector. We can therefore seek approximations in a Krylov subspace, whose dimension is $m\ll N$. This approach to approximation of functions of matrices was first proposed by Van der Vorst (1987).
An Arnoldi algorithm may be used to construct an orthonormal basis $\{{v}_{1},{v}_{2},\mathrm{...},{v}_{m}\}$ for the Krylov subspace ${K}_{m}(A,b)$. The algorithm also produces an upper Hessenberg matrix ${H}_{m}={V}_{m}^{T}A{V}_{m}$, where V_{m} is an N×m matrix whose columns are the basis vectors (Saad, 2003). For a given function of a matrix f(A), we may approximate the action on the given vector, f(A)b, in K_{m}(A,b) as
where e_{1} is the first standard unit vector and $(.{(}_{2}$ denotes the vector 2-norm. Further details may be found in Tokman (2006). Since H_{m} has size m×m, the key to successful exponential integration is to be able to use m sufficiently small so that f(H_{m}) may be efficiently calculated using traditional methods, such as Padé approximation for exponential functions (Moler and Van Loan, 2003).
In terms of the ϕ-functions introduced earlier, we may rewrite the schemes (9) and (10) as
and
The two-stage, third-order EPIRK3 scheme of Tokman (2006) may be written
All of the methods presented require the evaluation of expressions of the form
for some set of vectors ${b}_{k}\in {\mathbf{R}}^{N}$, $k=0,1,\dots ,p$. Niesen and Wright (2012) have developed the phipm algorithm to efficiently carry out this task. This is achieved by using an adaptive time-stepping on the non-autonomous linear ODE
which has solution
By taking t=1, we recover the expression (15). The algorithm steps through this pseudo-time interval. Due to the recurrence relations between the ϕ-functions, at each step just one exponential-type expression of the form ${\varphi}_{p}(\tau A)v$ is needed, where τ is the pseudo-time-step. This is done using a Krylov projection as outlined in the previous section. The full description of the algorithm is given in the paper of Niesen and Wright (2012).
For a specified accuracy, the efficiency is determined both by the size of the Krylov subspaces used and the number of steps taken. When approximating ${\varphi}_{p}(\tau A)v$ with $\tau <1$, the smaller τ is, the fewer Krylov basis vectors will be needed from the Arnoldi iteration. However, this will then require more steps over the interval. The algorithm is fully adaptive and determines, after each step, which is the more cost-effective: more steps or larger subspace dimension m. Tokman et al. (2012) use the phipm algorithm in conjunction with Runge–Kutta exponential integrators and report a definite computational advantage to this adaptive approach.
We consider the use of phipm with the EPI2 and EPI3 methods given by eqs. (12) and (13), respectively. Taking eq. (13), for example, we see from eq. (15) that the second and third terms on the right-hand side may be evaluated by one call to the algorithm, with b_{0} the zero vector, ${b}_{1}=\mathrm{\Delta}t{F}_{n}$, ${b}_{2}=\frac{2}{3}\mathrm{\Delta}t{R}_{n-1}$ and $A=\mathrm{\Delta}t{A}_{n}$. In contrast, the two-stage EPIRK3 scheme in eq. (14) requires two calls to phipm at each time-step; as pointed out in Tokman et al. (2012), the number of stages in the scheme equals the number of required executions of the algorithm.
Tokman (2006) discusses the benefits of Runge–Kutta-type schemes, such as EPIRK3, when used in the context of adaptive time-stepping. In our present work, we are interested in methods with a constant time-step for atmospheric models. For this reason, we concentrate here on the single-stage EPI2 and EPI3 methods. As will be seen in Section 4, the main computational cost at each time-step is the call to the phipm subroutine, and so we wish to minimise the number of required calls overall.
The shallow water equations, describing a shallow rotating layer of incompressible and homogeneous fluid, are a commonly used testbed for numerical methods for atmospheric models. Here we use the model developed in Pudykiewicz (2011). This uses a finite volume discretisation of the governing equations on an icosahedral grid. The time integration was carried out using an explicit Runge–Kutta fourth-order scheme (RK4). Clancy and Pudykiewicz (2013) subsequently proposed a number of semi-implicit predictor–corrector time schemes and tested these against the RK4 using a number of standard experiments. These methods used the Krylov-based generalised conjugate residual method (Eisenstat et al., 1983) to solve the linear systems arising from the semi-implicit formulation and were found to allow considerably longer time-steps while retaining sufficient levels of accuracy. We now seek to test the utility of the EPI2 and EPI3 schemes within this experimental framework.
The shallow water model consists of four prognostic variables: the three Cartesian wind components and the depth of the fluid layer. After spatial discretisation, we have a system of ODEs which can be written in the form (1). Details are given in the appendix. The state vector, U, contains the averages of the prognostic variables in each of the control volumes on the icosahedral grid. Thus, for a grid with N vertices, we have a system with a state vector of length 4N and a 4N×4N sparse Jacobi matrix.
To solve the ODE system using the EPI2 or EPI3 schemes given by eqs. (12) and (13), respectively, we proceed as follows. At any given time-level n, representing time t=nΔt, n=0,1,…, we first compute the forcing terms at this time, F_{n}, along with the Jacobi matrix A_{n}. An expression for this is given in the appendix. Note that EPI3 uses three time-levels, with the older (n–1) information needed in the R_{n–1} term. To start the integration, we take a single EPI2 step to get U_{1} from U_{0}.
Each scheme then requires a single call to the phipm subroutine. As discussed in the previous section, the input vectors b_{k}$[{b}_{0};{b}_{1}]=[0;\mathrm{\Delta}t{F}_{n}]$ for EPI2 and $[{b}_{0};{b}_{1};{b}_{2}]=[0;\mathrm{\Delta}t{F}_{n};\frac{2}{3}\mathrm{\Delta}t{R}_{n-1}]$ for EPI3. The vector (15) calculated by phipm is then used to update the solution from U_{n} to U_{n+1}.
The phipm subroutine has a number of additional input arguments. As mentioned earlier, we set the parameter t in eq. (15) to 1. In both EPI2 and EPI3, the matrix used in the routine is ΔtA_{n}; in our case this is specified as non-symmetric (the logical flag symm allows for efficient Krylov projection in the symmetric case).
The first guess of the Krylov subspace dimension is given by the argument m. The adaptive algorithm subsequently adjusts the dimension. The remaining input is the tolerance parameter tol. This appears in eqs.(17) and (18) of Niesen and Wright (2012) and relates to the initial choice of step size in the algorithm and the adaptive procedure for this.
We now test the exponential schemes in the icosahedral model using a number of standard test cases from the literature. In all of the following, we use the icosahedral grid number 6, corresponding to six successive divisions of the faces of the original icosahedron. This grid has N=40,962 vertices with an average spatial resolution of 112 km (Pudykiewicz, 2011).
In order to analyse the simulation results, we will use the standard normalised ${\ue54b}_{2}$ and ${\ue54b}_{\infty}$ height error measures, as defined in Williamson et al. (1992):
where h_{a} is the analytic or reference height field and $\mathrm{\mathcal{I}}\left\{\right\}$ denotes integration of the field over the whole spherical domain.
Before tests with the full shallow water system, we will first consider the advection of a scalar field by a prescribed velocity field in Section 4.1 below. This involves just the continuity equation. As we will see, it is a special case for exponential methods and is therefore considered separately. In Section 4.2, we return to the full system and discuss the various model configurations for the experiments. A description of the shallow water test cases and the simulation results follow this.
In this case, we deal with just one of the discretised equations from the shallow water set given in the appendix, namely the continuity equation:
We note that with the winds constant (or alternatively prescribed at each time-step), the above can be written as a linear system
where the (i, j)-th component of the matrix M is
No summation convention is used in the above definition.
As noted in previous sections, exponential methods are A-stable and exact for linear constant-coefficient systems; the Jacobi matrix is simply M. The only limits in terms of accuracy and stability will be from the computation of the exponential ${e}^{M\mathrm{\Delta}t}$.
We test this with the deformational flow case of Nair and Machenhauer (2002), with the same parameter settings as in Pudykiewicz (2011). This consists of the advection of the scalar field by two static vortices centred at the poles. In Fig. 1, the analytic solutions (looking down on the north pole) after 6, 20 and 60 simulated days are plotted in panels (a), (c) and (e), respectively. Table 1 shows the normalised ${\ue54b}_{\infty}$ errors for a number of simulations. First is the RK4 scheme with a time-step of 2160s, also used by Pudykiewicz (2011). The scheme is stable for Courant numbers of around 2.8 (Durran, 1999) and the simulation may in fact be carried out with as much as a 4-h time-step. The results, however, are indistinguishable from those even with a much lower time-step, and so the solution shown is dominated by spatial truncation errors.
The other results in Table 1 are for the EPI2 and EPI3 schemes with a time-step of 12 h. The results are essentially identical to those with the RK4. In Fig. 1b,d,f, we show the solutions for the EPI2 run, which very closely match the analytic solutions.
While this linear system is a trivial case for the exponential methods, it is still noteworthy that they converge so quickly even with such unrealistically long time-steps. The phipm routine at each step requires fewer than 10 Krylov basis vectors, on average. Furthermore we see no obvious errors originating from the computation of the matrix exponentials.
The method is also mass conserving down to the round-off error level (Qaddouri et al., 2012). Combination of this property with the stability of the algorithm for very large Courant numbers is particularly important for simulation of cloud variables and atmospheric transport of chemically reactive species. In the past, these fields were advected in the NWP and climate models using semi-Lagrangian advection which suffers from the inherent mass conservation errors. The use of exponential integrators for advecting the crucial microphysical fields will be particularly important to increase the efficiency and accuracy of many aspects of atmospheric chemistry.
We now return to the full shallow water system. As mentioned, the icosahedral grid number 6 has N=40,962 vertices. The state vector of the system, comprised of the four prognostic variables, is therefore of size 163 848. The construction of the sparse Jacobi matrix, of size (163,848×163,848), is discussed in the appendix.
We compare the integrations with EPI2 and EPI3 against those with two reference time schemes. The first is the explicit RK4 mentioned in the previous section. For this, we use a time-step of Δt=240s, as in Pudykiewicz (2011).
In addition, we consider the semi-implicit T–ABT scheme of Clancy and Pudykiewicz (2013). Briefly, this uses a split form of the equations, as in eq. (2). The nonlinear terms are discretised with an Adams–Bashforth-trapezoidal predictor–corrector method (Kar, 2012), while a trapezoidal averaging is applied to the linear terms at each stage. With the system written in the form (2), the scheme is given by
Details of the linear splitting are given in the appendix. With this T–ABT scheme, we take Δt=900s, to balance accuracy and efficiency concerns (Clancy and Pudykiewicz, 2013).
For the exponential integration schemes, we select time-steps of 3600 and 7200s. These are considerably longer than those for the two reference methods. As mentioned in the introduction, the exponential schemes can be shown to be trivially A-stable, since they are exact for linear equations. The choice of time-step is then governed by accuracy and also by the convergence of the Krylov projections. Tests showed that taking Δt=10800s was too long; the algorithm was slow to converge and the results were noticeably less accurate. For the arguments of phipm already discussed, we use m=30 and tol=1×10^{−4}. Reducing tol to 1×10^{−5} was found to have negligible effect on the solutions.
We now discuss the results from four test cases, in terms of accuracy and efficiency. We note that, unless specified, no explicit dissipation is added by default in these experiments.
We first consider one of the time-dependent flow cases of Läuter et al. (2005). This unsteady case is a useful test because it has a known analytic solution, allowing us to measure the numerical accuracy. In terms of the zonal (u) and meridional (v) winds, the height of the fluid layer (h) and the height of the orography (h_{s}), the flow is given at time t as follows:
where λ and φ are longitude and latitude, respectively, a is the Earth's radius and Ω is the Earth's angular velocity. As in Pudykiewicz (2011), we use the parameter values u_{0}=2πa/12m/d, k_{1}=133681m^{2} s^{−2}, k_{2}=10 m^{2}s^{−2} and θ=π/4.
The next two cases are from the standard suite of Williamson et al. (1992). We use case 5, consisting of an initially balanced zonal flow interacting with an isolated mountain, and the Rossby-Haurwitz wave case 6. Neither of these cases has a known analytic solution. In order to compute error measures for comparison, we use a high-resolution reference solution obtained from the icosahedral model at grid 7 (163 842 vertices, with an average grid resolution of 56 km) with the RK4 scheme at Δt=90s. This reference was also used in Clancy and Pudykiewicz (2013).
The final case to be considered is the barotropic instability case described in Galewsky et al. (2004). The initial conditions consist of a mid-latitude jet which is destabilised by the addition of a small height perturbation, resulting in a more complex, multiscale flow when compared with other test cases.
In Fig. 2, we plot the errors for a 10-d simulation of the Läuter et al. (2005) unsteady case. Here, for interest, we also include forecasts with the exponential methods at Δt=900s, matching the semi-implicit T–ABT.
The explicit RK4 with the lowest time-step is the most accurate. However, both exponential methods show very similar accuracy with the much longer Δt=900s, despite both having lower order of accuracy than the RK4. Increasing the time-step by a factor of 4 to 1 h, the EPI2 and EPI3 are both still considerably more accurate than the T–ABT at 900s. Even at a 2-h time-step, they generally outperform the T–ABT, although the EPI3 begins to lose accuracy towards the end of the simulation (in fact, this particular simulation begins to destabilise after day 10).
In Figs. 3 and 4, we show the results for the Williamson et al. (1992) mountain and Rossby–Haurwitz wave cases, respectively. Both of these sets of experiments show similar behaviour. The exponential methods with time-steps of 1 or 2 h are as accurate as the explicit RK4 with only Δt=240s. They are more accurate than the T–ABT forecasts, although the difference is less in the Rossby–Haurwitz case, particularly at the later stages (Fig. 4).
Finally we consider the barotropic instability case. In Fig. 5, we plot the relative vorticity fields after 6 d of integration. The performance of the icosahedral model with the RK4 scheme in this case has been compared with simulations from other models (Qaddouri et al., 2012).
From the top of Fig. 5, we have the RK4 followed by the T–ABT, both giving comparable results. The bottom two panels are for the EPI2 and EPI3, respectively, with a time-step of 1 h. Looking in particular at the region between $\lambda =\pi /2$ and $\lambda =\pi $, the T–ABT is, perhaps, the smoothest; each of the other schemes produces some more noise in this area. Given the length of the time-steps for the two exponential schemes, it is impressive that they can compete with the RK4 in this quite challenging test case, even without the addition of explicit diffusion. In Fig. 6, we plot similar fields for the exponential methods using a 2-h time-step. For these values, however, we see that there is a noticeable distortion of the solutions when compared with Fig. 5.
In general, some form of artificial dissipation is needed in simulations with atmospheric models to control the smallest grid-scales, even in the case of fully monotonic, non-oscillatory schemes; for a further discussion see Jablonowski and Williamson (2011). We now add the same fourth-order diffusion described in Pudykiewicz (2011); c.f. Fig. 21 in that paper. In Fig. 7, we again plot the relative vorticity fields at day 6 for the RK4 at Δt=240s along with the exponential schemes with a 2-h time-step. In contrast to the results in Figs. 5 and 6, there is now close agreement between the models.
This test case also allows us to examine the effect of a discretisation method on the phase speed of fast-moving waves. Galewsky et al. (2004) examine the instantaneous height and divergence fields (their Fig. 2) during the early hours of the simulation. At this adjustment stage, the flow evolves with the propagation of gravity waves.
In Fig. 8, we plot the divergence fields after 12 h for the six simulations depicted in Figs. 5 and 6, that is, no dissipation included. We would expect a phase error with the semi-implicit T–ABT, due to the use of trapezoidal averaging (Durran, 1999). Indeed, all of the other models show comparable performance, whereas for the T–ABT (top right panel) there is a noticeable slowing and distortion of the gravity wave. This fact is a key motivation for work with exponential integrators.
In the previous section, we saw how exponential integration can offer a high level of accuracy at rather long time-steps. We now consider issues of comparative efficiency between the various methods. We expect that the Krylov projections involved in the exponential methods will add substantially to the computational cost per time-step. Ideally, this would be compensated by fewer steps.
The model is implemented in MATLAB and run on a desktop computer with an AMD Athlon 64 X2 Dual Core 4800+ processor. In Table 2, we list some timings for the first three of the test cases used. The timings given, in minutes, are the average execution time per simulated day of the experiment. The semi-implicit T–ABT simulations are consistently the fastest. The exponential methods are comparable with the RK4, particularly with the 2-h time-step.
In the tests of comparative performance, Loffeld and Tokman (2013) demonstrate that exponential methods can be superior to traditional implicit schemes. In our case, however, the discretisation of T–ABT reduces to a linear system in the height field only (details are given in Appendix A.2). The exponential methods, on the other hand, work with the entire system of four prognostic variables. The difference in execution times is therefore to be expected.
A partial breakdown of the cost of the exponential methods is given in Table 3 for the 14-d Rossby–Haurwitz wave simulation. This shows the average time, in seconds, taken by three different components of the discretisations at each step: the computation of the Jacobi matrix, the calculation of the forcing terms and the execution time of the phipm routine. As expected, the phipm routine is the dominant cost. The final column shows the average number of Krylov subspace basis vectors required at each step. Looking at Table UT0002 and 3 together, we see why doubling the time-step for EPI2 and EPI3 does not significantly reduce the overall execution time. With the longer time-step, a larger Krylov subspace is necessary to maintain the same level of accuracy, and this results in the longer timings for phipm.
The tolerance parameter in phipm, tol, is another consideration when discussing the efficiency of the exponential integrators. The default value used by Niesen and Wright (2012) is 10^{−7}. This parameter determines the accuracy of the evaluation of the ϕ-functions by controlling the dimension of the Krylov subspace and the length of the internal pseudo-time-step. As mentioned earlier, we have used a value of tol=10^{−4} throughout this work, with time-steps of 1 and 2 h. In experiments with less accurate calculation of the ϕ-functions using a tolerance of 10^{−3}, it was possible to keep a stable integration even with the longer time-step, with only slight deterioration of the conservation properties. More complete analysis and experimentation with the balancing of stability, efficiency and accuracy is an ongoing effort.
Further gains in efficiency may also be achieved by exploring other recently described techniques for evaluating the solution of eq. (6). One promising direction is discussed by Al-Mohy (2010), who presents an alternate algorithm for exponential integrating schemes which is shown to be more efficient than phipm in a number of experiments. The examination of this new scheme will be the subject of forthcoming research in the context of NWP.
Two single-stage exponential time integration methods have been implemented and tested in a shallow water model. This represents the first step in a longer-term goal of assessing the potential benefits of this approach for the time integration of more complete models of the atmosphere. The methods were compared with an explicit and semi-implicit time discretisation using a number of standard test cases.
The exponential methods were found to yield very accurate solutions, exceeding those of the semi-implicit method. The second-order EPI2 gave comparable results to the explicit fourth-order Runge–Kutta, even with dramatically longer time-steps. In spite of the caveat of Tokman (2006) that this method is not accurate enough for general nonlinear systems, due to its omission of the integral in eq. (6), it seems to be sufficient for simulating shallow water flow at least. Only in the challenging unstable jet case of Galewsky et al. (2004) do the two exponential methods struggle at the longer, 2-h time-step. However, once dissipation was incorporated into the simulation, the results again were consistent with those of the explicit scheme. In any practical applications with more complex models, some mechanism for dissipating smaller scales is necessary, and so this is not a major cause of concern.
The high accuracy of the exponential methods is to be somewhat expected, given their analytic treatment of the linearised part of the system. Indeed, in the case of the passive advection where we are dealing with a linear system with constant winds, the methods give an ‘exact’ solution with no formal time truncation errors and are subject only to the errors in the computation of the matrix exponentials.
The crucial issue to address is the efficiency of these methods, given the size of the systems involved and the potentially large computational overhead with the matrix exponentials. In this work, we used the phipm algorithm of Niesen and Wright (2012). This was found to give execution times roughly comparable with the existing RK4 method. On the other hand, the semi-implicit T–ABT was always faster. This is unsurprising since, with this scheme, we have to solve only a linear system (a discrete Helmholtz equation) for just one of the prognostic variables, as described in the Appendix A.2. The exponential methods, however, must deal with a system four times as large as a result of considering the full system of four prognostic variables. On the other hand, for the case of advection with prescribed winds, we have just one prognostic variable. With the high accuracy mentioned previously and the ability to take very long time-steps, exponential methods may prove to be of benefit in efficiently simulating the advection of various tracers and interacting species.
As discussed in the previous section, it may also be possible to improve on the efficiency of the exponential methods by experimenting with the tolerance parameter when computing the exponential functions. The resulting compromise in accuracy may be acceptable and should be measured and balanced against the performance of the semi-implicit schemes.
The next stage of this research is to investigate further the emerging exponential integrators and apply them to more complete atmospheric models. The increased stiffness and resulting time-step limitations of non-hydrostatic models, due to the presence of rapidly propagating acoustic waves, render purely explicit methods impractical. Exponential methods potentially offer a means of achieving high accuracy and stability, without the distortion of the fast waves typical of semi-implicit methods. It remains to be seen what level of efficiency can be maintained in the complete system.
The authors are grateful to Claude Girard for his helpful internal review and to the two anonymous reviewers whose comments were of great benefit. This research was partly funded by the Natural Sciences and Engineering Research Council of Canada.
A. Details of the shallow water model
Here we provide details of the shallow water model relevant to the current work. A full description of the icosahedral grid and shallow water model may be found in Pudykiewicz (2006, 2011).
After the finite volume spatial discretisation, the shallow water equations are written as systems of ODEs as follows; c.f. eq. (63) of Pudykiewicz (2011).
The prognostic variables are the Cartesian wind components, u_{x}, u_{y} and u_{z}, and the depth of the fluid layer, h. The {} brackets represent column arrays consisting of the control volume averages of the variables. The W_{x}, W_{y} and W_{z} arrays contain terms involving vorticity components and $f=g\left({h}_{s}+h\right)+\mid \mathbf{u}{\mid}^{2}/2$, where h_{s} is the orography and u is the velocity vector. The sparse matrices (GS_{x}, GS_{y}, GS_{z}) and (D_{x}, D_{y}, D_{z}) are used to compute the Cartesian components of the gradient and divergence operators, respectively. The construction of the operators ensures that the wind is constrained to be tangent to the sphere. Full details are given in Pudykiewicz (2011).
We may also add explicit dissipation to the system in the form $\mathbf{D}f\left\{\psi \right\}$, for each prognostic variable $\psi $. The dissipation operator takes the form
where L is the sparse matrix representing the Laplace operator.
A.1. Jacobi matrix
When implementing the exponential integration schemes in Section 3, we write the system in terms of a single state vector consisting of the arrays of all the prognostic variables. We then also require the Jacobi matrix of the right-hand side forcing.
The system of eq. (A1), with the dissipation terms added, could be rewritten in the compact form
where the state vector U contains components of the velocity and the height field
The lower index denotes Cartesian components whereas the upper index represents the control volumes. The elements of the right-hand side vector of eq. (A3)
are defined by the following relations:
where $G{S}_{x}^{ij}$, $G{S}_{y}^{ij}$, $G{S}_{z}^{ij}$, ${V}_{x}^{ij}$, ${V}_{y}^{ij}$, ${V}_{z}^{ij}$, ${D}_{x}^{ij}$, ${D}_{y}^{ij}$, ${D}_{z}^{ij}$ are the components of the sparse matrices used to calculate gradient, vorticity and divergence, ${\mathrm{\mathcal{N}}}_{x}$, ${\mathrm{\mathcal{N}}}_{y}$,${\mathrm{\mathcal{N}}}_{z}$ are column arrays with the x, y and z components of the vector normal to the surface of the sphere and the Coriolis parameter is given by γ. The symbol .* indicates that arrays are multiplied component-wise. The Einstein summation convention is applied in eqs. (A4)–(A7).
The Jacobi matrix of the system (A3)
$\mathrm{\mathcal{J}}=\left\{\frac{\partial F}{\partial {U}^{}}\right\}$ can be written in the explicit form
where
the arrays $\left\{{P}_{x}^{i}\right\}$, $\left\{{P}_{y}^{i}\right\}$ and $\left\{{P}_{z}^{i}\right\}$ contain the x, y and z components of the vector product of the velocity with the vector normal to the surface of the sphere, {0} denotes the all-zero sparse matrix,
and
with
The Kronecker delta is written above as δ^{ij} and the z^{i} components are defined in Pudykiewicz (2011). Please note that we do not use the summation convention in the bracketed terms in eqs. (A9)–(A11). Consequently, with this convention the expression $\left\{{P}_{x}^{i}{V}_{x}^{ij}\right\}$ denotes the matrix with ij-th element equal to ${P}_{x}^{i}{V}_{x}^{ij}$.
Due to the sparsity of this Jacobi matrix, we only need to store the relatively small number of non-zero elements. The product of the matrix and a vector can then be efficiently evaluated when required. An alternative to this approach would be to use a subroutine which evaluates this product without explicitly storing the constructed matrix.
A.2. Linear terms for semi-implicit methods
For the T–ABT integration scheme given in eq. (20), we apply a trapezoidal averaging to the linear terms governing the fast gravity waves in the shallow water system. As was done in Clancy and Pudykiewicz (2013), we define constant reference and perturbation depths such that $h=\overline{h}+h\prime $ and redefine $f=g{h}_{s}+\mid {\mathbf{u}}^{2}\mid /2$. We can then rewrite the system (A1) as
where the N_{x}, N_{y}, N_{z}, and N_{h} represent those terms remaining from the forcing in eq. (A1). With the system now written in the form of eq. (2), we can apply the discretisation of eq. (20). After some substitution, we arrive at a discrete Helmholtz equation for $\left\{h\prime \right\}$at the new time-level. Full details may be found in Clancy and Pudykiewicz (2013).
Al-Mohy A. H . Algorithms for the matrix exponential and its Fréchet derivative . 2010 ; University of Manchester, Manchester, UK : Faculty of Engineering and Physical Sciences . 155 . PhD Thesis .
Beylkin G , Keiser J. M , Vozovoi L . A new class of time discretization schemes for the solution of nonlinear PDEs . J. Comput. Phys . 1998 ; 147 : 362 – 387 .
Bonaventura L . Exponential integrators for multiple time scale problems of environmental fluid dynamics . Innovative Integrators Workshop 27–30 October 2010 . 2010 ; Austria : Universität Innsbruck .
Clancy C , Lynch P . Laplace transform integration of the shallow-water equations. Part I: Eulerian formulation and Kelvin waves . Q. J. Roy. Meteorol. Soc . 2011 ; 137 : 792 – 799 .
Clancy C , Pudykiewicz J. A . A class of semi-implicit predictor–corrector schemes for the time integration of atmospheric models . J. Comput. Phys . 2013 ; 250 : 665 – 684 .
Cox S. M , Matthews P. C . Exponential time differencing for stiff systems . J. Comput. Phys . 2002 ; 176 : 430 – 455 .
Daley R . The development of efficient time integration schemes using model normal modes . Mon. Weather Rev . 1980 ; 108 : 100 – 110 .
Durran D. R . Numerical Methods for Wave Equations in Geophysical Fluid Dynamics . 1999 ; New York : Springer .
Eisenstat S. C , Elman H. C , Schultz M. H . Variational iterative methods for nonsymmetric systems of linear equations . SIAM J. Numer. Anal . 1983 ; 20 : 345 – 357 .
Galewsky J , Scott R. K , Polvani L. M . An initial-value problem for testing numerical models of the global shallow-water equations . Tellus A . 2004 ; 56 : 429 – 440 .
Hochbruck M , Lubich C . On Krylov subspace approximations to the matrix exponential operator . SIAM J. Numer. Anal . 1997 ; 34 : 1911 – 1925 .
Hochbruck M , Lubich C , Selhofer H . Exponential integrators for large systems of differential equations . SIAM J. Sci. Comput . 1998 ; 19 : 1552 – 1574 .
Jablonowski J , Williamson D. L , Lauritzen P. H , Jablonowski C , Taylor M. A , Nair R. D . The pros and cons of diffusion, filters and fixers in atmospheric general circulation models . Numerical Techniques for Global Atmospheric Models . 2011 ; Berlin Heidelberg : Springer–Verlag . 381 – 493 .
Kar S. K . An explicit time-difference scheme with an Adams–Bashforth predictor and a trapezoidal corrector . Mon. Weather Rev . 2012 ; 140 : 307 – 322 .
Läuter M , Handorf D , Dethloff K . Unsteady analytical solutions of the spherical shallow water equations . J. Comput. Phys . 2005 ; 210 : 535 – 553 .
Loffeld J , Tokman M . Comparative performance of exponential, implicit, and explicit integrators for stiff systems of ODEs . J. Comput. Appl. Math . 2013 ; 241 : 45 – 67 .
Minchev B. V , Wright W. M . A Review of Exponential Integrators for First Order Semi-Linear Problems . 2005 . Technical Report 2/2005, Department of Mathematical Sciences, NorwegianUniversity of Science and Technology, N-7491 Trondheim, Norway .
Moler C , Van Loan C . Nineteen dubious ways to compute the exponential of a matrix, twenty–five years later . SIAM Rev . 2003 ; 45 : 3 – 49 .
Nair R. D , Machenhauer B . The mass-conservative cell-integrated semi-Lagrangian advection scheme on the sphere . Mon. Weather Rev . 2002 ; 130 : 649 – 667 .
Newman C. K . Exponential Integrators for the Incompressible Navier–Stokes Equations . 2003 ; Blacksburg, VA : Virginia Polytechnic Institute and State University . 101 . PhD Thesis .
Niesen J , Wright W. M . Algorithm 919: a Krylov subspace algorithm for evaluating the ϕ-functions appearing in exponential integrators . ACM Trans. Math. Software . 2012 ; 38 ( 3 ): 1 – 19 .
Pudykiewicz J. A . Numerical solution of the reaction-advection-diffusion equation on the sphere . J. Comput. Phys . 2006 ; 213 : 358 – 390 .
Pudykiewicz J.A . On numerical solution of the shallow water equations with chemical reactions on icosahedral geodesic grid . J. Comput. Phys . 2011 ; 230 : 1956 – 1991 .
Qaddouri A , Pudykiewicz J , Tanguay M , Girard C , Côté J . Experiments with different discretizations for the shallow-water equations on a sphere . Q. J. Roy. Meteorol. Soc . 2012 ; 138 : 989 – 1003 .
Saad Y . Iterative Methods for Sparse Linear Systems . 2003 ; SIAM, Philadelphia .
Tokman M . Efficient integration of large stiff systems of ODEs with exponential propagation iterative (EPI) methods . J. Comput. Phys . 2006 ; 213 : 748 – 776 .
Tokman M , Loffeld J , Tranquilli P . New adaptive exponential propagation iterative methods of Runge–Kutta type . SIAM J. Sci. Comput . 2012 ; 34 : A2650 – A2669 .
Van der Vorst H. A . An iterative solution method for solving f(a)x=b using Krylov subspace information obtained for the symmetric positive definite matrix a . J. Comput. Appl. Math . 1987 ; 18 : 249 – 263 .
Williamson D. L , Drake J. B , Hack J. J , Jakob R , Swarztrauber P. N . A standard test set for numerical approximation to the Shallow Water equations in spherical geometry . J. Comp. Phys . 1992 ; 102 : 211 – 224 .