A- A+
Alt. Display

# Nowcasting Meso-γ-Scale Convective Storms Using Convolutional LSTM Models and High-Resolution Radar Observations

## Abstract

Keywords:
How to Cite: Kim, D.-K. and Ushio, T., 2022. Nowcasting Meso-γ-Scale Convective Storms Using Convolutional LSTM Models and High-Resolution Radar Observations. Tellus A: Dynamic Meteorology and Oceanography, 74(1), pp.17–32. DOI: http://doi.org/10.16993/tellusa.37
Published on 22 Mar 2022
Accepted on 18 Feb 2022            Submitted on 18 Feb 2022

## 1. Introduction

Localized, meso-γ-scale convective storms often called convective showers occur frequently during summertime, bringing about heavy rains, flash floods or hails for a time period of less than about 1 h. Over the past decades, there has been progress in precipitation nowcasting techniques mostly based on radar echo extrapolation with retrieved velocity fields (Rinehart and Garvey, 1978; Dixon and Wiener, 1993; Germann and Zawadzki, 2002; Bowler et al., 2006; Ruzanski et al., 2011; Li et al., 2014; Bechini and Chandrasekar, 2017). Nowadays, many precipitation nowcasting systems such as McGill Algorithm for Precipitation Nowcasting by Lagrangian Extrapolation (MAPLE) (Germann and Zawadzki, 2002), Short-Term Ensemble Prediction System (STEP) (Bowler et al., 2006), Severe Weather Automatic Nowcast System (SWAN) (Hu et al., 2012) or Short-range Warning of Intense Rainstorms in Localized Systems (SWIRLS) (Li et al., 2014) are operational at different scales to prevent or mitigate disasters and losses caused by such storms.

Nevertheless, it remains quite challenging to nowcast thunderstorms which are highly variable in time and space, particularly in very short periods < 6 h. In general, the role of observational data is quite important for very short-term forecasts of 0~6 h (i.e., nowcast) and short-term forecasts of more than 6 h. Whereas, Numerical Weather Prediction (NWP) models have limitations in these time ranges due to initial spin-up issues until reaching a stable model state (Chung and Yao, 2020) and are not capable of reproducing initial target distributions correctly. To improve nowcasting of thunderstorms in less than 6 h, various observations of the atmospheric state from radars, satellites, or other platforms need to be properly and timely assimilated or blended into NWP models (Bowler et al., 2006; Sun et al., 2014). Doppler radar observations remain most useful for quantitative precipitation nowcasting since radars can provide high spatial and temporal resolution measurements of liquid or ice particles which backscatter intensities are proportional to the sum of the sixth powers of the diameters. Radar-based extrapolation nowcasts still appear to outperform very short-range NWP forecasts but also reveal limitations to predict the growth or decay of storms accurately. On the other hand, Otsuka et al. (2016) performed radar-based three-dimensional (3D) nowcasting experiments for intense rainfall areas by using motion vectors derived from the Tracking Radar Echoes by Correlation (TREC) technique (Rinehart and Garvey, 1978) and the Continuity of TREC vectors (COTREC) (Li et al., 1995) which is the TREC’s variational version to smooth erroneous vector fields. They noted that the 3D nowcasting model outperformed the 2D nowcasting since 3D motion vectors could capture changes of ascending and descending convective cores in different phases, leading to more accurate rainfall predictions.

As such, although there have been much efforts to improve extrapolation-based nowcasts by blending with operational NWP forecasts (Bowler et al., 2006; Chung and Yao, 2020), it is still needed to predict heavy rain intensities and areas more precisely under the complex atmospheric state at a given place. In the last decades, there have been increasing studies to use Artificial Neural Networks (ANN) for estimating precipitation from observations of spaceborne platforms (Bellerby et al., 2000; Tao et al., 2018; Sadeghi et al., 2019). A Convolutional Neural Network (CNN) model proposed by Sadeghi et al. (2019) showed higher accuracy in estimating rainfall rates and capturing a more precise spatial shape and rainfall peaks. Predictive studies that exploit CNN models with a Long Short-Term Memory (LSTM) network have also been done by detecting motions in images or videos in successive frames and generating future motion frames (Shi et al., 2015; Walker et al. 2015; Wang et al., 2017). Shi et al. (2015) indicated that the convolutional LSTM network is more efficient for processing sequential data and learning long-range dependencies than conventional Recurrent Neural Networks (RNNs). CNNs have an advantage for image processing by extracting important features from local neighborhood data via convolutional transformation (Miao et al., 2015).

However, a few studies that use CNN or ANN models for nowcasting local convective storms using radar observations have been conducted (Shi et al., 2015; Ayzel et al., 2019; Foresti et al., 2019; Kim et al., 2021; Wang et al., 2021). Foresti et al. (2019) noted that ANN predictions that take the term of growth and decay into account showed better performance compared to Lagrangian persistence which advects precipitation patterns along with motion vector fields. Shi et al. (2015) employed a convolutional LSTM network for 2D sequential radar images of a convective storm to test its nowcasting capability and indicated that the convolutional LSTM network outperforms a fully connected LSTM since the convolutional LSTM network could detect spatial features as filters slide over pixels of image. Higher relationships between pixels close to each other are learned during training and complex features can be captured by deep layers of convolution (Miao et al., 2015; Sadeghi et al., 2019). This study proposes two convolutional LSTM models in different architecture that use unique high-resolution radar datasets from short-lived, local convective storms. Nowcasting such small convective storms is examined with increasing lead time up to 10 min although there are various types of storms across scales. This paper is organized as follows. Section 2 describes radar datasets and a local convective storm event selected in this study. Section 3 explains the methodology and model design. Section 4 describes the experimental results with statistics and forecast skill scores and Section 5 presents the discussion and summary.

As a new-generation weather radar, Multi-Parameter Phased Array Weather Radar (MP-PAWR) was developed to make very rapid volume scans as well as accurate rainfall estimations by dual-polarization capability and was installed in Saitama city, Japan (N35.8615°, E139.6090°) in December 2017 (Figure 1). At X-band (9425 MHz), the MP-PAWR can complete a 3D volume scan of a precipitating system at every 30 s with a volume size of 60×60×15 km and produce polarimetric radar variables such as horizontal reflectivity (Zh), differential reflectivity (Zdr), differential phase Φdp, correlation coefficient (ρhv), etc. Such fine 3D volume scans of the MP-PAWR enable to resolve small changes in convective cells, which can contribute to better nowcasting by deep neural networks. The specification and coverage of the MP-PAWR are shown in Table 1 and Figure 2a, respectively. Radar reflectivity (Z) data are converted to Constant Altitude Plan Position Indicator (CAPPI) data. Such horizontal data assist in detecting radar echo patterns and obtaining horizontal motion vectors from an optical flow method which will be described in the following section. The CAPPI images at 1.5 km above ground level (AGL) were used as input to all of the models.

Figure 1

Photos of (a) the MP-PAWR and (b) the phased array antenna.

Table 1

Specification of the MP-PAWR.

PARAMETERS AUTO-01 AUTO-02

Antenna element Dual-polarized patch antenna

Frequency 9425 MHz

Azimuth resolution 1.2 degree

Elevation 0.0~60.0 degrees 0.0~90.0 degrees

Elevation resolution 0.5~1.0 degrees

Number of elevation 77114

Range resolution 150 m75 m

Time resolution for volume scan 60 sec30 sec

Pulse width 1 μs (short)1 μs (short)
74 μs (long)48 μs (long)

Differential reflectivity (Zdr)
Differential phase (Φdp)
Specific differential phase (Kdp)
Doppler velocity (V)
Spectrum width (W)

Figure 2

(a) Map of the MP-PAWR coverage (AUTO-02 mode) and (b) the observed Z field at 22:49:55 JST on July 24, 2018, which is an initialized time (t0) for all model forecasts. Here, Japan Standard Time (JST) = UTC + 9. In (b), the red box indicates the experiment domain of the CNN models (40 × 50 km) and the arrows that denote motion vectors are obtained from the optical flow method.

The case selected in this study is a localized convective rain event that occurred on 24 July 2018 over the Tokyo metropolitan area. The MP-PAWR observed a series of meso-γ-scale storms from 2100 to 2359 JST. Here, Japan Standard Time (JST) = UTC + 9. During this period, heavy rainfall with a Z maximum over 60 dBZ occurred as a few convective cells with the largest length of ~20 km move towards the south. They showed dynamic changes in three stages: cumulus stage, mature stage, and dissipating stage during their lifetime (<1 h). Figure 2b shows an observed Z field on 22:49:55 JST which is an initialized time for all model forecasts in this study. Several cells are shown with motion vectors obtained from an optical flow method. The domain size of analysis is 40×50 km (161×201 pixels). The x and y pixel resolution is 250 m. The data period is 2 h from 2200 to 2359 JST at 30 s time interval and thus the total number of radar data files is 242.

## 3. Methodology and models

### 3.1. Optical flow method and advection forecast model

In the past, various algorithms such as a cross-correlation method (Rinehart and Garvey, 1978), centroid method (Dixon and Wiener 1993), mean absolute difference method (Otsuka et al., 2016), Fast Fourier Transform (Ruzanski et al., 2011) have been used to estimate motion vectors between two consecutive fields. An optical flow method, first proposed by Horn and Schunck (1981) and Lucas and Kanade (1981), has been used in various applications such as segmentation, motion detection and tracking, and video prediction (Fernando et al., 2015; Ayzel et al., 2019). In this study, the optical flow method of Lucas and Kanade (1981) was used to estimate an apparent radar echo motion between two consecutive Z fields. The optical flow method assumes that flow is essentially constant over a small region at a small time interval. With the constraint of the same velocity within all the pixels of the region, optical flow equations can be solved from the least squares error criterion using the neighboring pixels in the considered region.

We develop a 2D Advection Forecast Model (hereafter, AFM) which is similar to Otsuka et al. (2016) to compare with forecasts from the ConvLSTM models proposed in this study. First, the AFM begins with using the 2D continuity equation as follows:

(1)
$\frac{\mathrm{dZ}}{\mathrm{dt}}=\frac{\partial Z}{\partial t}+u\frac{\partial Z}{\partial x}+v\frac{\partial Z}{\partial y}$

where Z is the radar reflectivity (dBZ), dZ/dt is a source-sink term related to growth and decay. ∂Z/∂t is the local rate of change and u and v is the advection vector which is estimated from the optical flow method. If dZ/dt is zero (i.e., no source-sink of Z), the Lagrangian persistence nowcast can be derived by the advection scheme in the AFM using the following forms:

(2)
(3)

where τ is the forecast lead time, m is the spatial location at (X, Y), α is the estimated displacement vector, and U is the velocity vector, U = [u, v]. The AFM’s nowcast is based on Lagrangian persistence which assumes that the source-sink term (i.e., growth and decay) in the conservation equation of reflectivity is zero. Then it implements the optical flow method (Lucas and Kanade, 1981) to estimate a motion field of radar echo and applies an advection scheme to generate an extrapolation-based nowcast (Foresti et al., 2019; Germann and Zawadzki, 2002). For a given lead time, Z and U values can be predicted following the processes shown in Figure 3. There are four steps of the optical flow method, upwind difference method (Versteeg and Malalasekera, 2007), Euler method, and Runge-Kutta method. First, u and v motion vectors are obtained from the optical flow method between consecutive Z images at t and tt (Δt is the time interval). Z and U values are used in the upwind difference equation to calculate their slopes, respectively. Then using the Euler method which is the first-order derivative, Z and U values at t+Δt are predicted. To improve accuracy, we take one more derivative, which is the second-order Runge-Kutta method, over the obtained slopes of Z and U with respect to time. Then, a forecast is finally generated for a desired lead time. The Runge-Kutta method used in the final step is shown as below:

Figure 3

Flowchart of the AFM nowcast.

(4)

where f is a change rate of Z at time t or t+Δt. The proposed models forecast Z values, not rainfall rates and Z is referred to as a proxy of rainfall.

### 3.2. Design of convolutional LSTM models

In this study, a Convolutional LSTM Model (CLM) and Encoder-Decoder Model (EDM) are proposed for sequence-to-sequence prediction of rainfall in very short-term range. Both the models use a convolutional LSTM (hereafter, ConvLSTM) network in their architecture, while the EDM applies it in the reduced latent space. LSTM which is a specific RNN is efficient for keeping track of long-range dependencies between time sequences of input data. As shown in Figure 4a, the EDM follows an encoder-decoder architecture. A three-layer ConvLSTM is implemented in the intermediate state between an encoder and decoder part. The input tensor has four dimensions of [sample (batch size), time step, nx, ny], where nx and ny are the number of pixels in the x and y direction of an input image. Here, sample means the total number of training samples and a batch size determines the number of samples that are fed into the ConvLSTM models as a batch to update gradient at one time. Time step is the number of temporal sequences contained in each sample. The encoder processes sequential input data and extracts information using multiple convolutional filters. Through the convolutional layers, the shape of the input data reduces by half as a result of downsampling (stride = 2). Then the encoder passes the encoded features to the decoder. The decoder decodes the feature vectors using filters that perform an inverse convolution while upsampling the shape of the decoded feature map across the layers. Maxpooling is not used. In the end, a cropping layer is used to match the input shape. Finally, a single-sequence prediction is generated for a given lead time.

Figure 4

Schematics of (a) the EDM and (b) the CLM architecture. A bolded number indicates the number of filters used in the convolutional LSTM layer. In this study, the number of time steps for the EDM is 7 and for the CLM is 20. The activation function of ReLU was used in all the layers of convolution.

Figure 4b shows the schematic diagram of the CLM architecture with details about shapes and filter sizes. Like the EDM, the CLM employs the stacked ConvLSTM layers. Many-to-many sequence prediction is conducted for the CLM. The many-to-many sequence prediction means that the input has images at multiple time steps and the output is predictions at the same number of time steps. Whereas, the many-to-one sequence prediction means that the input has a sequence of images and the output is a single-sequence prediction. As noted, the many-to-one sequence prediction is performed for the EDM. Also, padding is used in the ConvLSTM layers to have the same dimensions as that of the input data. The ConvLSTM network can capture spatial features and render them to a next time step as convolution filters slide over a 2D field domain. For this, we used a ‘ConvLSTM2D’ function in Keras with TensorFlow backend in Python 3.7.3 version. The LSTM has a memory cell and forget gate that determines how much information it will forget or hold from a previous cell. In Figure 5, the schematic of the LSTM is shown. An LSTM unit consists of various components shown in the following equations:

Figure 5

Schematic of an LSTM layer with regard to Eq. (5). The superscript l is the number of layer and the subscript t is time of sequences. Please refer to the text for more details.

(5)
$\begin{array}{c}{f}_{t}=\sigma \left({W}_{\mathrm{xf}}\ast {x}_{t}+{W}_{\mathrm{hf}}\ast {h}_{t-1}+{b}_{f}\right)\\ {i}_{t}=\sigma \left({W}_{\mathrm{xi}}\ast {x}_{t}+{W}_{\mathrm{hi}}\ast {h}_{t-1}+{b}_{i}\right)\\ {o}_{t}=\sigma \left({W}_{\mathrm{xo}}\ast {x}_{t}+{W}_{\mathrm{ho}}\ast {h}_{t-1}+{b}_{o}\right)\\ \stackrel{˜}{{C}_{t}}=\mathrm{tanh}\left({W}_{\mathrm{xc}}\ast {x}_{t}+{W}_{\mathrm{hc}}\ast {h}_{t-1}+{b}_{c}\right)\\ {C}_{t}={f}_{t}\circ {C}_{t-1}+{i}_{t}\circ {\stackrel{˜}{C}}_{t}\\ {h}_{t}={o}_{t}\circ \mathrm{tanh}\left({C}_{t}\right),\end{array}$

where input vector xt, hidden state vector ht, cell state vector Ct, input gate it, output gate ot, and forget gate ft at time t. * is a convolution operation, ° is the Hadamard product, W is a weight matrix used in the convolution, and b is the bias. The subscripts i, f, and o denote each gate, respectively and c is the cell output. σ and tanh indicate a sigmoid activation function that has a range of [0,1] and a hyperbolic tangent activation function that has a range of [-1,1], respectively. As similar to Shi et al. (2015), the “ConvLSTM2D” function was applied to consecutive 2D radar echo fields for precipitation nowcasting. A convolution operation between a radar image and a filter (i.e., kernel) is performed. The filter in a matrix applies to all the pixels of an image and returns a feature map in each layer. At the beginning of training, filter weights are randomly initialized and updated during training. Thus, the final output can be different even if there is no change in the model setup. The number of filters used in the convolutional LSTM layer is 48 for the EDM and 40, 21 for the CLM and the kernel size is 3×3. The total number of trainable parameters of the EDM is smaller than that of the CLM since the CLM keeps input dimensions across layers. Thus it takes relatively less time and memory for the EDM to finish its run, compared to the CLM.

### 3.3. Datasets

In this study, the total 2-h datasets of Z images were chronologically labeled from 1 to 242. Then they were split into train (40%) test (20%), and validation set (40%) in time order. The latest samples were used for creating the test and validation dataset. Relative to the size of the training dataset, we increased the proportion of the validation dataset a little more for a proper turning of hyperparameters since the total dataset is not large and limited in size. The optimal values of hyperparameters are shown in the following section. To avoid overfitting of the training dataset, we checked model performance by keeping track of train and validation’s accuracy and losses as the number of epochs increases. The test data unseen to the trained model were used without overlap with the train data. The test data right after the last label of the train data were used to make predictions with an expectation of higher forecast skill. Since the train and test datasets are sequential in time, forecast skill would be relatively high because there are strong temporal relations particularly for nowcasting such fast-varying convective storms. In this study, lead times considered are 2.5, 5, 7.5, and 10 min and the number of time steps in a sample is 7 for the EDM and 20 for the CLM, respectively.

Forecast lead time varies with a life cycle and scale of storm. Each scale has a maximum lead time at which storm variations can be predicted most (Germann and Zawadzki, 2004). A selected case is localized convective storms with the largest length of ~20 km (Figure 2) (i.e., meso-γ-scale). Considering that they were short-lived in less than an hour, we designed models to learn storm variations in a relatively mature stage, which enables us to have a relatively more amount of data for both training and validation. The high-resolution Z dataset was input to all of the models. Before training, we normalized the intensities of Z to be ranged from 0 to 1 by using Znormalized = (ZZmin)/(ZmaxZmin).

### 3.4. Hyperparameters

To find the optimal hyperparameters, various values have been tested on a trial and error basis through numerous model runs between the EDM and CLM independently. From the train and validation data, model accuracy and loss values per epoch were checked to avoid overfitting until they stay relatively constant. Besides, statistics and skill scores were also checked after prediction. Below, the value of hyperparameter before parenthesis is for the EDM and in parenthesis is for the CLM. To reduce variances and thus have an effect of generalization, a dropout rate of 0.2 (0.1) was used to drop connections from a layer to the next. Batch normalization is also used to stabilize training and dropout with this can prevent overfitting while training the datasets. An activation function of the rectified linear unit (ReLU) which is represented by f(x) = max(0, x) has been used in all the layers of convolution (see Figure 4). Learning rate which is used during gradient descent was set to 0.00045 (0.0001). When the learning rate is small, training a model becomes slower and this mitigates loss gradients from oscillating or being divergent. However, too small of a learning rate during training can result in a ConvLSTM model that does not converge and can retain a large loss. The optimization algorithm used during gradient descent is Adaptive Momentum Estimation (ADAM) and the loss function of mean squared error (MSE) is employed to minimize losses. The total number of epochs is 2000 (2800) and the batch size is 2 (10). The batch size was chosen as small as possible to avoid an out of memory error.

## 4. Experiment results

### 4.1. Statistic metrics

Model predictions are statistically assessed with lead time. Correlation coefficient (r), root mean square error (RMSE), mean bias, and standard deviation (STD) are calculated between observed and predicted Z fields and are summarized in Table 2. These metrics are calculated for all pixels in an observed and predicted field above a certain Z threshold. In this study, two Z criteria are used to classify the rain area into two categories. Comparisons of the observed and predicted Z fields and their scatterplots with lead time are shown in Figs. 6 and 7, respectively. At the initialized time (22:49:55 JST) (Figure 2b), two localized convective cells were observed as they move towards the south. One is located on the upper side and the other is on the lower side of the domain.

Table 2

A summary of statistics of each model with lead time.

LEAD TIME MODEL CORRELATION COEFFICIENT MEAN BIAS ROOT MEAN SQUARED ERROR STANDARD DEVIATION

2.5 min CLM 0.88 –0.37 6.84 6.83

EDM 0.73 –2.4 8.92 8.6

AFM 0.87 0.9 7.0 6.94

5 min CLM 0.69 0.3 9.51 9.51

EDM 0.61 –3.27 10.55 10.03

AFM 0.65 0.86 10.56 10.53

7.5 min CLM 0.58 –0.22 10.41 10.41

EDM 0.51 –1.22 10.77 10.71

AFM 0.46 2.01 12.63 12.47

10 min CLM 0.56 –0.74 8.96 8.93

EDM 0.48 –2.08 11.66 11.47

AFM 0.36 2.28 14.05 13.86

In Figure 6, it is seen that there is no large variation in the observed Z fields for this 10-min range. The CLM could represent the detailed patterns within the convective cells at the lead times of 2.5 and 5 min. From 2.5 to 7.5 min, the EDM produced the relatively more similar cell shapes but in lower intensities. On the whole, the Z intensities from both the CLM and EDM tended to weaken with lead time. The predicted rain areas showed more change in shape, compared to those from the AFM which almost maintained with lead time. At the lead time of 2.5 min (22:52:24 JST), all the model predictions show very high correlations with the true observations. The CLM and AFM showed the quite high r values of 0.88 and 0.87, respectively. The CLM also had the lowest RMSE and STD at 2.5 min (see Table 2). The lowest r of 0.73 was found in the EDM. At the lead time of 5 min (22:54:53 JST), the CLM again showed the highest r of 0.69 with the lowest RMSE and STD. In particular, the EDM showed the largest negative mean bias of –3.27 dB in this 10-min period, indicating the Z underestimation and such negative biases existed until 10 min (Table 2). By contrast, the AFM showed positive mean biases at all lead times. After 7.5 min, skewness in the Z distributions became more notable in the scatterplots of the CLM and EDM as shown in Figure 7.

Figure 6

Comparison of the observed (OBS) and predicted Z fields from the CLM, EDM, and AFM with lead time.

Figure 7

Scatterplots with correlation coefficients between observations and forecasts with lead time for the total rain area category of Z > 10 dBZ.

At the lead time of 10 min, the larger negative mean bias of –0.74 dB and –2.08 dB is found in the CLM and EDM, respectively. The reason for the negative biases from both the models is not clear yet. It might be related to a choice of loss function which can affect predicted results by a different speed of convergence to reach minima of losses or an issue of normalization such as input data normalization or batch normalization (for mini-batches) which has effects of regularization and convergence during training. Also, a relatively smaller proportion of higher Z values in the training dataset may contribute to underpredicting the strength of convective rain. A weighted MSE loss function could be a choice for the models to improve the skill in high Z areas. The AFM showed the highest RMSEs and STDs at all the lead times as shown in Table 2. The last row in Figure 7 shows that the data points of high Z > 30 dBZ are more densely distributed along the 1:1 line at the lead times of 7.5 and 10 min. This is well contrasted with the scatterplots of the CLM and EDM that show the skewed distributions by the negative biases as noted above. This might be one reason to cause relatively lower skill scores of the CLM and EDM for convective rain (Z > 35 dBZ) than the AFM since the amount of Z data greater than 35 dBZ becomes fewer. As expected, the degree of scatter has degraded more severely with lead time in all the model forecasts (Figure 7).

In Figure 8, the CLM showed the highest r and lowest RMSE values for all the lead times, implying its better nowcasting performance. The AFM showed the lowest r of 0.46 at 7.5 min and 0.36 at 10 min. Thus, it is the AFM that showed the largest change in the correlation coefficient r from 0.87 to 0.36 in this 10-min period. Importantly, the EDM showed the higher r and lower RMSE values even at 7.5 and 10 min, compared to the AFM. This suggests that the EDM has better skill than the AFM for relatively long-range predictions. Thus, it can be noted that the difference in forecast skill between the ConvLSTM models and extrapolation-based model becomes more pronounced with increased lead time. The ConvLSTM models could learn the fast-evolving features of rain areas at consecutive frames through deep neural networks. By contrast, the AFM advected the existing fields of precipitation southward, almost without changing shape and intensity with lead time (see the last row in Figure 6). This contributed to keeping the high Z intensities (>30 dBZ), compared to the CLM and EDM which underpredicted the Z intensities as addressed above (Figure 7).

Figure 8

Time series of the correlation coefficients (solid) and RMSEs (dotted) of the CLM, EDM, and AFM for the total rain area of Z > 10 dBZ. A legend for the symbols of the models is shown in the left-bottom corner.

### 4.2. Forecast evaluations

Performances of the proposed models are evaluated with lead time using three forecast metrics of Probability of Detection (POD), False Alarm Ratio (FAR), and Critical Success Index (CSI) which are common to be used for rainfall nowcasting and verification (Schaefer, 1990). The forecast metrics, ranged from 0 to 1, are calculated from the equations below:

(6)
$\mathrm{POD}=\frac{\mathrm{Hits}}{\mathrm{Hits}+\mathrm{Misses}}$
(7)
(8)

The POD is defined as the ratio of the number of correct forecasts as rain (“hits”) to the total number of observed events (“hits + misses”). The FAR is the ratio of the number of incorrect forecasts as rain (“false alarms”) to the total number of forecast events (“hits + false alarms”). The CSI is also denoted as Threat Score (TS) which is the ratio of the number of correct forecasts as rain to the total number of observed and forecast events. A perfect score for the POD and CSI is 1 and for the FAR is 0. In analyzing these metrics, we made two Z categories using different Z criteria that separate heavy rain areas from a given total rain area. Thus, one is the total rain area that covers all stratiform and convective rain with Z > 10 dBZ. 10 dBZ corresponds to approximately 0.1 mm h–1 in rainfall rate (R) when the relation of z = 300R1.6 is assumed, where z is radar reflectivity factor (mm6 m–3) and rainfall rates below this are considered as “no rain”. The other is the category of convective rain area with Z > 35 dBZ (Dixon and Wiener, 1993). Note that the CSI and POD are values related to solely a size of rain area unlike the correlation coefficient r which is a function of not only an area but also intensity. Thus, they need to be analyzed together with caution.

For the two categories, the skill scores of the CSI, POD, and FAR with lead time are summarized in Table 3 and in the time series in Figure 9. It is shown that the CSI scores between the models in the total rain area category show no much difference with lead time in Figure 9a. The CSI scores of the persistence method were much lower than those of the other models. Forecast skill scores of the Eulerian persistence method (hereafter, persistence method) were obtained by assuming that the current Z field is taken to be a forecast for any lead time (Mandapaka et al., 2012) and are also shown in Table 3. In Figure 9a, the CLM CSIs are higher than the AFM CSIs for all the lead times. Also, the CLM CSIs are a little higher than the EDM CSIs, showing the higher r and lower RMSE values (Figure 8). Thus, the CLM is regarded to perform best for predictions in this category of the total rain area. For 2.5 and 5 min in Table 3, the EDM CSIs are almost the same as the AFM CSIs. For 7.5 and 10 min, however, the EDM CSIs are slightly greater than the AFM CSIs with the higher r and lower RMSE values as shown in Figure 8.

Table 3

A summary of the nowcasting metrics with lead time for the two categories of Z >10 dBZ and Z > 35 dBZ (in the bracket).

LEAD TIME MODEL POD FAR CSI (THREAT SCORE)

2.5 min CLM 0.91 (0.69) 0.08 (0.27) 0.85 (0.55)

EDM 0.88 (0.35) 0.07 (0.34) 0.83 (0.30)

AFM 0.91 (0.76) 0.09 (0.33) 0.83 (0.55)

Persistence 0.88 (0.62) 0.14 (0.35) 0.78 (0.46)

5 min CLM 0.90 (0.52) 0.14 (0.46) 0.78 (0.36)

EDM 0.84 (0.17) 0.10 (0.49) 0.77 (0.15)

AFM 0.87 (0.49) 0.15 (0.51) 0.76 (0.32)

Persistence 0.82 (0.43) 0.20 (0.62) 0.68 (0.25)

7.5 min CLM 0.87 (0.29) 0.17 (0.49) 0.74 (0.23)

EDM 0.84 (0.10) 0.17 (0.62) 0.72 (0.09)

AFM 0.87 (0.40) 0.20 (0.64) 0.71 (0.23)

Persistence 0.76 (0.35) 0.23 (0.66) 0.62 (0.21)

10 min CLM 0.81 (0.21) 0.17 (0.42) 0.70 (0.16)

EDM 0.81 (0.07) 0.17 (0.58) 0.69 (0.07)

AFM 0.85 (0.35) 0.25 (0.70) 0.66 (0.19)

Persistence 0.71 (0.31) 0.27 (0.71) 0.56 (0.17)

Figure 9

Time series of the scores of (a) CSI, (b) POD, and (c) FAR from the CLM, EDM, AFM, and persistence method. The solid line is for the total rain area and the dashed for the convective rain area. The CLM is in red, the EDM in black, the AFM in blue, and the persistence method in green.

In Figs. 9b and c, POD and FAR variations show that their differences between the models are smallest at 2.5 min and tend to become larger after 5 min. In Figure 9c, it is noteworthy that the EDM FARs are smaller than the AFM FARs at all the lead times. This contributed to increasing the EDM CSI scores particularly at 7.5 and 10 min. Therefore, the forecast skill of the AFM was about equal to that of the EDM at the early lead time. However, the EDM showed the better skill at the longer lead times (7.5 and 10 min) than the AFM which skill became poorer with increased FARs and RMSEs. The EDM PODs, which do not reflect false alarms (Eq. (6)), are lower than the AFM PODs for all the lead times as shown in Figure 9b. It is the persistence method that had the lowest POD and highest FAR values at all the lead times.

Next, model performances for the convective rain area are assessed with skill scores shown in the bracket in Table 3. It is seen in Figure 9a that the CLM CSIs are equal to or slightly higher than the AFM CSIs at 2.5 ~ 7.5 min but lowered a bit below the AFM CSI at 10 min. The more notable decrease in the CLM PODs is shown at 7.5 and 10 min in Figure 9b. This indicates that the nowcasting skill of the CLM started to be as high as that of the AFM but got poorer at 10 min. With respect to the EDM, the AFM POD and CSI scores were more than a double of the EDM POD and CSI scores at all the lead times. Even the persistence method’s PODs and CSIs were much higher than the EDM PODs and CSIs. Thus, the EDM is regarded to be the worst model for the convective rain category. It is considered that the negative biases in Z (i.e., underestimation) seem to have contributed to decreasing the EDM CSI scores because the number of data points above 35 dBZ became fewer due to the biases.

This resulted in the marked decrease in the CLM CSI and POD scores at 10 min with the largest negative bias (-0.74 dB) as noted in section 4.1. Eventually, the AFM predicted the convective rain areas fairly well in contrast to the lower skill for the total rain area. The higher skill of the AFM at longer lead times is because it has advected precipitation fields, almost keeping cell shapes and high Z intensities as noted earlier. That is, the high Z values helped to maintain the AFM’s high skill scores until 10 min. In Figure 9c, it is noted that the CLM had the lowest FAR scores for the convective rain area at all the lead times, contributing to somewhat an increase in the CSI scores. Overall, the AFM and persistence method showed similar variations to each other. Certainly, the persistence method showed the lower skill than the AFM at all the lead times since the AFM utilized the Lagrangian advection scheme and optical flow method.

## 5. Discussion and summary

In this study, we proposed the two ConvLSTM models and examined to nowcast the meso-γ-scale, localized convective storms using the unique high-resolution Z datasets from the MP-PAWR. The predicted Z fields with lead time up to 10 min were compared with those from the AFM which utilized the Lagrangian advection scheme with motion vectors from the optical flow method. For the total rain area, the results indicate that the CLM outperformed both the AFM and EDM, producing the highest CSI and r values and lowest RMSEs at all the lead times. Thus, the CLM is found to be the best nowcasting model in this rain area category. The higher performance is probably related to the CLM structure that employed the three stacked ConvLSTM layers and filters without reducing the size of the input dimensions. This helped to keep better tracking of spatiotemporal features in detail between time steps of observations. By contrast, the EDM reduces the dimension to its half twice by a stride convolution, which may lead to some losses of spatial information. Instead, the EDM had the merit to reduce computing time and memory, still capable of producing comparable results, due to the reduced number of trainable parameters. The AFM performed fairly well at 2.5 ~ 5 min but afterward, its skill became poorer with increasing FAR and RMSE values. The EDM generated the relatively better forecasts at 7.5 and 10 min.

On the other hand, the performance of the ConvLSTM models relies on the 2D convolutional transformation by multiple filters with weights as updated across each convolution layer. By this merit, the CLM that keeps their input dimensions through the layers could predict better the total rain areas and intensities, showing the higher correlation coefficient r and CSI values. However, the CLM has underpredicted the areas and intensities of convective rain, producing the more negative Z biases at the longer lead times, compared to the AFM. The EDM also showed the same limitation by resulting in the negative biases on the range of higher Z, which led to the lower CSI and POD scores in this convective rain category. In addition to the issues of regularization or loss convergence, the negative biases may also be related to the smaller proportion of high Z areas in our training dataset and thus the models may have focused to improve the skill in relatively weaker Z areas (Wang et al., 2021). There are many more factors that affect the nowcasting skill of ConvLSTM models such as training data size, architecture, optimal hyperparameter, overfitting, and so on. Their nowcasts can be further influenced by rainfall type, lifetime, data resolution, and model domain size, etc. Germann and Zawadzki (2002) reported that predictability tends to increase with scale of thunderstorms. Higher variability in prediction is expected for short-lived, small convective storms, which is our case of this study.

A potential limitation of our study is that the proposed ConvLSTM models have not been evaluated with storms of larger scales (e.g., mesoscale convective systems). In this study, we used only the datasets from a series of small convective storms. The training data was not enough from the 2-h observations but the storm variations were resolved more finely at high temporal resolution (30 s) of the MP-PAWR. Such fine resolution data provides high potentials for nowcasting such small short-lived storms since short-period sequential data at high temporal resolution is relatively more critical than other large-scale prediction studies. From the given training data, the ConvLSTM network could learn spatial features and temporal correlations in sequential 2D images and predict Z fields comparably up to 10 min. The maximum lead time of 10 min is rather short but was able to be exploited considering the storm’s life cycle less than about 1 h in this study. Yet, this study is preliminary, still on the way to improve the ConvLSTM models to better nowcast meso-γ-scale convective storms using the high-resolution radar data. For future work, we will evaluate these models for other storm events with a larger size of training data (especially more samples in high Z areas). In regard to estimating and predicting up- and downward motion vectors closely related to the cell’s growth or decay, a 3D ConvLSTM model extended from the 2D ConvLSTM would be an alternative in the future. It is also worth to integrate multi-source meteorological data (such as re-analysis data) with the radar data as input to the 2D and 3D ConvLSTM models and test their nowcasting skill in the future.

## Acknowledgements

This research was supported by Cross-ministerial Strategic Innovation Promotion Program (SIP).

## Competing Interests

The authors have no competing interests to declare.

## References

1. Ayzel, G, Heistermann, M and Winterrath, T. 2019. Optical flow models as an open benchmark for radar-based precipitation nowcasting (rainymotion v0.1). Geosci. Model Dev, 12: 1387–1402. DOI: https://doi.org/10.5194/gmd-12-1387-2019

2. Bechini, R and Chandrasekar, V. 2017. An Enhanced Optical Flow Technique for Radar Nowcasting of Precipitation and Winds. Journal of Atmospheric and Oceanic Technology, 34(12): 2637-2658. DOI: https://doi.org/10.1175/JTECH-D-17-0110.1

3. Bellerby, T, Todd, M, Kniveton, D and Kidd, C. 2000. Rainfall Estimation from a Combination of TRMM Precipitation Radar and GOES Multispectral Satellite Imagery through the Use of an Artificial Neural Network. J. Appl. Meteor., 39: 2115–2128. DOI: https://doi.org/10.1175/1520-0450(2001)040<2115:REFACO>2.0.CO;2

4. Bowler, N, Pierce, C and Seed, A. 2006. STEPS: A probabilistic precipitation forecasting scheme which merges an extrapolation nowcast with downscaled NWP. Quart. J. Roy. Meteor. Soc., 132: 2127–2155. DOI: https://doi.org/10.1256/qj.04.100

5. Chung, K and Yao, I. 2020. Improving Radar Echo Lagrangian Extrapolation Nowcasting by Blending Numerical Model Wind Information: Statistical Performance of 16 Typhoon Cases. Mon. Wea. Rev., 148: 1099–1120. DOI: https://doi.org/10.1175/MWR-D-19-0193.1

6. Dixon, M and Wiener, G. 1993. TITAN: Thunderstorm identification, tracking, analysis, and nowcasting—A radar-based methodology. J. Atmos. Oceanic Technol., 10: 785–797. DOI: https://doi.org/10.1175/1520-0426(1993)010<0785:TTITAA>2.0.CO;2

7. Fernando, B, Gavves, E, José Oramas, M, Ghodrati, A and Tuytelaars, T. 2015. Modeling video evolution for action recognition. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, 2015, 5378–5387. DOI: https://doi.org/10.1109/CVPR.2015.7299176

8. Foresti, L, Sideris, IV, Nerini, D, Beusch, L and Germann, U. 2019. Using a 10-Year Radar Archive for Nowcasting Precipitation Growth and Decay: A Probabilistic Machine Learning Approach. Wea. Forecasting, 34: 1547–1569. DOI: https://doi.org/10.1175/WAF-D-18-0206.1

9. Germann, U and Zawadzki, I. 2002. Scale-Dependence of the Predictability of Precipitation from Continental Radar Images. Part I: Description of the Methodology. Mon. Wea. Rev., 130: 2859–2873. DOI: https://doi.org/10.1175/1520-0493(2002)130<2859:SDOTPO>2.0.CO;2

10. Germann, U and Zawadzki, I. 2004. Scale dependence of the predictability of precipitation from continental radar images. Part II: Probability forecasts. J. Appl. Meteor., 43: 74–89. DOI: https://doi.org/10.1175/1520-0450(2004)043<0074:SDOTPO>2.0.CO;2

11. Horn, B and Schunck, B. 1981. Determining optical flow, Artificial Intelligence, 17(1-3): 185–203. DOI: https://doi.org/10.1016/0004-3702(81)90024-2

12. Hu, S, Luo, C, Huang, X, Li, H and He, R. 2012. Comparisons of one hour precipitation forecast between the radar extrapolation and mesoscale numerical model (in Chinese). Meteor. Mon., 38(3): 274–280.

13. Kim, D-K, Suezawa, T, Mega, T, Kikuchi, H, Yoshikawa, E and co-authors. 2021. Improving precipitation nowcasting using a three-dimensional convolutional neural network model from multi parameter phased array weather radar observations. Atmos. Res. 262: 105774. DOI: https://doi.org/10.1016/j.atmosres.2021.105774

14. Li, L, Schmid, W and Joss, J. 1995. Nowcasting of motion and growth of precipitation with radar over a complex orography. J. Appl. Meteor., 34: 1286–1300. DOI: https://doi.org/10.1175/1520-0450(1995)034<1286:NOMAGO>2.0.CO;2

15. Li, P, Wong, W, Cheung, P and Co-authors. 2014. An overview of nowcasting development, applications, and services in the Hong Kong Observatory. J Meteorol Res, 28: 859–876. DOI: https://doi.org/10.1007/s13351-014-4048-9

16. Lucas, BD and Kanade, T. 1981. An Iterative Image Registration Technique with an Application to Stereo Vision. In Proceedings of the DARPA Image Understanding Workshop, Washington, DC, USA, April 1981, 674–679.

17. Mandapaka, PV, Germann, U, Panziera, L and Hering, A. 2012. Can Lagrangian extrapolation of radar fields be used for precipitation nowcasting over complex Alpine orography? Wea. Forecasting, 27: 28–49. DOI: https://doi.org/10.1175/WAF-D-11-00050.1

18. Miao, C, Ashouri, H, Hsu, K-L, Sorooshian, S and Duan, Q. 2015. Evaluation of the PERSIANN-CDR daily rainfall estimates in capturing the behavior of extreme precipitation events over China. J. Hydrometeor., 16: 1387–1396. DOI: https://doi.org/10.1175/JHM-D-14-0174.1

19. Otsuka, S and Coauthors. 2016. Precipitation Nowcasting with Three-Dimensional Space–Time Extrapolation of Dense and Frequent Phased-Array Weather Radar Observations. Wea. Forecasting, 31: 329–340. DOI: https://doi.org/10.1175/WAF-D-15-0063.1

20. Rinehart, RE and Garvey, ET. 1978. Three-dimensional storm motion detection by conventional weather radar. Nature, 273: 287–289. DOI: https://doi.org/10.1038/273287a0

21. Ruzanski, E, Chandrasekar, V and Wang, Y. 2011. The CASA Nowcasting System. J. Atmos. Oceanic Technol., 28: 640–655. DOI: https://doi.org/10.1175/2011JTECHA1496.1

22. Sadeghi, M, Asanjan, AA, Faridzad, M, Nguyen, P, Hsu, K, Sorooshian, S and Braithwaite, D. 2019. PERSIANN-CNN: Precipitation Estimation from Remotely Sensed Information Using Artificial Neural Networks–Convolutional Neural Networks. J. Hydrometeor., 20: 2273–2289. DOI: https://doi.org/10.1175/JHM-D-19-0110.1

23. Schaefer, JT. 1990. The critical success index as an indicator of warning skill. Weather Forecast, 5: 570–575. DOI: https://doi.org/10.1175/1520-0434(1990)005<0570:TCSIAA>2.0.CO;2

24. Shi, X, Chen, Z, Wang, H, Yeung, D-Y, Wong, W-K and Woo, W-C. 2015. Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting. Advances in neural information processing systems (NIPS), 802–810.

25. Sun, J, Xue, M, Wilson, JW and Coauthors. 2014. Use of NWP for Nowcasting Convective Precipitation: Recent Progress and Challenges. Bull. Amer. Meteor. Soc., 95: 409–426. DOI: https://doi.org/10.1175/BAMS-D-11-00263.1

26. Tao, Y, Hsu, K, Ihler, A, Gao, X and Sorooshian, S. 2018. A Two-Stage Deep Neural Network Framework for Precipitation Estimation from Bispectral Satellite Information. J. Hydrometeor., 19: 393–408. DOI: https://doi.org/10.1175/JHM-D-17-0077.1

27. Walker, J, Gupta, A and Hebert, M. 2015. Dense Optical Flow Prediction from a Static Image. DOI: https://doi.org/10.1109/ICCV.2015.281

28. Wang, C, Wang, P, Wang, P, Xue, B and Wang, D. 2021. Using Conditional Generative Adversarial 3-D Convolutional Neural Network for Precise Radar Extrapolation. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 14: 5735-5749. DOI: https://doi.org/10.1109/JSTARS.2021.3083647

29. Wang, Y, Long, M, Wang, J, Gao, Z and Yu, PS. 2017. PredRNN: Recurrent neural networks for predictive learning using spatiotemporal LSTMs. Advances in Neural Information Processing Systems (NIPS), 879–888.

30. Versteeg, HK and Malalasekera, W. 2007. An introduction to computational fluid dynamics: the finite volume method. Harlow, England: Pearson Education Ltd.