1.

## Introduction

Drought, the highest-ranked natural hazard, is the primary source of severe destructive effects on the planet (White, 1974). Its sustained consequences lead to the sterilisation of agricultural land and the initiation of diseases. Factors associated with a higher risk of drought are the long duration of rainfall, high rate of evapotranspiration, low relative humidity, high temperature, and high wind speed (Edwards et al., 2009). Moreover, many other environmental and ecological factors are also responsible for the recurrent occurrences of drought hazard. However, drought intensity, duration, and severity may vary from region to region. In recent decades, almost all the developing countries are facing water shortage due to continued expansion in agriculture, industrial, and energy sectors. Consequently, a perpetual increase in the difference between water demand and renewable freshwater resources will lead to significant social and economic issues.

However, to overcome the severe effect of the frequent occurrence of drought, forecasting plays a significant role in drought mitigation policies. In previous research, several forecasting and assessment tools for the characterisation of drought regions and the quantification of drought risk have been established. Several studies have been conducted in the field of hydrology and climatology for the assessment and modelling of drought classes for different regions in the world. Drought indices are one of the most used tools for the assessment and quantification of drought risk. A range of different drought indices involving different climatic parameters has been developed to detect dry and wet categories of a region for a specified time. Details on the list of drought indices corresponding with their variable requirement can be found in Svoboda et al. (2016).

Besides drought monitoring and assessment tools, several probabilistic and deterministic forecasting models have been developed and used to predict and forecast drought classes for various climatological regions. In recent decades, the rapid increase in the development of theories associated with the stochastic process is found for modelling many real-life uncertain phenomena. An example includes, stock market and exchange rate fluctuations; signals such as speech; audio and video; medical data such as a patient's EKG, EEG, blood pressure or temperature; and random movements, such as Brownian motion or random walks. Among several other stochastic models, the theory of Markov chains is a promising approach to dynamic model activities that have a stochastic factor (Lange, 2010). To model uncertain events, especially in the field of engineering (Takahashi et al., 2007), economics (Lee and Chen, 2006), and physics (Crommelin and Vanden-Eijnden, 2006), Markov chain models play a significant role in the prediction and forecasting of the probabilities associated with such events. Markov chain models can be useful for forecasting future drought classes due to their multifaceted nature to enumerate uncertainties connected with hydro-meteorological variables causing droughts.

Sen (1990) derived stationary second-order Markov chains for finite sample lengths for exact probability distribution functions (PDF) for three representative annual flow series from various regions of the globe. Lohani and Loganathan (1997) used non-homogeneous Markov chain models to characterise the random behaviour of droughts using the Palmer Drought Severity Index. Paulo and Pereira (2007) used Markov Chain models on the drought classes determined by the SPI drought index to characterise the stochasticity of drought and predict three months of drought class. Bacanli et al. (2009) used the SPI drought index based on an Adaptive Neuro-Fuzzy Inference System (ANFIS) forecasting drought. Ali et al. (2017b) used a multilayer perceptron model and SPEI drought index for drought forecasting in the Northern Area and KPK.

However, it is difficult to adjust the transition probability matrix and the precision of the forecast that is affected by objective factors. To overcome this problem, in many applications Weighted Markov Chain (WMC) method have been employed in several disciplines, including hydrology and environmental sciences (Benoit, 2005; Le-Tian, 2005; De-di and Chen, 2006; Peng et al., 2010; Kaliakatsos-Papakostas et al., 2011; Zhou et al., 2011; Chen and Yang, 2012; Gong et al., 2014; Gui and Shao, 2017). Chen and Yang (2012) proposed a drought prediction model for SPI with different time scales under the weighted Markov chain framework for Anhui Province of Huaihe River in China. Outcomes associated with this model show that the weighted Markov Chain method is a useful approach for drought prediction, and it can be helpful for decision-making in regional drought management. In recent years, we have proposed a new weighting scheme of WMC for ordinal data (see Ali et al., 2018). In this article, the Standardised Precipitation Temperature Index (SPTI) (Ali et al., 2017) has been used to evaluate the proposed scheme.

However, analysing droughts by using a single variable is not enough to distinguish different regions because drought hazards relate to multiple variables. A comprehensive analysis of the characterisation of drought classes is required that make a joint analysis of rainfall, runoff, and soil moisture conditions (Vicente-Serrano et al., 2005). The SPI is the simplest and commonly used drought index that is based on accumulated precipitation (McKee et al., 1993). Vicente-Serrano et al. (2010) developed a new multi-scaler drought index, the standardised precipitation evapotranspiration index (SPEI). The methodological structure of SPEI is like SPI. However, in SPEI, instead of only precipitation, the mean monthly temperature is also used in the estimation of drought classes.

The objective of this research is to handle all the difficulties in formulating a mathematical model for forecasting SPEI drought index at a one-month time scale under the WMC framework in various regions of Pakistan. We use autocorrelations from the historical series of SPEI drought index with a one-month time scale as a weight in first-order Markov chain transition probability matrices to forecast the next incidence of drought categories for seventeen meteorological stations located in the Northern Areas and KPK (Pakistan). The drought has become a recurrent phenomenon in the country. In the recent decade, due to severe drought hazards, the economic system of the country was severely disturbed. In recent decades, several authors had been working to explore the geographical and hydrological importance of this region. Awan (2002) and Archer and Fowler (2004) explored and inferred different climatic variables in terms of regression, spatial correlation, and temporal variation. Ahmad et al. (2012) evaluated the significance of these mountainous areas that have substantial potential in hydro-power production and water resources. Ali et al. (2017a) have compared the performance of SPTI with SPI and SPEI using time series data on precipitation and temperature of these stations.

In this research considered seventeen meteorological stations having different climatology and estimated historical time series data on SPEI drought index for a one-month time scale. Time series data on precipitation and temperature are used to estimate SPEI values for these stations.

The organisation of this paper is as follows. A brief description of the study area, the estimation method of SPEI, and the mathematical formulation of WMC method is presented in Section 1. Section 2 consists of temporal representations of SPEI drought classes and results associated with WMC based forecasting. Section 3 is based on the results of this paper. In contrast, discussion and conclusion have been presented in Sections 4 and 5, respectively.

2.

## Materials and methods

2.1.

### Study area and data

We consider seventeen meteorological stations located in different climatic regions of Northern Areas of Pakistan. These stations have high variability in rainfall throughout the season. Concerning the climatological statistics (see Table 1), the behaviour of all stations varies from zone to zone.

In each season, some of the stations are continuing to bear extremely vulnerable drought conditions. In the current research, twelve meteorological stations exhibiting cold and humid climate and five meteorological stations having mild cold and arid climate are included to check the efficiency of SPEI drought index from the global warming perspective. Primary data on monthly total rainfall, mean minimum temperature, and mean maximum temperature of each station for the period 1976–2017 were collected from the Karachi data processing center through Pakistan Meteorological Department (PMD), Islamabad. Figure 1 shows the geographical distribution of the study area located in different climatological zones.

Figure 1.

Study area: meteorological stations of Northern Area and KPK (Pakistan).

2.2.

### Standardised precipitation evapotranspiration Index-SPEI

There are several procedures to report drought severity using a multiscalar drought index (Ali et al., 2017a). McKee et al. (1993) developed an SPI drought index, which is based on long-term precipitation records to quantify precipitation scarcity for different time scales. One of the significant advantage of using the SPI index is that it can be used to monitor drought for various time scales. Vicente-Serrano et al. (2010) developed the standardised precipitation evapotranspiration index (SPEI): a multiscalar drought index. In SPEI, the water balance model based on the difference between precipitation and potential evapotranspiration (PET) is used with a similar estimation procedure of SPI. One significant advantage of SPEI over SPI is to include the effect of evaporation in rainfall data to characterise the regions under study. The method employed in SPEI drought index for the determination of drought classes is a little bit different from SPI. SPEI uses both temperature and precipitation for the characterisation of drought classes, whereas, in the SPI drought index, only monthly cumulative precipitation data is used. In SPEI drought index, monthly time series data on drought classes is obtained by standardising the distribution of the difference between precipitation and Potential Evapotranspiration (PET). Several methods are available in the literature for the estimation of PET. The choice of the method for the estimation of PET depends on the availability of data and the sensitivity of the PET values. In their original paper, Vicente-Serrano et al. (2010) had chosen the simplest approach to calculate PET by using Thornthwait (TH) equation (Thornthwaite 1948).

However, this method underestimates PET values at arid regions in cold climatic regions, whereas overestimates PET values in humid regions (Ali et al., 2017a). The Hargreaves method for the estimation of PET overcomes this issue. However, it needs additional climatic parameters (i.e. mean minimum and mean maximum temperature etc.) (Hargreaves, 1994). In this research, we use the Hargreaves method for the estimation of PET based on minimum mean and minimum-maximum temperature using SPEI package in R. Further detailed procedures on computation, temporal behaviour, and normality test on SPEI time series data can be found in Ali et al. (2019).

2.3.

### Markov chain

A stochastic process, or random process {Z = Z(t), t ε T} is a set of random variables indexed by time. That is, for all t in the index set T, Z(t) is a random variable. If the index set T is a countable set, we call Z(t) a discrete-time stochastic process. If T is a straight-set, we call Z(t) a continuous-time stochastic process. All possible values that Z(t) can assume, are called its state space (Chiang, 1968). A Markov chain is a stochastic process having the property that the value of the process at time t, Zt depends only on its previous value Zt−1 at time t − 1, but does not depend on the past values of the process (Haan, 1977).

2.3.1.

#### Drought conditions as a Markov chain process

A discrete Markov chain is a random process that describes a sequence of events from a set of finite possible states. In contrast, the current event depends only on the preceding event. It has been commonly used to model uncertain events in various disciplines. Each discrete Markov chain is characterised by a transition probability matrix that represents the probability of transition from one state to another. Shatanawi et al. (2013) reported that the exact prediction of Drought Index (DI) values is impossible with ARIMA models. However, early warning of drought can be detected from monthly Markov transition probabilities. Therefore, in drought modelling context, time-series data SPTI drought classes for a single station can be considered as a sequence of ordinal drought classes.

Consequently, a historical series of drought classification states for a specified station can be embodied as a discrete Markov chain process. Here, we assume that any single class of drought in time series of SPEI depends on its previous class and then proceed to the construction of the transition probability matrix. It is just statistical compliance that allows us to consider each drought class as a first-order Markov chain. However, one can also use a second-order Markov chain, where if each class is assumed to depend on its previous two classes.

2.3.2.

#### Configuration of weighted Markov chain (WMC) for observed drought classes

In this scenario, time-series data on drought classes determined by SPEI drought index can be assumed as a series of correlated random variables. Various empirical studies show that self-correlation coefficients in the historical data of drought classes for all the study regions have significant importance for the prediction of future drought classes. This confirms that previous drought classes (on a monthly or yearly basis) can be considered in advance to predict the present drought class. So, in our case, the basic idea behind using WMC is that weighted averages can be made according to the incidence behaviour in the past month. Hence, the probability of present or next drought classes can be inferred and predicted in advance by appropriate configuration of weights to each drought class in WMC framework.

The fundamental steps involved in the proposed method for prediction of drought classes using SPEI drought index under the WMC model are given below.

2.3.2.1.

#### Classification quantitative values of SPEI for transition probability matrix

Let D1, D2, …, Dn be the time series of drought classes. Where D can assume the nominal value droughts classes depending on the classification criteria of SPEI drought index (see Table 2). As in each drought class, we can find the transition probability matrix in the following form:

Drought Classes

$\begin{array}{ccc}{c}_{1\mathrm{}}& {c}_{2}\mathrm{}\dots & \mathrm{}{c}_{n}\end{array}$
$\begin{array}{c}{c}_{1}\\ {c}_{2}\\ \begin{array}{c}⋮\\ {c}_{n}\end{array}\end{array}\mathrm{}\left(\begin{array}{ccc}{p}_{11}& {p}_{12}\mathrm{}\cdots & {p}_{1n}\\ {p}_{21}& {p}_{22}\mathrm{}\cdots & {p}_{2n}\\ \begin{array}{c}⋮\\ {p}_{n1}\end{array}& \begin{array}{cc}⋮& \ddots \\ {p}_{n2}& \dots \end{array}& \begin{array}{c}⋮\\ {p}_{nn}\end{array}\end{array}\right)$
where, c1, c2,…, cn are the drought classes, i.e. too wet, very wet,…, extremely dry.

In this step, we classify SPEI drought index estimated with a one-month time scale according to the classification criteria provided in Table 2. Descriptions on the classification states for drought classes determined by SPEI drought index were shown in Table 2.

Transition probability matrices for each station are computed using the Markovchain package to predict the future drought classes (Spedicato et al., 2016) package of R.

2.3.2.2.

#### Construction of the transition probability matrix

Let ${Y}_{ij}^{\left(t\right)}$ the number of transitions from the state Si to state Sj through t steps in time series length of drought classes Xk calculated from a one-month time scale. Here, the transition probabilities for various time steps and various drought classes are computed using the following equation.

((6))
${P}_{ij}^{\left(t\right)}=\frac{{Y}_{ij}^{\left(t\right)}}{{Y}_{i}},i,\mathrm{}j=1,2,\dots \dots \mathrm{}m$
where ${Y}_{i}$ is the total number of individual drought classes, t represents the order of Markov chain. Further, the transition probability matrix for various drought classes can be obtained as;
((7))
${P}^{\left(t\right)}=\left[\begin{array}{cccc}{p}_{11}^{\left(t\right)}& {p}_{12}^{\left(t\right)}& \begin{array}{cc}.& .\end{array}& {p}_{1m}^{\left(t\right)}\\ {p}_{21}^{\left(t\right)}& {p}_{22}^{\left(t\right)}& \begin{array}{cc}.& .\end{array}& {p}_{2m}^{\left(t\right)}\\ \begin{array}{c}.\\ .\end{array}& \begin{array}{c}.\\ .\end{array}& \begin{array}{c}\begin{array}{cc}.& .\end{array}\\ \begin{array}{cc}.& .\end{array}\end{array}& \begin{array}{c}.\\ .\end{array}\\ {p}_{m1}^{\left(t\right)}& {p}_{m2}^{\left(t\right)}& \begin{array}{cc}.& .\end{array}& {p}_{mm}^{\left(t\right)}\end{array}\right]$

Transition probability matrices for Astore and Balakot stations from the time series data on drought classes determined by SPEI drought index are shown in Table 3.

2.3.2.3.

#### Computation weights using autocorrelations

The weights (${w}_{i}$) for the weighted Markov chain model can be computed by standardising the self-correlation coefficient (${r}_{i}$). The formula for weights (${w}_{i}$) and self-correlation coefficient (${r}_{i}$) are provided in Eqs. (8) and (9), respectively.

((8))
$w\mathrm{}=\frac{|{r}_{p}|}{{\sum }_{p=1}^{m}|{r}_{p}|}$
((9))
$r\mathrm{}=\frac{{\sum }_{p=1}^{n-p}\left({Z}_{\left(p\right)}-\overline{Z}\right)\left({Z}_{\left(p+1\right)}-\overline{Z}\right)}{{\sum }_{p=1}^{n}\left({Z}_{\left(p\right)}-\overline{Z}{\right)}^{2}}$

2.3.2.4.

#### Prediction of the probability of the occurrence of a drought class using weighted Markov chain

In this step, we assume the occurrence of drought classification states in the very last month as an initial drought class Ψi and combine it with the row vectors of their corresponding transition probability matrix. Here, we assumed that the inaccurate drought class is probable due to the estimation of SPEI time series data. Hence, weighting the drought classes according to Eq. (10), we arrive at the prediction probabilities Pi for the next drought classes.

((10))
${P}_{i}=\sum _{i=1}^{m}{w}_{i}{P}_{ij}^{\left(t\right)}$

In the above equation ${P}_{i}$ are the weighted probabilities for incoming drought events. However, the predicted drought class is the drought class having max {Pi, i ε S} under the weighted Markov Chain method.

3.

## Results

For exploratory analysis, the percentage of occurrences of various drought classes in the historical time series data on SPEI-1 with a one-month time scale of the selected study regions are presented in Table 4. A high proportion of near-normal drought classes were found in most of the stations. Moreover, the test of equality of proportions, as suggested in Marden (1996), is performed using easyanova Arnhold (2014) package in R.

Additionally, multiple comparison tests show that the near-normal drought class has a significantly high proportion as compared to other drought classes. However, the severely dry drought class has a significant difference from the severely wet class. Besides, the proportion of the moderate dry class is significantly greater than for moderate wet.

Here, transition probability matrices are used for further prediction of future drought classes using equation 10. Weights from step one to step five associated with each station are shown in Table 5. These weights are calculated by normalising autocorrelation coefficients (see Eq. (7)). Table 6 shows the predicted probabilities for each drought class in December 2017 for Astore station. For the near-normal drought class, the predicted probability of this class using the weighted Markov chain is 0.712, which is very high concerning other drought classes. The actual drought condition is also near to normal, which indicates that the prediction is correct, and the chance of errors is very low. On the same line, the interpretation can be made for the rest of the months and stations as well. One month ahead prediction probabilities of the drought classes for the rest of the stations are shown in Table 7. Except for Drosh and Peshawar, the probabilities of near-normal drought categories are very high in all the stations. It is due to the high frequency of near-normal drought classes in the qualitative vectors of drought classes in all stations.

4.

## Discussion

Results associated with this test show that there is a significant difference in the proportion of each drought class. These results are consistent from the ecological and climatological point of view since, in these areas, most of the stations bear longer Moonsoon and precipitation periods. In most stations, near-normal drought class found a high probability of occurrences. However, the moderate wet drought class has a low probability as compared to a severely dry drought class. However, the near-normal drought class has a high probability in most of the study region. However, predicted probabilities for each drought class in Peshawar, Bunji, and Drosh have somehow different behaviour. In Peshawar, drought classes have almost equal probabilities except for the Severely dry class. Bunji shows a 0.533 probability of occurrence for the severely dry class. In Drosh station, almost equivalent probabilities are found for near normal and moderate wet.

5.

## Conclusion

Prediction and forecasting play a vital role, especially in early warning situations. Consequently, accurate and precise techniques of drought forecasting may reduce their severe effect by making effective drought mitigation policies. In this article, the SPEI-1 drought index being a more comprehensive drought monitoring procedure is used to classify historical monthly drought profile for seventeen meteorological stations of Pakistan. To predict the future drought classes, the standardised self-correlation coefficient in the time series data of SPEI-1 index is used as weights in WMC method (see Table 5). Outcomes associated with the WMC method for prediction of drought classes show that this forecasting method is flexible enough to incorporate change of drought condition, just by changing the transition probability matrix and the autocorrelation structure. It is observed that the probability of the near-normal drought class is higher in most stations of the study area (see Table 7). Further, in a global warming context, the situation of a significant increase in trend from wet drought classes to dry drought classes should be alarming for water resources and management authorities.

The limitation of the study includes:

1. The current study is based on SPEI index. Other indices such as SPI and SPEI can be incorporated. Further, their comparative assessment should be made for better understanding.
2. In this manuscript, we have used quantitative time-series data. Consequently, our weights are based on autocorrelation. In the future, the qualitative time series of drought can be used with our novel weighting scheme for WMC (see Ali et al. 2018).