## Introduction

Drought is a complex natural phenomenon that arises primarily from a lack of precipitation over a period of time, and its occurrence usually imposes great costs on many parts of nature and society. Drought is usually understood as a prolonged lack of precipitation. A widely used definition based on precipitation quantity and duration is that a drought occurs when a location receives less precipitation (rain or snow) than normal over a couple of months or considerably longer; alternatively, drought is a situation of adversity caused by insufficient water due to unusual meteorological conditions. Drought can be categorized into distinct types: meteorological, hydrological, agricultural and socio-economic. Meteorological drought is defined in terms of precipitation, assessing dryness and the length of the dry period for a specific region, where average precipitation may vary spatially. Hydrological drought occurs when stream flow, reservoir storage, soil moisture, groundwater recharge, and lake levels are affected by a decrease in precipitation. Agricultural drought occurs when there is not sufficient moisture to maintain average crop production over a specific land area. Socio-economic drought relates to the supply and demand of some economic goods and combines elements of hydrological, meteorological, and agricultural drought. Climate and weather conditions in Pakistan differ across its large territory, primarily because of variation in altitude, longitude and latitude, wind flows and distance from the sea. A number of climate classification schemes have been introduced to account for spatial variability. One of the best known is the Köppen-Geiger scheme (Kottek and Rubel, **2007**; Peel et al., **2007**), which separates climate into five classes on the basis of precipitation and temperature: polar and alpine, continental, mild, dry and tropical.

Since the late 20th century, a trend of global warming has been observed around the world (Hansen et al., **2010**; Rohde et al., **2013**). Numerous climate models project that the trend will continue under most emission scenarios and concentration pathways, with a rise in surface temperature of between 0.7 and 2.4 °C by the year 2050 relative to the reference period 1986–2005 (Collins et al., **2013**). Field et al. (**2012**) suggest that the rise in overall temperature might cause unprecedented climate disasters (such as droughts and floods). A few severe droughts in the early 21st century support this presumption, for example the 2010 drought in Australia (Cai et al., **2014**), the 2011 drought in southern China (Sun and Yang, **2012**) and the 2011–2012 drought in the USA (Seneviratne et al., **2017**).

Several further environmental factors are involved in the occurrence of drought, such as temperature, wind flows, relative humidity, and the intensity, duration and severity of rain (Wilhite et al., 1994). However, long-term precipitation and temperature play the leading role in methods for calculating drought indices (Coffel and Horton, **2015**). In recent decades, the frequency, intensity and affected area of drought have increased significantly, which has drawn the attention of many researchers. This increase has been caused mainly by climate change and human activities.

Not only global precipitation and temperature but also their regional distribution are highly important for accurate monitoring of climate change and natural disasters. Furthermore, documenting rainfall and its regional distribution is one of the main responsibilities of the governmental meteorological department. Therefore, it is essential to track drought conditions using regional statistics and data. Moreover, accurate regional precipitation estimates are indispensable for a wide range of research fields, such as hydrology and meteorology. Understanding drought characteristics at the regional level is necessary for mitigating drought risks, moderating potential impacts on various socio-economic sectors and adopting proper measures and strategies (Hirabayashi et al., **2008**; Svoboda et al., 2016).

The spatial distributions of precipitation and temperature are among the major watershed factors and play a significant role in advanced hydrological research. Consequently, taking into account the intricacy of the temporal structure of the regional climate, several authors, including Coles and Tawn (**1991**), Guler et al. (**2007**) and Mahdian et al. (**2009**), constructed two methods based on geospatial tools and advanced statistical models. Yet those techniques are built on temporal data collected from single stations, which means that they cover only a single realization of a continuous spatial domain. This deprives their findings of the effect of spatial prevalence in climate. Additionally, the increased uncertainty of the predictions might have negative consequences for climate-change policies and for the reliability of environmental, climate and weather forecasts. Therefore, incorporating regional rainfall records combined with temperature at the input stage might contribute substantially to the accuracy, efficiency, and reliability of drought mitigation policies. Regional rainfall may be defined as ‘the average of all stations’ rainfall (monthly/annually) in a region’.

However, contemporary advances in estimation methods that use auxiliary variables are available in sampling theory and environmental statistics. Cochran (**2007**) presents comprehensive theory and methods for using auxiliary data to improve the assessment of unknown features of unsystematic variables. Several other researchers also incorporate supplementary data to enhance the estimation of a study variable using statistical methods based on auxiliary information, such as regression and kriging (Zhu and Lin, **2010**). Paloscia et al. (**2013**) work with Australian Landsat images (path = 92/row = 84) that are similar to ENVISAT data in order to observe the effects of auxiliary information on vegetation. Apaydin et al. (**2011**) use altitude as auxiliary data connected to climatic inputs (precipitation and temperature) for interpolation under co-kriging settings.

Keeping in mind the significance of auxiliary information in an estimation procedure, the aim of this research article is to incorporate regional temperature as auxiliary information in the estimation process. By regional temperature we mean the average of all stations’ monthly mean temperature in a region. We propose a new drought index whose framework uses regional temperature as an auxiliary variable to improve the accuracy and reliability of average precipitation estimates. The calculated precipitation values are then used to obtain Standardized Drought Index (SDI) values (Ali et al., **2019**). Temperature and rainfall can be perceived as global representatives of a specific natural, homogeneous, optimized regional catchment area. Thus, using regional temperature as auxiliary information might improve the validity of precipitation data.

Although the hydrological characteristics of droughts have been examined more broadly than their ecological and socio-economic aspects, a deeper understanding of the hydrological aspects of drought is still necessary to assess its potential effects more precisely and to design and implement adequate and effective mitigation measures. In this regard, further research on drought conditions that takes into account hydrological and regional disparity factors seems crucial. Indeed, the regional spatial variability of drought affects the definition of drought itself. As Yevjevich (**1967**) indicates, the term ‘drought’ can be used for a severe long-term water deficit over a large area, whereas expressions such as ‘water shortage’ or ‘deficit’ describe occasions with less harmful impacts. Ali et al. (**2019**) proposed the Locally Weighted Standardized Drought Index (LWSDI). They used auxiliary information as a local weight to improve the monthly precipitation record, with the regression estimator as the tool for incorporating auxiliary information into the study variable. One limitation is that they did not cover the scenario of negative correlation between actual rainfall and average temperature (the auxiliary variable). The other is that they improved the records only at the specified stations rather than for all regions of Pakistan. Our study is designed to address both limitations.

Moreover, the growing vulnerability and dependence of high-quality agricultural production on the availability of water, together with the development of large-scale multi-purpose water-supply systems, indicate that analysing drought at a single station is insufficient and that a regional scale would be more beneficial. Lastly, the need to work with the whole array of existing regional data from measuring stations has arisen, and such data can similarly be used in other kinds of hydrological research (e.g. flood analysis). Historically, almost all Pakistani provinces have faced repeated droughts: Khyber Pakhtunkhwa (KPK) in 1902 and 1951; Punjab province in 1899, 1920 and 1935; and Sindh province in 1871, 1881, 1899, 1931, 1947 and 1999 (Drought Bulletin of Pakistan, **2015**). Therefore, it is necessary to assess the regional impact of drought on a particular area in order to better understand the situation and subsequently adopt appropriate measures.

There are numerous estimation procedures, such as product, regression and ratio estimators, which integrate complementary information on single or multiple auxiliary variables. A detailed overview, as well as the mathematical structure of regression, product and ratio estimators, can be found in Cochran (**2007**). In each method, the validity of the proposed estimator depends on how the auxiliary variables are incorporated. For instance, when there is a negative correlation between the study variable and the auxiliary variable, the product estimator provides precise estimates of the population characteristics. Conversely, when the correlation is positive and perfect, the ratio estimator is useful. The earliest applications of auxiliary information date to 1973 (Tarima and Pavlov, **2006**), when Pugachev (**1973**) incorporated auxiliary information using correlation effects. In many surveys and record-keeping modules, collecting some extra information related to the study variables is common practice. Ali et al. (**2019**) proposed a new method for the assessment of drought, the Locally Weighted Standardized Drought Index (LWSDI). They applied LWSDI to ten stations in different regions of Pakistan, using different probability distributions to calculate it. For each station, the CDF of the distribution with the smallest BIC value was selected for standardization.

This research article suggests a new way to characterize annual drought conditions, in which precipitation data adjusted by the auxiliary variable are incorporated in the Standardized Drought Index (SDI) procedure. We recommend adding a regional average of monthly temperature data as weights in order to improve the precipitation estimates under regression and product estimation settings. In this way, we are able to reduce the sampling defects in the estimated rainfall records and to take account of the global warming effect while monitoring drought conditions.

This paper is organized as follows. Section 2 presents the materials and methods, including an overview of the data and study area and the methodology, which gives a comprehensive exposition of the theoretical details behind product and regression estimation, SPI, and RIWSDI. We also present the mathematical formulas that employ average temperature as auxiliary information under regional settings, and we propose a drought index employing improved precipitation estimates under parametric and non-parametric approaches. Section 3 provides the results and a detailed discussion of how we improved the precipitation estimates and used them in the SDI procedure; temporal plots showing the difference between simple and improved records are also presented. Finally, Section 4 gives a conclusion on the proposed method.

## Materials and methods

### Data and study area

Pakistan is located in Southern Asia, at the junction of the Middle East and Central Asia, between 23°–37° N latitude and 61°–78° E longitude (Ahmed et al., **2018**). In its north it hosts the junction of three of the world’s famous mountain ranges: the Karakoram, Hindukush and Himalayas. Altitude in Pakistan ranges from 0 to 8611 m. Mean temperature and precipitation data are the climatic inputs in the current study. The highest precipitation (1038.6 mm) was recorded in Islamabad in July 2001, and minimum precipitation records of 0 mm were observed in several regions during the study period. Climate change has both anthropogenic and natural causes, but the former are dominant, with a constantly increasing trend since 1940 (the industrial revolution) (Anwar, **2011**). Pakistan shares borders with four countries: Afghanistan and Iran to the west, India to the east, and China to the north; the Arabian Sea lies to the south. According to the 2016 census, Pakistan has a total population of more than 200 million, most of whom are involved in the agricultural sector either directly or indirectly.

This research covers seven meteorological regions containing 50 stations in total, situated in different climatic regions of Pakistan. Figure 1 shows the chosen meteorological regions. The detailed clustering of the regions of Pakistan can be found in Hussain et al. (**2011**). The map was generated with a Geographic Information System (GIS). The proposed method requires long-term monthly time series of precipitation as well as of maximum and minimum temperatures. Consequently, data ranging from January 1967 to December 2016 were obtained from the Karachi Data Processing Centre (KDPC) via the Pakistan Meteorological Department (PMD). The regions show high variability in rainfall and temperature across seasons. Table 1 presents the statistics of the selected stations for the 50 years (1967–2016).

### Methodology

In order to compute and compare drought indices based on RIWSDI, secondary time series data of average temperature and monthly total precipitation (1967–2016) are used.

In the current study, we evaluate 32 probability distributions using the R package propagate (Tellinghuisen and Spiess, **2014**). Various goodness-of-fit criteria, e.g. the Anderson-Darling, Kolmogorov-Smirnov and Chi-square tests, were used to identify the candidate distribution for each individual indicator. For standardization in each study region and at all stations included in the regions, the CDF of the distribution with the minimum BIC value is then selected. To present results in tables, some regions were chosen at random, namely Regions 1, 3, 6 and 7. In this research, we address the problem of updating and improving regional precipitation estimates. The step-by-step procedure is shown in the flow chart in Fig. 2. Here, average temperature is proposed as auxiliary information. Various earlier studies indicate a positive correlation between rainfall and temperature. Zhao and Khalil (**1993**) examined the relationship between precipitation and temperature for eight regions including the USA. Their study indicates a positive correlation between these variables in all seasons. At the Guliya ice core, detailed analyses of the precipitation index (glacier accumulation) and the temperature proxy recorded since 300 years BP show that precipitation correlates with temperature in this region (Yang et al., **2006**). Rajeevan et al. (**1998**) found that temperature and rainfall were positively correlated during January and May but negatively correlated during July. Sneva (**1977**) found a positive month-wise correlation between temperature and rainfall in southeastern Oregon.

Temperature is a globally representative environmental variable that is homogeneous in nature and has a strong association with precipitation. Thus, the use of temperature as auxiliary data is logically valid. Here the auxiliary variable is the mean monthly temperature. The following equations were used to incorporate auxiliary information,

In Eqs. (1) and (2), ${\overline{y}}_{r}$ and ${\overline{y}}_{p}$ are the updated regression and product means of the study variable, respectively, ${\overline{X}}_{j}$ is the overall mean of the auxiliary variable at the $j\text{th}$ station, and ${\overline{x}}_{i}$ is the sample mean of the auxiliary variable in the $i\text{th}$ month. In Eq. (1), ${b}_{1}$ is the regression slope between the study variable and the auxiliary variable. Given the theoretical support for both positive and negative correlations between precipitation and average temperature, this study proposes average temperature as auxiliary information to improve annual meteorological records of precipitation. So, before defining drought characteristics and precipitation deficits, we use the regression and product estimators to improve the annual estimates of precipitation with average temperature as auxiliary information. The mathematical structures of the regression and product estimators employing monthly mean temperature as auxiliary information for the estimation of the total monthly amount of precipitation are as follows,

Here, ${b}_{1}$ is the regression slope between the study variable (precipitation) and the auxiliary variable (average temperature).
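As a minimal sketch of how such an adjustment works, the block below assumes the textbook forms of the two estimators given in Cochran (2007), namely $\overline{y}_{r}=\overline{y}+b_{1}(\overline{X}-\overline{x})$ for the regression estimator and $\overline{y}_{p}=\overline{y}\,\overline{x}/\overline{X}$ for the product estimator. The function names, the toy precipitation and temperature values, and the long-term mean `X_bar` are hypothetical and serve only as an illustration; they are not the study's data or code.

```python
from statistics import mean


def regression_slope(y, x):
    """Least-squares slope b1 of the study variable y on the auxiliary variable x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def regression_estimate(y_bar, x_bar, X_bar, b1):
    """Textbook regression estimator: y_bar + b1 * (X_bar - x_bar)."""
    return y_bar + b1 * (X_bar - x_bar)


def product_estimate(y_bar, x_bar, X_bar):
    """Textbook product estimator: y_bar * x_bar / X_bar."""
    return y_bar * x_bar / X_bar


# Hypothetical toy sample: precipitation (mm) and mean temperature (deg C).
precip = [10.0, 20.0, 30.0, 40.0]
temp = [12.0, 14.0, 16.0, 18.0]
X_bar = 16.0  # assumed long-term mean of the auxiliary variable

b1 = regression_slope(precip, temp)
y_r = regression_estimate(mean(precip), mean(temp), X_bar, b1)
y_p = product_estimate(mean(precip), mean(temp), X_bar)
print(b1, y_r, y_p)
```

The regression form shifts the sample mean in proportion to how far the sample temperature lies from its long-term mean, whereas the product form rescales it, which is why the latter is usually preferred when the two variables are negatively correlated.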

#### Comparative statistics and quality measures

In this study, we use the well-known Pearson product-moment correlation coefficient *r*, commonly called the correlation coefficient, to compare the outcomes of our proposed index with those of existing indices. The correlation coefficient *r* is the most widely used test statistic and measures the collinearity between two series. The formula for *r* is given in Eq. (5),

$$r=\frac{\sum_{i=1}^{n}\left({x}_{i}-\overline{x}\right)\left({y}_{i}-\overline{y}\right)}{\sqrt{\sum_{i=1}^{n}{\left({x}_{i}-\overline{x}\right)}^{2}}\sqrt{\sum_{i=1}^{n}{\left({y}_{i}-\overline{y}\right)}^{2}}} \tag{5}$$

where *x* and *y* represent the two series with *n* elements, and $\overline{x}$ and $\overline{y}$ represent the mean values of the two series. The values of *r* lie in the range –1 to 1. Positive values close to 1 indicate a strong positive correlation between the two series, whereas negative values indicate an inverse correlation.
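The comparison statistic can be illustrated with a short sketch that assumes the usual definition of the Pearson product-moment coefficient. The function name `pearson_r` and the two short series are hypothetical stand-ins for a pair of index series such as SPI and RIWSDI.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical index values for the same months from two drought indices.
index_a = [-1.2, -0.4, 0.1, 0.8, 1.5]
index_b = [-1.0, -0.5, 0.3, 0.6, 1.4]
print(pearson_r(index_a, index_b))
```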

#### Methodology for standardized precipitation index (SPI)

The SPI is a common indicator of drought that does not require information about land surface conditions and needs only precipitation data to compute drought properties. According to McKee et al. (**1993**) and Wu et al. (**2007**), the SPI can be calculated in a given year ‘*o*’, a calendar month ‘*p*’ and for time scale *q*, by the following steps;

- The long-term record of precipitation is fitted to a probability distribution, which is then transformed into a normal distribution.
- The 1st step is the computation of the cumulative precipitation data, ${X}_{op}^{q}\ (o=1,2,3,\dots ,n)$, for a period of interest $p$.
- The 2nd step is to fit a cumulative probability distribution (commonly the gamma distribution function), although in the current study different distributions have been applied. The PDF of the gamma distribution is defined as

  $$f\left(x\right)=\frac{1}{{\beta}^{\alpha}\Gamma \left(\alpha \right)}{x}^{\alpha -1}{e}^{-\frac{x}{\beta}},\quad \text{for } x\ge 0 \tag{6}$$

- The cumulative probability distribution for the particular time scale and given month of the observed precipitation event is

  $$F\left(x\right)={\int}_{0}^{x}f\left(x\right)dx={\int}_{0}^{x}\frac{1}{{\beta}^{\alpha}\Gamma \left(\alpha \right)}{x}^{\alpha -1}{e}^{-\frac{x}{\beta}}dx \tag{7}$$

- The cumulative probability of each observed precipitation event ${x}_{i}$ can be derived by Eq. (7). An equiprobability transformation is then made from the cumulative probability to the standard normal random variable $Z$ with zero mean and unit variance, where the SPI takes on the value of $Z$:

  $$\mathit{SPI}={\phi}^{-1}\left[F\left(X\right)\right] \tag{8}$$

- The distribution of precipitation may contain zeros. For instance, taking a special case of the Gamma distribution, suppose the probability of zero values in a time series of ${\mathit{IWP}}_{(r,p)}$ is denoted by $q$. Let $m$ be the number of zero values and $n$ the total number of observations in the ${\mathit{IWP}}_{(r,p)}$ time series; then $q$ can be estimated by the ratio $m/n$, and

  $$H\left(x\right)=q+\left(1-q\right)F\left(x\right) \tag{9}$$

- As precipitation is not normally distributed, an equiprobability transformation is made from the CDF of the mixed distribution to the CDF of a standard normal distribution with zero mean and unit variance, which is given by

  $$\mathit{SPI}={\phi}^{-1}\left[H\left(X\right)\right] \tag{10}$$

- Because the SPI is normalized, wetter and drier climates can be represented in the same way, and wet periods can also be monitored using the SPI.
- The 3rd step is to show the adequacy of the selected distribution, using some numerical or graphical methods.
- The 4th step is to verify the normality of SPI using numerical or graphical techniques.
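The steps above can be sketched in a few lines, assuming a two-parameter gamma distribution and the mixed CDF $H(x)=q+(1-q)F(x)$ for zero precipitation. This is only an illustration under stated assumptions: the gamma parameters are fitted by the method of moments for brevity (the study itself selects among many distributions and uses other fitting procedures), the gamma CDF is evaluated with a simple series expansion, probabilities are clipped away from 0 and 1 before inversion, and the function names and sample values are hypothetical.

```python
from math import exp, lgamma, log
from statistics import NormalDist, mean, pvariance


def gamma_cdf(x, alpha, beta, tol=1e-12, max_terms=10_000):
    """Gamma CDF F(x) with shape alpha and scale beta, via a series expansion
    of the regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    z = x / beta
    term = 1.0 / alpha
    total = term
    for k in range(1, max_terms):
        term *= z / (alpha + k)
        total += term
        if term < tol * total:
            break
    return min(1.0, total * exp(-z + alpha * log(z) - lgamma(alpha)))


def spi(values):
    """Standardized index for a series of accumulated precipitation totals."""
    nonzero = [v for v in values if v > 0]
    q = 1 - len(nonzero) / len(values)      # estimated probability of zero (m/n)
    m, v = mean(nonzero), pvariance(nonzero)
    alpha, beta = m * m / v, v / m          # method-of-moments fit (assumption)
    std_normal = NormalDist()
    out = []
    for x in values:
        h = q + (1 - q) * gamma_cdf(x, alpha, beta)   # mixed CDF H(x)
        h = min(max(h, 1e-6), 1 - 1e-6)               # keep inverse CDF finite
        out.append(std_normal.inv_cdf(h))             # equiprobability transform
    return out


# Hypothetical accumulated precipitation totals (mm), including dry months.
monthly_totals = [0.0, 5.0, 12.0, 20.0, 33.0, 8.0, 0.0, 15.0]
z_scores = spi(monthly_totals)
print([round(z, 2) for z in z_scores])
```

Zero totals all map to the same index value, determined by the estimated proportion of zeros, and larger totals map to larger index values.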

### Proposing a hydrological drought index: regionally improved weighted standardized drought index (RIWSDI)

In this study, we use the IWP(*r*,*p*) estimates in place of the simple precipitation series in the SDI technique. The IWP(*r*,*p*) estimates are more regionally representative because they incorporate the direct effect of maximum and minimum temperatures at the estimation stage. Consequently, using IWP(*r*,*p*) estimates in place of simple precipitation records is logically sound. To obtain the SDI, we follow the guidance of Stagge et al. (**2015**) on the parametric standardization approach: the current study uses the probability distributions that fit the given time series of IWP(*r*,*p*) estimates well. Furthermore, this research uses different probability plotting (PP) methods to assess the coherence and validity of the IWP(*r*,*p*) estimates in the non-parametric approach. Brief descriptions of both methods follow.

#### Parametric mechanism: selection of optimal probability distributions

In the parametric approach, we compile a list of 32 candidate distributions from the commonly available probability distributions. The study favours multi-parameter distributions; for instance, in place of the two-parameter Weibull and Gamma distributions, goodness of fit is assessed for a four-parameter Weibull and a three-parameter Gamma distribution, respectively. The lowest value of the Bayesian Information Criterion (BIC) (Schwarz, **1978**) indicates the optimal probability distribution.

In the experimental and computational analysis, we use the R package propagate (Tellinghuisen and Spiess, **2014**) to identify the optimum probability distribution by BIC for SPI-3, SPI-12, RIWSDI-3 and RIWSDI-12. The study considers 32 parametric distributions, for example the Gumbel, generalized extreme value, and generalized normal distributions. Chi-square, Anderson-Darling, and Shapiro-Wilk tests are used in nominating the optimal distributions. A Levenberg-Marquardt algorithm from the R package minpack.lm (Elzhov et al., **2010**) is used to estimate the parameters of each chosen distribution. The Cumulative Distribution Function (CDF) of the selected optimal distribution is then converted using the method in Eq. (9).

In hydrological research, particularly when a new drought index is proposed and assessed against others, *r* is the most frequently used statistic (Tsakiris and Vangelis, **2005**; Ali et al., **2017**). The variety of existing drought indices makes it necessary to choose the optimum and most pertinent ones. In previous research, numerous authors proposed new hydrological drought indices, including Cumbie-Ward and Boyles (**2016**), Jain et al. (**2015**), Naumann et al. (**2014**) and Ye et al. (**2016**), and compared them with the well-known Standardized Precipitation Index (SPI) (McKee et al., **1993**). RIWSDI is based mainly on a long-term rainfall (precipitation) series with average temperature as the auxiliary variable, and is used to highlight precipitation deficits at various time scales (1, 3, 6, 9, 12, 24, 48) at a chosen station. The RIWSDI method is based on standardizing the CDF of the distribution chosen from the 32 candidate distributions. Negative and positive values of RIWSDI indicate precipitation below and above the median, respectively.

For defining hydrological drought, RIWSDI at the 3- and 12-month time scales uses three- and twelve-month averages, respectively, of the monthly precipitation records and of the auxiliary variable. Therefore, a comparative analysis of RIWSDI with SPI at the 3- and 12-month time scales is considered. Habibi et al. (**2018**) stated that, in most research on drought monitoring and hydrology, SPI-3 and SPI-12 are the most extensively used, efficient and effective indicators of hydrological drought. Furthermore, drought characterization at the 3- and 12-month time scales gives an overview of seasonal drought and of the overall behaviour of regional hydrological conditions (Gumus and Algin, **2017**).
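The time-scale aggregation can be sketched as a trailing moving average over the monthly series. The trailing alignment (each value summarizes the current month and the preceding months) is an assumption, since the text does not state how the window is positioned, and the function name and sample values are hypothetical.

```python
def rolling_mean(series, window):
    """Trailing moving average; the first window - 1 entries are None."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    out = [None] * (window - 1)
    running = sum(series[:window])
    out.append(running / window)
    for i in range(window, len(series)):
        # Slide the window forward by one month.
        running += series[i] - series[i - window]
        out.append(running / window)
    return out


# Hypothetical monthly precipitation totals (mm) for one year.
monthly = [12.0, 0.0, 30.0, 45.0, 9.0, 3.0, 60.0, 80.0, 22.0, 5.0, 0.0, 10.0]
three_month = rolling_mean(monthly, 3)    # input series for a 3-month index
twelve_month = rolling_mean(monthly, 12)  # input series for a 12-month index
print(three_month)
print(twelve_month)
```

The same function can be applied to the temperature series so that the study variable and the auxiliary variable are aggregated over identical windows.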

Computational and experimental analysis has shown that the Gamma distribution is appropriate for modelling precipitation records across all accumulation periods and regions within Europe (Stagge et al., **2015**). Because that study concerns Europe, a suitable PDF must be selected from the listed probability functions for the regions under study. In the current study, we follow the guidelines of Stagge et al. (**2015**) in the estimation process. Hence, a general expression of the RIWSDI index can be written as follows,

where ${\mathit{IWP}}_{(r,p)}$ is the monthly cumulative total obtained using either the regression- or the product-type estimator for the improvement of rainfall data, and PDF denotes the optimum probability function with $n$ parameters. The values of RIWSDI can be estimated by normalizing the CDF of the selected PDF, which is fitted to the time series of improved monthly cumulative precipitation.

In Eq. (9), a small amendment is made to the CDF to adjust for the adverse effect of non-precise values in the series. Furthermore, following McKee et al. (**1993**) and Ali et al. (**2017**), the quantitative values of RIWSDI are classified according to the severity of drought. The drought classification for SPI and RIWSDI is given in Table 2.

#### Non-parametric mechanism: incorporation of graphical methods

Every probabilistic model carries some ambiguity in precise and accurate estimation (Parker, **2014**). Furthermore, the selection of a probability distribution for each indicator is inherently subjective. To avoid these problems in non-parametric drought monitoring, Hao and AghaKouchak (**2014**) proposed using the plotting position formula (PP-formula) of Gringorten (**1963**) as an alternative to the Gamma distribution for obtaining the SDI. The rationale for the graphical technique is to capture extreme events and thereby reduce errors in the estimation of the drought index. Farahmand and AghaKouchak (**2015**), Ghamghami et al. (**2017**) and Zhang et al. (**2018**) also used a non-parametric approach for drought monitoring. Because the behaviour of data varies from place to place, it is not enough to use only one probability plotting position. Stagge et al. (**2015**) found that the Gamma distribution alone is not enough to capture the behaviour of different climatic regions when deriving drought indices. Hence, the use of different PP-formulas is necessary for evaluating the different behaviours of various probability distributions (Cunnane, **1978**; Vogel, **1986**; Shukri Yah et al., 2012). In the current study, rather than using the Gringorten PP-formula, we use six other well-known non-parametric PP-formulas for the computation of RIWSDI.
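A non-parametric index of this kind can be sketched as follows: each value is ranked, the rank is converted to an empirical probability with a plotting position formula of the form $(i-a)/(n+b)$, and the probability is mapped to a standard normal quantile. The four formulas listed (Gringorten, Weibull, Cunnane, Hazen) are common textbook choices included as an assumption for illustration; they are not necessarily the six formulas used in this study, and the function name and sample values are hypothetical.

```python
from statistics import NormalDist

# Plotting positions of the form (i - a) / (n + b), keyed by name -> (a, b).
PLOTTING_POSITIONS = {
    "gringorten": (0.44, 0.12),
    "weibull": (0.0, 1.0),
    "cunnane": (0.4, 0.2),
    "hazen": (0.5, 0.0),
}


def empirical_index(values, formula="gringorten"):
    """Standardized index from ranks; tied values keep their input order."""
    a, b = PLOTTING_POSITIONS[formula]
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - a) / (n + b)) for r in ranks]


# Hypothetical accumulated precipitation totals (mm).
totals = [33.0, 5.0, 12.0, 20.0, 8.0]
for name in PLOTTING_POSITIONS:
    print(name, [round(z, 2) for z in empirical_index(totals, name)])
```

Comparing the output across formulas shows that they differ mainly in how far the smallest and largest observations are pushed into the tails.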

## Results and discussion

### Basic statistics, deviations, and temporal behaviour

The statistics of each region’s stations are presented in Table 1, which contains precipitation and temperature information for all stations in the different regions, along with their coordinates and altitude (m). The mean, minimum and maximum precipitation and temperature can be read from this table. For example, in C1 (Region-1), Astor station has a mean, minimum and maximum annual precipitation of 39.5 mm, 21.6 mm and 72.8 mm, respectively. Similarly, it has a mean, minimum and maximum annual temperature of 9.9 °C, 8.3 °C and 11.3 °C, respectively. Astor lies at latitude 35.3570° N and longitude 74.8624° E, at 2546 m above sea level. The statistics of every other station can be read from Table 1 in the same way.

SPI-12 is the most commonly used drought index for annual monitoring and characterization of hydrological drought (Habibi et al., **2018**). Several previous studies proposed hydrological drought indices and compared them with SPI-12 (Naumann et al., **2014**; Jain et al., **2015**; Cumbie-Ward and Boyles, **2016**). In this study, SPI-3 and SPI-12 are computed and compared with RIWSDI-3 and RIWSDI-12. Before the standardization of $IWP_{(r,p)}$, a brief graphical analysis assesses the temporal behaviour of the improved precipitation records and their deviations from the records used in SPI-3 and SPI-12. SPI-3 and SPI-12 use three- and twelve-month averages of monthly precipitation records, respectively. The precipitation records used in the $IWP_{(r,p)}$ model have the same mathematical structure and rationale as those used in SPI. Therefore, it is necessary to show how auxiliary information plays a role in the temporal estimation of precipitation records. Figure 3 shows the regionally weighted and the usual temporal precipitation records for all 7 regions (clusters) and their stations. It reveals significant changes, particularly in the upper precipitation records of the Astor, Dir, Parachinar, Badin and Dalbandin stations. These contrasts show how drought characterization and the analysis of regional climate phenomena can be improved by introducing advanced estimation techniques. The improved records show a better trend at all stations. Furthermore, before the standardization of the precipitation records, a range of candidate probability distributions can be considered for the improved and simple precipitation records. Table 3 summarizes the statistics of the selected regions (clusters). The mean precipitation of the simple and improved records is almost the same, but the standard deviation (SD) and coefficient of variation (CV) of the two records differ significantly.
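The three- and twelve-month averaging and the summary statistics mentioned above (mean, SD, CV) can be illustrated with a minimal Python sketch. The function names and the monthly values are hypothetical and are not taken from Tables 1 or 3; the sketch assumes a trailing (backward-looking) window, which the text does not state explicitly.

```python
from statistics import mean, stdev


def rolling_mean(monthly_precip, window):
    """Return the trailing `window`-month averages of a monthly series."""
    if window < 1 or window > len(monthly_precip):
        raise ValueError("window must be between 1 and the series length")
    return [
        mean(monthly_precip[i - window + 1 : i + 1])
        for i in range(window - 1, len(monthly_precip))
    ]


def summarize(series):
    """Mean, sample standard deviation and coefficient of variation (SD / mean)."""
    m = mean(series)
    sd = stdev(series)
    return {"mean": m, "sd": sd, "cv": sd / m}


# Hypothetical monthly precipitation (mm), for illustration only.
precip = [12.0, 30.0, 45.0, 8.0, 0.0, 5.0, 60.0, 75.0, 20.0, 10.0, 15.0, 25.0, 18.0]
three_month = rolling_mean(precip, 3)    # input series for a 3-month index
twelve_month = rolling_mean(precip, 12)  # input series for a 12-month index
stats_3 = summarize(three_month)
```

Applying `summarize` to both a simple and an improved series would allow the kind of comparison of mean, SD and CV reported in Table 3.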

### Parametric computation

Analogous to RIWSDI, this study also calculates SPI. For each indicator, a list of candidate probability distributions is fitted in order to check their fitness. Goodness-of-fit (GoF) test statistics are usually used for confirming validity and for choosing the best fit among different distributions for a specific data set. In the current study, both numerical (BIC) and graphical (pdf) performance were used to assess and select the best-fitting probability distribution among 32 candidate distributions. The Bayesian Information Criterion (BIC) (Konishi and Kitagawa, **2008**; Sakamoto et al., **1986**) is applied here: the distribution with the lowest value of BIC, calculated by means of Eq. (12), is considered the best fit,

$$\mathrm{BIC} = k \ln N - 2 \ln L \qquad (12)$$

where *L*, *k* and *N* are the likelihood function, the number of estimated parameters and the total number of observations, respectively, for the analyzed data (Kotowski and Kaźmierczak, **2013**).
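As an illustration of BIC-based selection, the following Python sketch assumes the standard form BIC = k ln N − 2 ln L and compares only two stand-in candidates (exponential and normal) whose maximum-likelihood estimates have closed forms. The study itself compares 32 candidate distributions; the function names and the data below are hypothetical.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """BIC = k * ln(N) - 2 * ln(L); the lowest value indicates the best fit."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def exponential_loglik(data):
    """Maximised log-likelihood of an exponential model (rate = 1 / mean).

    Assumes strictly positive data.
    """
    n = len(data)
    rate = n / sum(data)
    return n * math.log(rate) - rate * sum(data)


def normal_loglik(data):
    """Maximised log-likelihood of a normal model (MLE mean and variance)."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


def best_by_bic(data):
    """Return the name of the lowest-BIC candidate and all BIC scores."""
    n = len(data)
    scores = {
        "exponential": bic(exponential_loglik(data), 1, n),
        "normal": bic(normal_loglik(data), 2, n),
    }
    return min(scores, key=scores.get), scores


# Hypothetical, tightly clustered sample, for illustration only.
sample = [10.0, 10.5, 9.5, 10.2, 9.8]
best_name, bic_scores = best_by_bic(sample)
```

The penalty term k ln N grows with the number of parameters, so a more flexible distribution is preferred only when its likelihood gain outweighs that penalty.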

Tables 4–7 summarize the CDFs of the optimum probability functions for all the indicators at all study regions and their associated stations. They also show the chosen probability functions, their estimated parameters and the BIC values. Each probability distribution is selected using the (weighted) residual sum of squares as the minimization criterion, based on the Levenberg-Marquardt algorithm. The estimation phase consists of the method of moments, maximum likelihood estimation and the method of L-moments. All these methods were implemented using lmom *R* packages. For SPI and RIWSDI in particular, the fit and selection of different probability distributions are consistent with the findings of Stagge et al. (**2015**), who proposed a number of modifications to the SPEI and SPI methodology, together with an updated procedure for assessing SPEI and SPI based on the Shapiro-Wilk test. They found the gamma distribution and the generalized extreme value distribution to be the best-fitting distributions for SPI and SPEI, respectively.
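Once a CDF has been fitted, a standardized index of the SPI type is obtained by passing each cumulative probability through the inverse standard normal CDF. The Python sketch below shows this step under stated assumptions: the exponential CDF is only a stand-in for whichever distribution Tables 4–7 select, and the clamping constant `eps` and all function names are hypothetical.

```python
import math
from statistics import NormalDist


def exponential_cdf(rate):
    """Return the CDF of an exponential distribution (stand-in candidate)."""
    return lambda x: (1.0 - math.exp(-rate * x)) if x > 0 else 0.0


def standardized_index(values, cdf, eps=1e-6):
    """Map values to standard normal quantiles: z = Phi^{-1}(F(x)).

    Probabilities are clamped to [eps, 1 - eps] because the inverse normal
    CDF is undefined at exactly 0 and 1.
    """
    std_normal = NormalDist()
    index = []
    for x in values:
        p = min(max(cdf(x), eps), 1.0 - eps)
        index.append(std_normal.inv_cdf(p))
    return index


# Hypothetical aggregated precipitation values, for illustration only.
values = [0.0, 0.3, math.log(2.0), 1.5, 4.0]
z = standardized_index(values, exponential_cdf(1.0))
```

Values below the median of the fitted distribution map to negative index values (drier than usual) and values above it to positive ones.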

Figure 4 shows the temporal behaviour of the RIWSDI-3 and SPI-3 indices, and Fig. 5 shows that of the RIWSDI-12 and SPI-12 indices. Figures 6 and 7 show the histograms and Q-Q plots and also provide the correlation coefficients between RIWSDI and SPI. In almost all regions the correlation coefficient is high, which indicates a strong relationship between RIWSDI and SPI. Some stations, however, show low correlation: Region-2-Chitral (0.32) and Drosh (0.46) for SPI-3; Region-2-Chitral (0.46), Drosh (0.48) and Zhob (0.34) for SPI-12; and Region-6-Nokkundi (0.45), also for SPI-12. Overall, this suggests that RIWSDI can be recommended as an alternative drought index that incorporates auxiliary information-based precipitation data for the characterization of hydrological drought. Figures 8–11 show the plots of the distributions chosen among the 32 candidates on the basis of the lowest BIC value.
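The correlation coefficients discussed above are Pearson coefficients between a station-level SPI series and the corresponding RIWSDI series. A minimal Python sketch of that calculation follows; the function name and the index values are hypothetical and do not reproduce any value in Figs. 6 and 7.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical index values, for illustration only.
spi = [-1.2, -0.4, 0.1, 0.8, 1.5, -0.9]
riwsdi = [-1.0, -0.5, 0.3, 0.6, 1.4, -0.7]
r = pearson_r(spi, riwsdi)
```

The sign of `r` gives the direction of the relationship and its magnitude the strength.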

### Non-parametric computation

Despite the careful selection of probability distributions, our findings show that outliers and extreme values cannot be fully captured by a probability function. Figures 8–11 reveal that, for the Chilas station, the 4P Beta distribution, although the best fit with a low BIC value, does not capture the uncertainty in significant parts of the data for the SPI-3 model in Region-1. Similar results can be seen in Figs. 8–11 for some other stations. All chosen distributions appear to be inefficient in covering extreme records. Hence, for further checking and validation, the current study incorporates six Probability Plotting Position (PPP) formulas; see Table 8.

After standardizing the time-series vector based on the PPP, RIWSDI remains aligned with SPI. Table 9 summarizes the correlation between RIWSDI and SPI under the non-parametric approach. There is a high positive correlation between the values of SPI and RIWSDI for almost all the stations and their respective regions under the different non-parametric methods; only 3 out of 50 stations have a weak but positive correlation with their respective regions.
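A non-parametric standardization replaces the fitted CDF with an empirical plotting position and then applies the inverse standard normal CDF. The Python sketch below uses the general form p = (i − a) / (n + b), with the Gringorten (a = 0.44, b = 0.12) and Weibull (a = 0, b = 1) constants as examples. Whether these two are among the six formulas in Table 8 is an assumption, and the function names and data are hypothetical.

```python
from statistics import NormalDist


def plotting_positions(values, a=0.44, b=0.12):
    """Empirical non-exceedance probabilities p_i = (rank_i - a) / (n + b).

    Defaults give the Gringorten formula; a=0, b=1 gives the Weibull formula.
    Tied values receive consecutive ranks in their original order.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    probs = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        probs[idx] = (rank - a) / (n + b)
    return probs


def standardize(values, a=0.44, b=0.12):
    """Convert plotting positions to standard normal quantiles."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf(p) for p in plotting_positions(values, a, b)]


# Hypothetical aggregated precipitation values, for illustration only.
series = [5.0, 1.0, 3.0]
weibull_probs = plotting_positions(series, a=0.0, b=1.0)
weibull_index = standardize(series, a=0.0, b=1.0)
gringorten_index = standardize(series)
```

Because the index depends only on ranks, a single extreme value cannot distort the fit for the rest of the record, which is the motivation given above for the PPP approach.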

## Conclusion

The newly proposed index shows significant advantages over the former index (SPI) by incorporating, apart from the precipitation record, an additional meteorological parameter, the average temperature, as an auxiliary variable. Although RIWSDI usually responds in the same way as SPI, it is more sensitive and more appropriate for regional drought analysis. Drought is generally acknowledged as a regional phenomenon. Even so, data are collected at specific meteorological stations, which can be regarded as representing the regions around them. This study benefits drought monitoring by incorporating improved precipitation series into the standard SDI procedure. The improvement of the precipitation series derives from the use of auxiliary information in the estimation phase of the mean rainfall amount. In this article, the simple regression and product estimator approaches are used to weight the rainfall amount of each region. The current study therefore proposes a new regional hydrological drought index: the Regionally Improved Weighted Standardized Drought Index (RIWSDI). The performance of RIWSDI is evaluated by verifying the direction, the form (shape) and the degree (strength) of its relationship with the corresponding values of SPI through Pearson correlation statistics. To check the consistency and efficiency of RIWSDI, this study covers seven meteorological regions spread across several climatic settings of Pakistan; see Fig. 1. To compute and compare the values of RIWSDI and SPI for time scales 3 and 12, the estimation methodology comprises both the parametric (Stagge et al., **2015**) and non-parametric (Hao and AghaKouchak, **2014**) approaches.

A comparison of the improved and simple precipitation records can be observed in Fig. 3. A significant difference between the probability distributions of the simple and improved precipitation series can be observed in Tables 4–7 and Figs. 8–11. As described previously, the parametric distributions used to compute SPI and RIWSDI should be capable of providing normally distributed SPI and RIWSDI series. An SPI or RIWSDI series is considered non-normal if the following criteria are satisfied simultaneously: (a) a *p*-value ≤ 0.10 and a Shapiro–Wilk (*W*) statistic lower than 0.96; (b) an absolute value of the median > 0.05. Further information on the *W* test may be found in numerous studies, including Razali and Wah (**2011**). The *W* test results can be found in Tables 10 and 11. From Tables 10 and 11, it can be concluded that all RIWSDI and SPI series for time scales 3 and 12 are normally distributed. For example, the values of RIWSDI-3 and RIWSDI-12 for Region-1 satisfy all the criteria for normality. The normality of RIWSDI and SPI for all other regions and their stations can be confirmed from the *W*, *p*-value and absolute median of the respective regions and stations. Moreover, although the standardization is based on various distributions, the pattern of the RIWSDI records is very close to that of SPI (time scales 3 and 12); see Figs. 4 and 5. The simple precipitation records within the regions are relatively different, and Figs. 6–7 show that the correlations among the stations are significantly low, whereas the correlations between RIWSDI and SPI (time scales 3 and 12) for the different regions and the stations within them are significantly high. Overall, the comparative evaluation indicates that RIWSDI is strongly correlated with SPI (time scales 3 and 12) under both parametric and non-parametric standardization. On the other hand, some discrepancies can be observed in the parametric standardization, where a few stations show low but positive correlation.
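The non-normality rule stated above combines three conditions that must hold simultaneously. It can be written as a small Python decision function; the function name and the example inputs are hypothetical, and the *W* statistic and *p*-value are assumed to come from an existing Shapiro-Wilk implementation (for instance `scipy.stats.shapiro`, which is not called here).

```python
def is_non_normal(w_stat, p_value, median,
                  w_limit=0.96, p_limit=0.10, median_limit=0.05):
    """Flag a standardized series as non-normal only if all criteria hold:

    (a) p-value <= 0.10 and Shapiro-Wilk W < 0.96, and
    (b) |median| > 0.05.
    """
    return (
        p_value <= p_limit
        and w_stat < w_limit
        and abs(median) > median_limit
    )


# Hypothetical test results, for illustration only (not from Tables 10 and 11).
clearly_normal = is_non_normal(w_stat=0.99, p_value=0.45, median=0.01)
mixed_evidence = is_non_normal(w_stat=0.95, p_value=0.04, median=0.02)
clearly_non_normal = is_non_normal(w_stat=0.93, p_value=0.01, median=-0.12)
```

Because the conditions are joined with a logical AND, a series that fails the *W* test but has a median close to zero is still treated as normal under this rule.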

Some extreme values are addressed by incorporating non-parametric methods into the analysis and comparison of RIWSDI and SPI. Here, six different probability plotting position formulas are used to handle the extreme values and outliers; see Table 8. Table 9 shows the correlations between RIWSDI and SPI under the non-parametric approach. The SPI values (time scales 3 and 12) of the stations are highly correlated with the RIWSDI values (time scales 3 and 12) of the respective regions.

The foremost advantage of RIWSDI is that it characterizes hydrological drought based on a regionally improved precipitation series. However, a limitation of the current study is that RIWSDI cannot be generalized to multiscale settings (Edwards, **1997**).