Pure water has been considered as precious as blue gold and is intended to be the gravest concern of today’s era. Around 70% of the world’s freshwater resources are consumed by agriculture (Hafeez and Khan, 2018). However, problems regarding water management issues are becoming ever more critical in several countries around the world. As a result, more effective irrigation water application is therefore required. Moreover, an effective and sustainable organization of water reservoirs requires precise knowledge of the source of production and distribution of resources (Song et al., 2019).
Evapotranspiration is an essential feature of the water cycle. Furthermore, it plays a pivotal role in the hydrological cycle, which has great importance in agriculture, hydrological, ecological, and climatic. It also involves both soil evaporation and plant transpiration and accounts for 90% of the rainfall in semi-arid and arid regions (El-wahed and El-mageed, 2014). Therefore, its accurate estimation is not merely crucial for the study of climate change and the assessment of water resources but also has an abundant application in the management of crop water needs, prediction, and monitoring of drought, effective development of water resources, etc. (Song et al., 2019). However, due to the shortage of measured evapotranspiration data, Reference Evapotranspiration (ETo) is invariably used to estimate actual evapotranspiration.
From the last few decades, numerous methods have been developed and recommended to estimate ETo for various types of climatic conditions. Amongst these, the FAO-56 Penman-Monteith (PM) method has been regarded as the standard method. Moreover, it is also used as the benchmark equation and utilized by numerous researchers to validate other equations (Bhat et al., 2017; Patel et al., 2015). Rim (2009) used PM equation to estimate ETo and examine the effect of urbanization, geographical, and topographical conditions on evapotranspiration. Similarly, Ndiaye et al. (2017) used PM to investigates the responses of ETo to the variation of climatic variables, and Córdova et al. (2015) evaluate the precision of the PM method for the calculation of ETo using various combinations of incomplete climate data of PMO landscapes in the high Andes of southern Ecuador.
The PM equation requires a wide range of meteorological data like Solar Radiation (SR), Wind Speed (WS), Temperature (Temp), Humidity (Hum), Vapor Pressure (VP), and Soil Heat Flux(SHF). However, all of these meteorological data are unlikely to be available with the majority of weather stations. For the situations in which climate data are incomplete, the PM equation is not suitable (Patel et al., 2015).
To overcome the issue of climatic parameters availability, various methods like Blaney-Criddle (BC), Thornthwaite (Th), Trabert (Tr), Hargreaves-Samani (HS), Jensen-Haise (JH), Abtew (Ab), Makkink (Mk), Ivanov(Iv), Turc (Tu), and Priestly-Taylor(PT) are explored to estimate ETo with limited meteorological data (Song et al., 2019; Bhat et al., 2017). Consequently, Bhat et al. (2017) compared various universally accepted ETo estimation methods, e.g., PT model, HS model, MK model, BC model, Iv model, by considering the PM as a standard method. Among these five models, MK Model was found to be the appropriate model, and the IV model was the least performance model for the location Srinagar in JK, India. For estimating the ETo, (Valipour et al., 2017) selected five models based on temperature, five on radiation, and five based on mass transfer, in relation to better performance of them in various climates based on previous research. They demonstrated that radiation-based models were better adapted to climate change than the temperature-based models and, more precisely, the models based on mass transfer.
Among the methods mentioned above, the HS equation received maximum attention from the research communities because it only needs temperature and SR for estimation. The simplification of the HS brings inaccuracy and inconsistency to ETo estimation (Patel et al., 2015). Hafeez and Khan (2018) evaluated HS and BC methods' precision concerning the standard PM method for approximating ETo in arid coastal regions of Sindh and Baluchistan. The research outcomes suggested that the HS method underestimated the standard PM method at all the meteorological stations of coastal conditions and the BC method overestimated the PM method at all the meteorological stations of coastal conditions. Similarly, Jabloun and Sahli (2008) evaluates the HS equation for computing ETo. Results obtained from the comparison of ETo daily estimates by the HS equation against the PM method across the Tunisian locations showed a systematic overestimation at inland sites; however, at coastal zones, the HS equation tends to underrate ETo.
Therefore, further research is needed to adjust the HS coefficients to local conditions to achieve greater accuracy. The HS constant calibration for various climatic conditions can be an acceptable approach to estimate the ETo with greater precision. Several researchers have succeeded in calibrating these constants under different climatic conditions. Ferreira et al. (2018) compared the calibration of the HS equation by using the Solver tool from Microsoft Excel, which employs the nonlinear optimization algorithm. For that purpose, they used daily data from six climatic stations situated in the state of Minas Gerais, from 1997 to 2016. Similarly, Patel et al. (2015) used fuzzy logic to calibrate only two HS equation constant for various meteorological stations of India like Bikaner, Calcutta, Kakinada, Coimbatore, Panjim-Goa, Kota, Deharadun, and Srinagar. Results of the study concluded that MHS enhanced the results by reducing the error in ETo estimates.
Although several studies are dealing with the calibration of equations for the ETo estimate, there is still a discrepancy in the effectiveness of the calibration methods. However, the usage of more appropriate calibration techniques may enhance the results. As a result, this study was therefore intended to calibrate the constants of HS equation. For that purpose, fuzzy logic-based calibration of all the constants (ah, bh, ch) in the HS equation for various meteorological stations of Pakistan is carried out. Meanwhile, In the Modified Hargreaves-Samani (MHS) equation, the effective temperature is used instead of the mean temperature. Moreover, the constant of effective temperature (dh) has also been calibrated. The calibration of the HS equation was conducted in four ways: by adjusting the parameter of effective temperature dh, by adjusting the parameters of ah and dh, by adjusting the parameters of ah, bh and dh, and by adjusting the parameters of ah, bh, ch and dh.
Furthermore, in any study where researchers are dealing with the time series, there is always the question of whether the probability distribution remains constant or changes. Therefore, homogeneity is an essential concern to detect the variability in data because when there is no change point in the data, it signifies that the data measurements can be performed simultaneously, with the same instruments and under the same environments. Researchers from all over the globe have conducted several studies to detect change points in data. Palaniswami and Muthiah (2018) used a combination of Cumulative Sum charts (CUSUM) and bootstrapping to detect change points in precipitation and Temperature series over the Vellar river basin. Similarly, a study has been conducted by Jaiswal et al. (2015) to identify the climatological parameters' breakpoints. For that purpose, they used different homogeneity tests, namely, Pettitt’s test, Von Neumann ratio test, Buishand’s range test, and Standard Normal Homogeneity test. Meanwhile, change point detection analysis for remote sensing data has been done by Militino et al. (2020) by using a different non-parametric test. Thus, considering the importance of change point analysis, this research is further extended to assess the homogeneity in ETo obtained by MHS.
This research has focused on providing a proposed ablative method for the precise estimation of ETo, and then the outcomes associated with this proposed algorithm are further utilized to detect an abrupt change in the distributional properties of ETo series. The finding of this research would be helpful to the researchers, planners, and policymakers to examine the climate change and the assessment of water resources and the management of crop water needs, prediction, monitoring of drought, and effective development of water resources.
The rest of the paper is organized in the following way. Section 2 consists of the overview of the PM and HS methods. Besides, a detailed description of the MHS, homogeneity tests, and statistical assessment are also given in this section. Section 3 is devoted to summarize the results of the study. In section 4, we look for the discussion of the results. The paper ends with the conclusion in section 5.
Materials and methods
The study site is Pakistan, which is located between 23 degrees 35 minutes to 37 degrees 05 minutes of north latitude and 60 degrees 50 minutes to 77 degrees 50 minutes of east longitude, and 8611 m above the mean sea level. In this study, the data set comprises maximum temperature, minimum temperature, wind speed measured at 0000 UTC and 1200 UTC, humidity measured at 0000 UTC and 1200 UTC, and sunshine hours of different meteorological stations of Pakistan from 1980 − 2019 are utilized to estimate ETo. Meteorological data were provided and rigorously quality controlled by the Meteorological department of Pakistan.
Estimation of reference evapotranspiration
Food and agriculture organization Penman-Monteith method
The PM method is considered the most appropriate method to estimate the ETo over a broad range of climates and is recommended by the Food and Agriculture Organization (FAO) as the standard method (Wang et al., 2017). The combination based PM equation is indicated in equation 1
Where EToPM is the ETo calculated by Penman-Monteith; Rn is the net SR; G is the SHF; t is the daily mean Temp; u2 is the WS at 2 m height; es is the saturation VP; ea is the actual VP; s is the slope of the saturation VP function; γ is the psychometric constant (Wang et al., 2017).
EToPM can also be calculated by using the SPEI package of R Software with input variables maximum Temp, minimum Temp, WS, Hum and sunshine hours. Whereas the variable Rs can be estimated with the help of available climatic data and location of the station (Rahman et al., 2020).
The HS method has been considered an attractive methodology to estimate ETo as it is facile and trustworthy. Moreover, it requires only a few data, so it is easy to calculate and has a minimal effect on arid meteorological station results. This method is based on temperature, as it necessitates only the minimal and maximal temperature values. Thus, the HS method is usually applied to estimate ETo when we have only the temperature. The reader can refer to numerous researches such as Hargreaves and Samani (1982, 1985) for further details on this method (Hafeez and Khan, 2018).
However, ETo estimated by HS equation is defined as.
Rs is the extra-terrestrial SR, and Tmean is mean monthly Temp (C°), respectively.
HS is made simple without considering the consequences of humidity and wind speed in the ETo rate explicitly. Thus, the actual HS equation cannot provide a precise estimate of the location under extreme weather conditions (Patel et al., 2015). Therefore, a Fuzzy Inference System (FIS) will be applied to determine the modified value of HS parameters.
Where, Tmax and Tmin are maximum and minimum Temp respectively, Teff = dh × ((3 ∗ Tmax) −Tmin)/2 and Rs is the extra-terrestrial SR which is obtained according to (Allen et al., 1998).
Fuzzy logic method for calibration
A fuzzy interference system is a scheme that utilizes fuzzy set theory to map inputs to outputs. Usually, FIS has four essential components: fuzzifier, fuzzy rule base, fuzzy inference engine, and defuzzifier. Fuzzifier transforms every input data element into degrees of membership by a look-up in single or multiple membership functions. The fuzzy rule base contains rules that encompass all possible fuzzy relations between inputs and outputs. These rules are demonstrated in the IF-THEN format. There are mainly two types of rule systems: Mamdani and Sugeno (Patel et al., 2015).
Depending on the problem at hand, the user can choose the appropriate rule system. The fuzzy inference engine considers all the fuzzy rules in the fuzzy rule base and adapts an input set into the corresponding outputs. Defuzzifier transforms the resulting fuzzy outputs from the fuzzy inference engine to a number. The conceptual layout of FIS is illustrated in Figure 1.
In this study, we used FIS to compute the modified empirical values for the MHS equation parameters (ah, bh, ch and dh). Two input variables are used for each parameter. Here input variables WS Diff is the difference between the wind speed measured at 1200 UTC and 0000 UTC, Temp Diff is the difference between max and min Temperature, Hum Diff is the difference between relative humidity measured at 1200 UTC and 0000 UTC, and sunshine hours. Further, a fuzzy rule base is developed based on the influence of such variables on the value of ah, bh, ch and dh. In this context, research literature’s on HS is used to develop the fuzzy rule base. Current research employed the FIS by using MATLAB software. Block diagram of fuzzy logic-based calibration of constants of the MHS is presented in Figure 2
Tests for change point detection
The detection of change point is an essential aspect in determining the time frame from where a significant change took place in a time series. Therefore, in this study, three non-parametric tests, namely Pettitt’s test, Busihand’s range test, and Buishand’s U test, are applied to identify the change point in ETo series. A brief description of the tests that have been used in this research is described as follow:
Pettitt’s test is a non-parametric homogeneity test that has been extensively utilized in hydro-climatological studies to detect a single change point in the mean of the distribution of the variable of interest. This test is based on the Mann-Whitney two-sample test and examines the null hypothesis: There is no change in the distribution of the variable of interest, as opposed to the alternative: Change point exists in the distribution of the variable of interest (Pohlert, 2018; Jaiswal et al., 2015). The Pettitt’s test statistic for sample length (n) is defined as:
PK = max|Uk| (4)
A change-point exists at time k when the statistic PK significantly differs from zero at a specified level (Mallakpour and Villarini, 2016). The approximate significant level is defined as:
We will reject the null hypothesis and split the data into two sub-series (fore and aft the change point) with two diverse distribution functions if the p-value is less than the level of significance α (Mallakpour and Villarini, 2016).
The Buishand’s tests presume that the sequence of ETo derives from the normal distribution with mean µ and variance σ2 (Jaiswal et al., 2015). Assuming a change in the mean at time k, we can write
The test statistics for Buishand U test is:
The corresponding p-values of both tests are computed by utilizing Monte Carlo simulation (Pohlert, 2018).
Researchers use several metrics to evaluate the performance of different methods, e.g., Root
Mean Square Error (RMSE), Mean Square Error (MSE), Mean Absolute Error (MAE), Mean Percentage Error (MPE), Performance Index (PI) etc. RMSE has been employed as a standard statistical metric to evaluate the efficiency of the methods. The fundamental assumption about RMSE is that the errors are unbiased and follow a normal distribution. Meanwhile, MAE is another effective measure broadly used in algorithm assessments, which calculate the average magnitude of the error in a set of the estimate, without considering their direction (Ferreira et al., 2018). Moreover, when both metrics are computed, by definition, RMSE is never smaller than the MAE. Hence, RMSE and MAE both are indifferent to the direction of error, their range is from zero to infinity, and they are negative-oriented scores, which means small values are better. The RMSE is more appropriate to represent algorithm performance as compared to MAE when the error is expected to be normally distributed. Although MAE appears similar to RMSE, there is an essential difference between these metrics, as MAE is an improper result and RMSE is proper results. The criteria to judge for the best method are relatively small MAE and RMSE (Chai and Draxler, 2014).
The statistical indices MAE, RMSE, and PI were used in this research for the performance evaluation, according to equations 11, 12, and 15, respectively. The PI was classified in accordance with the criteria suggested by (Camargo and Sentelhas, 1997) (Table 1).
Where, d - Willmott’s index of agreement r - Correlation Coefficient
PI - Performance Index
Analysis of the results
Parameters of calibrated Hargreaves-Samani equation
The FIS is utilized to empirically estimate the calibrated HS equation parameters (ah, bh, ch, dh) for different meteorological stations of Pakistan. The fuzzy inference engine generates the fuzzy output variable for ah, bh, ch and dh. Meanwhile, with an appropriate defuzzification center of gravity method, the fuzzy variables are converted into crisp output. Table 2 demonstrates that the average value of the adjusted HS coefficients, ah, fluctuated from 0.0001 to 0.0021, values lower than the original ones (0.0023), indicating that the original equation has an inclination to overestimate the ETo for this studied site. Moreover, ah is found to be higher for station Kotli and lower for Gilgit and Lahore. Similarly, the value of bh is varied from 0.45 to 0.69, which is lower for stations Skardu, Islamabad, and Kotli while greater for other stations than the original one. Similarly, ch is found to be varied from 11.9 to 22.9, which is greater for Bahawalnagar while smaller for other stations than the original one, and dh is found to be highest in Skardu and lowest in Muzaffarabad.
In this section, the comparative evaluation of HS and MHS is presented. The performance index of the models is classified according to the criteria given in Table 1. The results of the comparative evaluation given in Table 2 demonstrated that the original HS equation showed performance classified as Good for Skardu, as Medium for Gilgit and as Not Good for Bahawalnagar, Kotli, Sialkot, Jhelum, Muzaffarabad, Dera Ismail Khan (DIK), and Islamabad. After the calibration, it was possible to obtain better classifications. The calibration of only dh performance was classified as Good for Muzaffarabad, Gilgit, and Skardu, while as Medium for the remaining stations. Similarly, the calibration of all parameters (ah, bh, ch, dh) performance is classified as Excellent for Gilgit and Skardu, while as Very Good for the other stations.
Statistical analysis for accuracy of the MHS is carried out with RMSE, MAE, and PI analysis in comparison with the standard PM equation. The comparison shows that the original HS equation is the least suitable method for estimating evapotranspiration for the present study site due to the highest RMSE and lowest PI. The lowest values of RMSE and MAE and the highest values of the PI were obtained after calibrated for all HS equation parameters (ah, bh, ch and dh). However, all the other calibrations also gave near values for these indices, specifically the calibrations of three parameters of HS equation (ah, bh, dh).
In order to facilitate the visualization of the calibration effect, a graphical representation of six stations of Pakistan (randomly selected), namely, Bahawalnagar, Islamabad, Kotli, Lahore, Muzaffarabad, and Sialkot, from 1980-2019, are illustrated in Figures 3 and 4. Figure 3 demonstrates the graphical behavior of ETo estimated by PM method, original HS equation, and calibrated HS equations one by one. In contrast, Figure 4 elaborated the behavior of PM equation, HG equation and calibrated four parameters HS equation for the same stations of Pakistan, from 1980-2019.
Analysis of change point detection
The three homogeneity test results for estimated ETo by MHS in ten (randomly selected) meteorological stations of Pakistan are presented in Table 4. Change point analysis of station Bahawalnagar confirmed that the change point has occurred in February of 1999, while the change point occurred at station DIK in June of 1988 from all three homogeneity tests. Similarly, a significant change point has been exhibited for the stations of Jhelum, Kotli, and Muzaffarabad in March of 1998. In contrast, for Lahore and Islamabad's stations, the change point has appeared in June of 1995 and in June of 2006, respectively. Meanwhile, the behavior of the historical series of ETo confirmed that the series of April 1990 has a significant change point for the station of Sargodha. However, for station Gilgit, the change point has appeared in different years for three homogeneity tests. The change point is confirmed in April 1990 for Pettitt’s test while in May 1995 for the Buishand range test and Buishand U test. The graphical representation of homogeneity tests for three randomly selected meteorological stations of Pakistan is illustrated in Figure 5, where Uk and Sk at y-axis represented statistics of pettitt’s and Buishand tests respectively. Moreover, the values of Uk lies between −6000-6000 whereas the values of Sk lies between −1000-1000.
Estimation of reference evapotranspiration
Analyzing the literature on the subject, for the situation when climate data was not complete, we found several ablative methods to calculate ETo, but among these, the HS equation received maximum attention from the research communities (Hafeez and Khan, 2018; Jabloun and Sahli, 2008; Bhat et al., 2017; Song et al., 2019). However, in view of the previous work, we found that the HS equation underestimates or overestimates the standard PM method because it has been made without considering the effect of WS and Hum explicitly in ETo (Hafeez and Khan, 2018; Jabloun and Sahli, 2008). Similarly, in our study, it was found that HS equation is least appropriate for the estimation of ETo. Although, a number of researchers have successfully calibrated the parameters of HS equation under various climatic conditions, Ferreira et al. (2018) compared the process of calibration of the HS equation by using the Solver tool from Microsoft Excel, which employs the nonlinear optimization algorithm there is less certainty about what sort of solution Solver can find. Similarly, Patel et al. (2015) used fuzzy logic to calibrate the HS equation parameters for various meteorological stations of India and concluded that MHS enhanced the results by reducing the error in ETo estimates. However, Patel et al. (2015) enhanced their results by calibrating only two parameters of the HS equation. Thus, in this study, we were motivated by Patel et al. (2015) and gradually calibrated all HS equation constants along with the constant of effective temperature based on FIS to further enhance the ETo estimates. Therefore, the comparative analysis of the original HS equation and the calibrated HS equation based on FIS was evaluated for some randomly selected Pakistan stations. The comparison indicates that the original HS equation is the least suitable method for estimation of ETo for the current study site due to the highest RMSE and lowest PI. The lowest RMSE and MAE values and highest values of the performance index were obtained to calibrate all four parameters of Hargreaves equation (ah, bh, ch and dh). However, all the other calibrations also gave closed values for these indices, specifically the calibrations of three parameters of HS equation (ah, bh, dh). Hence, it can be concluded that all the calibrations promoted better values (lower values), which led to more efficient ETo estimates for current study area. However, the simultaneous calibration of ah, bh, ch and dh was overall the most efficient in the reducing of this error. Meanwhile, the results of the study can also be verified by analyzing the figures (Figures 3 and 4) which shows that all calibrations promoted better ETo estimates. However, their comparative analysis demonstrates that four parameters calibrated HS equation performed extremely well as compared to other’s one.
Change point detection
Detection of change points in the data is an important statistical challenge in several contexts. Moreover, this issue has often been raised in statistical and non-statistical communities over several epochs (Militino et al., 2020; Jaiswal et al., 2015; Pohlert, 2018). The present study demonstrates the statistical significance of change point in ETo for different meteorological stations of Pakistan by using three homogeneity test. The results of the change point analysis clearly indicate the distributional change in ETo series during the examined time period. Therefore, the observed ETo series had the breakpoints from 1980 to 2019, and this influence may be due to the rapid growth of industrial and commercial activities in various stations of Pakistan. However, the detailed reasons for the abrupt changes still need further research.
In this research, the original HS equation's performance and calibrated HS equations were analyzed by estimating ETo for diverse climatic locations of Pakistan by using the PM method as a reference. Statistical results of this study concluded that the original HS equation showed varying performance among the studied sites and had a tendency to overestimate ETo. However, after its calibration, based on FIS, significantly improve ETo estimates were observed. The simultaneous adjustment of all the empirical parameters (ah, bh, ch and dh) was the best alternative to calibrate the HS equation. Thus, the proposed technique has been proved an effective substitute for the experimental method of HS calibration. Meanwhile, because of urbanization and industrialization, various change points have been observed in ETo estimated by MHS for various meteorological stations of Pakistan from 1980-2019. However, concerning future work, it would be encouraged to use the actual Rs data rather than the estimates for a more precise estimation of ETo.