Forecasting Water Level of the Jhelum River of Kashmir Valley, India, Using a Prediction and Early Warning System

Hydrological disasters account for the largest share of disasters worldwide, and in 2016 Asia accounted for 41% of the global occurrence of flood disasters. Jammu and Kashmir is one of the most flood-prone regions of the Indian Himalayas. In the 2014 floods, approximately 268 people died and 168,004 houses were damaged. Pulwama, Srinagar, and Bandipora districts were severely affected, with 102, 100, and 148 km² submerged, respectively. Early Warning Systems (EWS) were developed to predict such events and warn people before they occur. An EWS improves a community's preparedness for disaster: it does not prevent floods, but it substantially reduces the loss of life and property. This work proposes a flood monitoring and early warning system composed of base stations and a control center. Each base station comprises a sensing module and a processing module; it makes a localised prediction of the water level and transmits the predicted results and measured data to the control center. The control center uses a hybrid of an Adaptive Neuro-Fuzzy Inference System (ANFIS) model and a supervised machine learning technique, the Linear Multiple Regression (LMR) model, for water level prediction. The hybrid system achieved an accuracy of 93.53% for daily predictions and 99.91% for hourly predictions.


INTRODUCTION
Floods are the most damaging disasters in terms of property and life, and they occur more frequently than any other type of natural disaster (Ahern et al. 2005; Kiran et al. 2019). Floods are influenced by many factors such as precipitation, snowmelt, land use and land cover, and built-up area (Kim et al. 2009). Of all the continents, Asia is the most affected (Cavallo & Noy 2011; Table 1). The Intergovernmental Panel on Climate Change predicted that flooding will worsen in the coming decades because of climate change (IPCC 2001). Climate change induces extreme precipitation, faster glacier melting, extreme temperatures, cyclones, and a rise in ocean water levels (IPCC 2014). From 2006 to 2015, the annual average number of deaths from natural disasters was 69,827. In 2016, damages worth $59 billion were reported for hydrological disasters, and of the top 10 countries in terms of disaster mortality, five were Asian countries, which together accounted for 43.2% (Guha-sapir et al. 2017). India is rich in river systems. It has four large-scale river systems, viz. Aravalli, Ganges, Brahmaputra, and Indus, which are large both in catchment size and drainage density. All these river systems have several tributaries that spread along the length and breadth of India, which makes the country more prone to floods (Mathur 2019). The Kosi floods of August 18, 2008, which impacted India and Nepal, affected more than 3 million people (Bhatt et al. 2010). Apart from this example, numerous large-scale hydrological disasters have left parts of India devastated, such as the Leh flash floods of 2010 (Thayyen et al. 2013), the Brahmaputra floods of 2012 (Pal et al. 2013), the Kedarnath flash floods of 2013, the J&K floods of 2014 (Mishra 2015), and the Tamil Nadu floods of 2015.
Loss of wetlands, deforestation, population explosion, climate change, and habitation in high-slope zones are the main reasons for the increase in these disasters (Berz 2001; Mcbean 2002; Rafiq 2017). To mitigate the effects of floods, a system is needed that can alert people before the hydrological disaster occurs. In this work, a hybrid of Linear Multiple Regression (LMR) and the Adaptive Neuro-Fuzzy Inference System (ANFIS) is integrated with wireless sensor technology to develop an early flood warning system, which is backed up by solar panels to keep it running during a power failure. Wireless Sensor Networks (WSN) were chosen because they combine low power consumption with high mobility (Do et al. 2015). The system also uses GPRS and other communication modules. WSN and web monitoring allow remote administration of the sensor network, which leads to easy maintenance and accurate monitoring (Islam et al. 2014). A webpage, SMS, and loudspeakers were used to disseminate warnings to the population, as these are the easiest ways to reach people (Natividad & Mendez 2018). The proposed system uses spatially distributed precipitation as input to the model. ANFIS and Multiple Linear Regression models give more accurate forecasts than an Artificial Neural Network (ANN) when spatially distributed precipitation is given as input (Rezaeianzadeh et al. 2013). The aim of this work is to propose a system that is cost-effective, accurate, and simple to implement. The LMR approach used in the system is fast to fit, easy to interpret, and well suited to predicting a continuous response such as water level. The proposed system uses wireless sensors to measure parameters such as temperature, rainfall, and water level because of their low cost, quick response, stability, and flexibility (Niranjan 2012), which makes the approach feasible for a developing country (Basha & Rus 2007).
Monitoring the water level with a WSN and communicating through GPRS and SMS makes this a real-time monitoring system (Sunkpho & Ootamakorn 2011). Better early warning systems can be built by combining sensor networks, artificial intelligence, and modern communication technology; Pengel et al. (2013) employed this methodology in a system that uses wireless sensors and artificial intelligence to predict dike and embankment failure. Furthermore, WSN and machine learning, when integrated, can predict floods efficiently (Roy et al. 2012).

STUDY AREA
The Jhelum River is the main river of the Kashmir division and runs along its entire length of 140 km. Most of the towns and villages are located on its banks. The width of the Jhelum river varies between 69 and 113 meters from Sangam to Ram munshibagh. The most flood-affected districts of Kashmir are Anantnag, Pulwama, Srinagar, and Bandipora. The valley currently has no early flood warning system, and flood monitoring is done manually (Fig. 1). The study area covers 8603 sq. km, with 14 catchments whose tributaries drain from the Pir Panjal range and join the river on the left bank and on the right (Bhatt et al. 2017), as shown in Fig. 2. The Sandran, Bringi, Arapath, Lidder, Vaishow, Rambiara, Watalara, Aripal, Sasara, and Romushi tributaries join the Jhelum river in Anantnag and Pulwama districts and contribute substantially to its flow (Fig. 2). For this reason, the Sangam gauge station was taken into consideration, because after Kakapora village no other tributary joins the Jhelum up to Srinagar.

DATA USED
Daily precipitation and temperature data for 30 years (1980-2010) from three meteorological stations, Pahalgam, Kokernag, and Qazigund, were obtained from the India Meteorological Department. Daily water-level data for 1980-2019 for the Jhelum river at three gauging stations, Sangam, Rammunshibagh, and Asham, were acquired from the Department of Irrigation and Flood Control, Jammu and Kashmir. The watershed of the study area was generated from the SRTM Digital Elevation Model (DEM) using ArcGIS software.

METHODOLOGY
The system has four base stations. Three of them are equipped with wireless sensors to measure different parameters; at the fourth, Sangam, sensors are not used and the data was acquired from the meteorological station. All three WSN-equipped base stations have the same architecture (Fig. 3). The system depends mainly on wireless communication for data transmission rather than on the internet, because of the volatile situation of the Kashmir valley, which experiences frequent internet blockades by the government for security reasons (Iqbal 2017).
ANFIS is a multilayer feed-forward network in which each node (neuron) performs a specific operation on its incoming signals. It is a Takagi-Sugeno model with five layers whose structure is determined by the membership functions, the inputs, and the derived rules. The efficiency of the model depends on the number of epochs (iterations in the learning phase) and the type of membership function. ANFIS employs «if-then» rules to perform an operation (Jang 1993), described below for a first-order model with the common two fuzzy rules (Younes et al. 2015).
Rule 1: If $x_1$ is $A_1$ and $x_2$ is $B_1$ then $f_1 = p_1 x_1 + q_1 x_2 + r_1$
Rule 2: If $x_1$ is $A_2$ and $x_2$ is $B_2$ then $f_2 = p_2 x_1 + q_2 x_2 + r_2$
Here $A_i$ and $B_i$ denote grades such as «Low» or «Less», whereas $p_i$, $q_i$, and $r_i$ ($i = 1, 2$) are the consequent parameters.
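The two rules above can be traced through the five layers with a minimal sketch. It assumes triangular membership functions and a product operator for the «and» in each rule; the function names and all numeric values are hypothetical and serve only to illustrate the inference step, not the trained model of this work.

```python
def triangular(x, a, b, c):
    """Triangular membership grade with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def sugeno_two_rules(x1, x2, mfs, consequents):
    """Evaluate a first-order Takagi-Sugeno system with two rules.

    mfs:         [(A1, B1), (A2, B2)], each A or B an (a, b, c) triangle
    consequents: [(p1, q1, r1), (p2, q2, r2)]
    """
    # Layers 1-2: membership grades and firing strengths (product as "and").
    w = [triangular(x1, *A) * triangular(x2, *B) for A, B in mfs]
    total = sum(w)
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    # Layer 3: normalised firing strengths.
    w_bar = [wi / total for wi in w]
    # Layer 4: linear rule outputs f_i = p_i*x1 + q_i*x2 + r_i.
    f = [p * x1 + q * x2 + r for p, q, r in consequents]
    # Layer 5: overall output as the weighted sum of rule outputs.
    return sum(wb * fi for wb, fi in zip(w_bar, f))


# Hypothetical "low" and "high" grades on both inputs.
low, high = (-10.0, 0.0, 10.0), (0.0, 10.0, 20.0)
output = sugeno_two_rules(
    2.5, 5.0,
    mfs=[(low, low), (high, high)],
    consequents=[(1.0, 1.0, 0.0), (2.0, 0.0, 1.0)],
)
```

During training, ANFIS adjusts both the triangle corners and the consequent parameters; the sketch only evaluates a fixed set of them.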

Coefficient of determination
The first step is to select the optimal input variables, i.e. those relevant to the desired output. A total of 120 tests were carried out to find the optimal number of inputs, which turned out to be four. The best type of membership function was then determined by testing all eight types of membership functions, using the hybrid training algorithm. The number of epochs and the number of membership functions were kept constant at 40 and 3, respectively. The triangular membership function gave the best results compared with the other membership functions, as shown in Table 2, and this model was selected for further modification to enhance its performance. To finalize the best-performing structure of the model, the optimum number of membership functions had to be determined. The number of membership functions was varied from 3 to 6 while the epochs were kept constant at 40. The tests revealed that the optimum number of membership functions is four. In each case, the best-performing configuration was selected on the basis of the smallest RMSE for training and testing.
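The cost of increasing the number of membership functions can be made concrete with a short sketch. It assumes grid partitioning, where every combination of one membership function per input forms a rule, and triangular functions with three parameters each; the function names are hypothetical.

```python
def grid_rule_count(n_inputs, n_mfs):
    """Number of rules when every combination of membership functions is used."""
    return n_mfs ** n_inputs


def parameter_counts(n_inputs, n_mfs, params_per_mf=3):
    """Premise and consequent parameter counts for a first-order Sugeno model."""
    premise = n_inputs * n_mfs * params_per_mf
    # Each rule has one coefficient per input plus a constant term.
    consequent = grid_rule_count(n_inputs, n_mfs) * (n_inputs + 1)
    return premise, consequent


# Four inputs, with the number of membership functions varied from 3 to 6.
for n_mfs in range(3, 7):
    print(n_mfs, grid_rule_count(4, n_mfs), parameter_counts(4, n_mfs))
```

Under this assumption, four inputs with four membership functions each give 256 rules, and the rule count grows quickly beyond that, which is one reason to stop at the smallest structure with a low test RMSE.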

PROPOSED SYSTEM WORKFLOW
The process of flood monitoring and early warning starts at the sensing module. The CS475A radar sensor, which is well suited to rough outdoor conditions, calculates the distance between the sensor and the water surface by measuring the time elapsed between the emission and return of pulses. This data, along with data from the self-emptying tipping-bucket rain gauge, the temperature sensor, and the magnetic Hall-effect water-flow sensor, is transmitted to the microcontroller. The Arduino Mega 2560, which has 54 digital I/O pins and an ATmega2560 microcontroller, sends the sensor data via the HC-12 communication module to the processing module of the base station. There the data is processed and a localised prediction of the future water level at this base station is made by finding a correlation between the rainfall and the water level.
In this correlation, L is the river water level, Q is the lag time, R is the rainfall, and α is the coefficient that describes the correlation between water level and rainfall. The increase in the water level is directly proportional to the intensity of the rain. The data acquired through the sensors is transmitted to the control center via the SX1272 LoRa module. At the control center, the data is received by another SX1272 LoRa module and stored in the database, where it is later used to retrain the model. For accurate water level prediction, the system makes use of weather forecasts from the India Meteorological Department (IMD). The trained ANFIS model at the control center takes the forecasted values as inputs and outputs the predicted water level at Sangam. Because forecasts are available at hourly to daily resolution, the model can predict water levels at the corresponding horizons. Once the water level predictions for Sangam, Kakapora, Pampore, and Ram munshibagh are available, the Multiple Linear Regression model takes these four predicted values as input and generates the future water level of Ram munshibagh as output, which can be denoted by the equation:
$$B_4 = \beta_0 + \beta_1 B_1 + \beta_2 B_2 + \beta_3 B_3$$
where $B_4$ is the response variable, $\beta_i$ ($i = 0, 1, 2, 3$) are the regression coefficients, and $B_1$, $B_2$, and $B_3$ are the independent variables. The summary of the model is shown in Table 3.
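The regression step can be illustrated with a minimal sketch, assuming the usual multiple linear regression form with an intercept and three independent variables fitted by ordinary least squares. The function names and the small synthetic data set are hypothetical; they are not the station data or the fitted coefficients of this work.

```python
import numpy as np


def fit_mlr(X, y):
    """Return [b0, b1, ..., bk] minimising the squared error of y = b0 + X @ b."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so that the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


def predict_mlr(beta, x):
    """Predict the response for one row of independent variables."""
    return float(beta[0] + np.dot(beta[1:], x))


# Synthetic upstream levels (columns B1, B2, B3) and a synthetic response B4
# generated from made-up coefficients, used only to exercise the functions.
X = np.array([[1, 2, 3], [2, 1, 4], [3, 5, 2], [4, 3, 6], [5, 4, 1]], dtype=float)
true_beta = np.array([0.5, 0.2, 0.3, 0.4])
y = true_beta[0] + X @ true_beta[1:]

beta = fit_mlr(X, y)
next_level = predict_mlr(beta, [2.0, 2.0, 2.0])
```

Because the synthetic response is noise-free, the fit recovers the made-up coefficients; with real gauge data the residual diagnostics discussed later in the paper become relevant.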
The predicted water level is then compared against the five warning levels shown in Table 3 to determine the intensity and likelihood of a flood. The interval between data acquisitions from the sensors depends on the intensity of the rain and the water level. It starts at 15 minutes and decreases with each increase in warning level, as given in Eq. 7.
Here, $T_j$ is the time interval between two consecutive measurements, $\Delta t$ is the increment unit, and $k_i \in \{0, 1, 2, 3, 4, 5\}$ is the warning level. The equation means that the interval between two measurements decreases as the warning level increases. Therefore, we will have predictions that are more accurate.
Our system has five warning levels viz. Normal, High, Very High, Critical, and Flood. The time intervals of these warning levels are shown in Table 4.
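The mapping from a predicted water level to a warning level and a sampling interval can be sketched as follows. The threshold values, the 3-minute increment, and the linear form of the interval rule are all assumptions made for illustration; the paper's table values and exact interval equation are not reproduced here. Only the five level names and the 15-minute starting interval come from the text.

```python
from bisect import bisect_right

LEVELS = ["Normal", "High", "Very High", "Critical", "Flood"]

# Hypothetical thresholds in metres separating consecutive warning levels.
THRESHOLDS_M = [3.0, 4.0, 5.0, 6.0]


def warning_index(level_m):
    """Index k of the warning level for a predicted water level in metres."""
    # A level exactly on a threshold is assigned to the higher warning level.
    return bisect_right(THRESHOLDS_M, level_m)


def sampling_interval_min(k, base_min=15.0, step_min=3.0):
    """Assumed linear rule: the interval shrinks by step_min per warning level."""
    if not 0 <= k < len(LEVELS):
        raise ValueError("warning index out of range")
    return base_min - k * step_min


k = warning_index(5.4)
label, interval = LEVELS[k], sampling_interval_min(k)
```

Keeping the thresholds in one sorted list makes it straightforward to replace the placeholder values with the gauge-specific levels used by the flood control authority.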

RESULTS AND DISCUSSION
The proposed system was used to predict the water levels of the river Jhelum at Srinagar. The ANFIS model was used to predict the water level at Sangam (base station B1) because all the data needed to develop the model was available there. The developed model has 256 rules, four membership functions for each input parameter, and one output function. The efficiency of the ANFIS model was evaluated using RMSE, MAE, and R². The model achieved an RMSE of 0.4306 (training) and 0.6109 (testing), an MAE of 0.0623 (training) and 0.0783 (testing), and an R² of 0.972 (training) and 0.966 (testing). These results were achieved at epoch 230; further epochs did not improve the model. The predicted and measured values were almost equal, with residuals falling within ±1, which indicates the efficiency of the model. The accuracy was 93.53% for daily predictions and 99.91% for hourly predictions (Fig. 5, 6). The system is therefore more accurate for short-term predictions than for long-term predictions. The accuracy of the system depends on the accuracy of the forecasted values of precipitation and temperature.
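For reference, the three evaluation measures can be computed as in the sketch below. It assumes the standard definitions, with R² taken as one minus the ratio of the residual to the total sum of squares; the paper may have used the squared correlation instead. The sample values are made up and are not the study's data.

```python
import math


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0.0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Made-up water levels in metres, used only to exercise the functions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
scores = (rmse(observed, predicted), mae(observed, predicted),
          r_squared(observed, predicted))
```

Computing the same measures separately on the training and testing sets, as done above for the ANFIS model, shows whether the model generalises or merely fits the training data.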
In our regression model, R-sq is the coefficient of determination, which indicates how well the model fits the data; the value of 98.82% is quite acceptable in our scenario (Table 5). The adjusted R-sq accounts for the number of predictors in the model, which helps in choosing between models with different numbers of predictors. The difference between the R-sq and adjusted R-sq values for a predictor shows the contribution of that predictor to improving the model. A model with a higher predicted R² has better predictive ability, and in our case the value of 98.68% is excellent. At the usual significance level of 0.05, the P-values of 0.038, 0.015, 0.014, and 0.008 obtained for our model signify that the association between the response and each term is statistically significant. The Variance Inflation Factors (VIF) of 3.36, 3.39, and 4.27 for the respective variables indicate moderate inflation of the variance of the coefficients due to correlation among the predictors. VIF values below 10 are considered acceptable; the regression coefficient is poorly estimated if the VIF is greater than 10 (Babin & Anderson 2014).
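The Variance Inflation Factor of a predictor can be obtained by regressing it on the remaining predictors and taking 1 / (1 - R²) of that auxiliary regression. The sketch below assumes this standard definition; the function name and the test matrix are hypothetical.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return one VIF per column of the predictor matrix X (rows = observations)."""
    X = np.asarray(X, dtype=float)
    n_rows, n_cols = X.shape
    vifs = []
    for j in range(n_cols):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Auxiliary regression of predictor j on the others, with intercept.
        design = np.column_stack([np.ones(n_rows), others])
        beta, *_ = np.linalg.lstsq(design, target, rcond=None)
        residual = target - design @ beta
        ss_res = float(residual @ residual)
        ss_tot = float(((target - target.mean()) ** 2).sum())
        r2 = 1.0 - ss_res / ss_tot
        # Perfectly collinear predictors (r2 == 1) would divide by zero here.
        vifs.append(1.0 / (1.0 - r2))
    return vifs


# Two uncorrelated, made-up predictors: both VIFs should be 1.
uncorrelated = np.array([[1, 1], [-1, 1], [1, -1], [-1, -1]], dtype=float)
vifs = variance_inflation_factors(uncorrelated)
```

A VIF of 1 means a predictor is uncorrelated with the others; values between roughly 3 and 5, as reported above, mean that part of each predictor's variation is explained by the remaining predictors.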
In Fig. 6, the probability plot of the residuals approximately follows a straight line, with very few outliers. The residuals versus fits plot confirms that there is no recognizable pattern in the points and that the residuals are randomly distributed on both sides of 0 (Fig. 7). The histogram of the residuals shows their distribution over all observations and reveals only two outliers. The residuals versus order plot displays the residuals in the order in which the data were collected. It shows no trends or patterns, which indicates that the residuals are not correlated with one another.

CONCLUSION AND FUTURE WORK
The system will have a direct effect on the resilience index of the valley and will contribute to its economy, because it provides lead time in flood warning that can be used for evacuation. This will help save many human lives, livestock, and valuables. The system uses wireless sensors to measure the different factors contributing to floods, and machine learning methods use this data to predict the likely water level. Two machine learning models and a wireless sensor network together perform the task of river monitoring and early flood warning. This hybrid approach paves the way for exploring further efficient hybrid methods to predict various aspects and behaviors of this river under given circumstances. The proposed system is the first early flood warning system devised for the Jhelum basin and covers 52.3 km of the basin, from Sangam village in Anantnag district to Ram munshibagh (Srinagar). In the future, we will integrate remote sensing and GIS with machine learning methods to cover the whole Jhelum basin in Kashmir, from Anantnag to Baramulla district. Satellite-derived rainfall (Mishra & Rafiq 2017a), temperature (Rafiq et al. 2012), and other datasets (Mishra & Rafiq 2017b) can be used to estimate runoff and develop a sophisticated, multi-dimensional early warning system. Covering the whole Jhelum basin will require cost-effective alternatives in hardware, and communication between nodes over such a long basin will be a challenging task.