Impact Assessment And Stochastic Modeling Of Morphometric Parameters Of Thermokarst Hazard For Unpaved Roads

Active construction of new roads and other linear structures requires new techniques for the natural hazard assessment. These techniques can involve both stochastic modeling and remote sensing data (RSD). First, the dynamics of thermokarst appearance along an unpaved road (winter road) was analyzed. Then a probabilistic model of the thermokarst morphological pattern was developed for the area in the vicinity of a linear structure, a road in particular. The model operates with initial assumptions based on the physical parameters of thermokarst development and includes relations for estimating the distribution of morphometric dimensions of thermokarst depressions (ponds). The model was empirically tested for the study area, which represented a site with an unpaved road located in West Siberia region. To verify the model, we calculated the correlation coefficient values for the length of the focus projections on the linear structure and the perpendicular axis and compared the empirical distribution of the projections with the theoretical lognormal distribution using the Pearson’s criterion. The proposed model assumptions appeared to be valid for the study area, which makes it possible to proceed to the problem of probabilistic impact risk assessment to a linear structure by foci of human-induced thermokarst.


INTRODUCTION
Construction of unpaved roads within the permafrost zone is a widespread practice in Russia (Ragozin 1997). Because of that, there is an urgent problem concerning modeling the thermokarst processes initiated by the impact of an engineering structure itself (Arenson 2017;Doré et al. 2016;Fel'dman 1984;Hjort et al. 2018;Mazhitova et al. 2004;Shiklomanov et al. 2013;Streletskiy et al. 2012). The majority of researches use the thermodynamic approach (Dvornikov et al. 2018;Ling et al. 2003;Ling et al. 2012;Romanovskii et al. 2001;Tumskoy 2002;Shur 1977) though it has major drawbacks resulting from the complexity of the models on one hand and a lack of information as well as high variability of many background parameters such as ice content, soil temperature and particle size distribution on the other hand. The complexity of the models results from taking into account the phase transitions as well as from their three-dimensional approach.
Our previous studies showed that central points of natural thermokarst lakes in a homogeneous environment fit the Poisson distribution, meaning that they appear independently of each other (Victorov et al. 2015). Moreover, the area of natural thermokarst lakes fits the lognormal distribution, which means that they expand due to the proportional heat transfer through the side surface (Victorov et al. 2015). These conclusions were empirically verified at several sites all over the Russian and Canadian permafrost areas.
We have also revealed that the projections of the centers of the emerged thermokarst ponds divided the road into sections, the length of which follows the exponential distribution. It proves that the thermokarst depressions appear independently of each other due to the heating influence of the road .
The aim of the present research is to perform advanced stochastic modeling of human-induced thermokarst development along a road and assess the thermokarst hazard based on the stochastic approach and remote sensing data.
The general research procedure includes: -analyzing the type and activity of thermokarst at different sites; * URL: https://earthexplorer.usgs.gov/ + URL: www.digitalglobefoundation.org ++ URL: https://www.pgc.umn.edu/data/arcticdem/ -formulating initial assumptions on the occurrence and development of human-induced thermokarst; -mathematical implementation of these assumptions using the probability theory approach; -transformation of the initial assumptions through mathematical analysis; -obtaining the final equation and laws determining the thermokarst development process; -empirical testing of the obtained laws for the study area; -confirmation or refutation of the model assumptions and equations.

MATERIALS AND METHODS
The study area (Fig. 1) is located in the West Siberian lowland. It is an accumulative weakly dissected alluvial plain, covered by fluvioglacial sand with continuous permafrost of 100-300 m deep. The average annual temperature at the bottom of annual fluctuations zone is -1 --3 С, in the unstable active layer it is 1 --2 С. The plain is covered by the tundra forest vegetation. It is crossed by a perennial winter road, some parts of which are used in the summer also. Our study area satisfies the following criteria: -Morphological homogeneity of the site: -Physiographic (geology and topography in particular) homogeneity of the site: -High thermokarst dynamics. The morphological homogeneity of the study area was confirmed from the interpretation of remote imagery. Physiographic homogeneity was proven by the analysis of geological and other field data together with the interpretation of remote imagery.
Our research involved two different approaches. First, we followed the common practice, which includes direct determination of the thermokarst foci and statistical analysis of their development. We call it the «traditional approach».
The second approach was based on stochastic modeling.

Traditional approach
The traditional approach assesses the actual thermokarst impact on engineering structures (Cosford et al. 2014) * . Let us demonstrate its application. The study area (2.42 sq. km) is crossed by a 6.6 km long section of the winter road Nadym-Salekhard (Russia). It is a part of the West Siberian plain (plate) bounded by the Poluy and Ob rivers, Russia (State geological map 2015; Savincev 2012; Shamilishvili et al. 2012).
For the analysis, we used the satellite imagery from 1968, before the winter road was built (Corona 2/pix), as well as more recent imagery from 2012 and 2016 (WorldView-2 и WorldView-3) provided by DigitalGlobe under a grant from the DigitalGlobe Foundation + , Arctic DEM ++ .
The study area was chosen based on the following criteria: morphological and physiographic uniformity of the area, presence of the linear structure, active thermokarst.
Using the traditional approach, it was revealed that all the thermokarst foci mostly appeared along the road, which operates at different times.
Then the territory was divided according to its thermokarst activity, which includes the appearance and development of the thermokarst foci. We compared the territory according to the number and area of the foci. Afterwards, a criterion was developed based on the ratio between the size of the foci and their area, which characterizes the technogenic pressure.
Stochastic approach: the advanced mathematical model of the human-induced thermokarst in the vicinity of linear structures The advanced model, in particular, takes into account the basic model ) and the results of the previous research (Kapralova 2015;Victorov et al. 2015;Gonikov T.V. 2019). The proposed model is based on the following main assumptions and limitations. Let us consider an unpaved road built across a plain with homogenous geology. The construction of the road has changed the thermal regime and based on that three zones can be distinguished: under the road, near the road, and at a certain distance from the road. Change of the thermal regime leads to the thermokarst development, including the emergence of thermokarst depressions (ponds). The stochastic modeling is applied for every zone of the thermal regime within the isotropic area.
It is evident that while the soil and permafrost properties somehow determine the development of the thermokarst depressions, these factors are the same for the entire study area. During the first stage, the initial thermokarst depression represents a shallow puddle of an irregular shape, filled with water. Multiple researchers (Boike et al. 2015;Huang et al. 2019;Matell et al. 2013;Perl'shteyn et al. 2005;Shur 1977) have described this phase of the thermokarst development. During the second stage, which is longer, the thermokarst depressions become more isometric because of the ice melting (Methodical guidelines 1978). The proposed model deals with the second stage. The model of the human-induced thermokarst has the following assumptions: 1. Thermokarst depressions emerge within a narrow band (with a width a) in the vicinity of a linear structure. They appear independently from each other, and the probability of their appearance within the given section depends only on its area ∆ S (thus, in small areas the probability for development of a single depression is much higher than that for several depressions ) and the distance from the linear structure.
where λ(r) is a coefficient. 2. A focus of the human-induced thermokarst in the vicinity of a linear structure has an approximately elliptic shape with the ratio of the axes length ξ i (where i is a year), which follows the uniform probabilistic distribution and independently changes from year to year: where α and β are the lengths of the semi-axes. 3. The growth of the linear dimensions of the thermokarst depressions (the semi-axes of the ellipse) due to thermoabrasion (the low intensity of the thermos-abrasion is possible, at that, the process tends to the «classic thermokarst») occurs independently from each other; it is directly proportional to the density of heat loss through the side surface of the depressions below water level.
Thus, we analyze the scenario of a synchronous start when the primary depressions appear in a short period from the beginning of the linear structure construction. The first (2) assumption comes from the homogeneity of the area in question. It reflects the fact that any limited area contains a finite quantity of thermokarst depressions (their centers, to be precise). Besides, this assumption considers the nature of the disturbances, which cause thermokarst, including that of the soil, vegetation cover, microrelief and permafrost in the vicinity of the linear structure. They vary depending on the distance from the linear structure and generally do not change along the structure. Hence, the main direction of the thermokarst emergence variability is the direction perpendicular to the structure. The function λ(r) accounts for this variability, which depends on the distance from the structure. The second assumption reflects the proportionality of the thermokarst focus growth rate to the average density of heat loss through a side surface below water level; at the same time, the growth depends on many random factors including average annual air temperature and ice content of the soil in the vicinity of the pond. Fig. 1 shows a typical remote sensing image of the emerged thermokarst depressions along a linear structure and Fig. 2 represents a schematic depiction of the model parameters.
From the mathematical analysis of the assumptions, a set of conclusions can be made. For instance, the distribution of the number of thermokarst depressions (or their centers) within a given length of a linear structure (L) follows the Poisson law.
where Y is an average linear density of the thermokarst centers.
The analysis of this equation allows us to obtain the distribution of distances between the centers of thermokarst depressions along the linear structure, which must, subject to the model's validity, correspond to the exponential distribution.
Let us determine the distribution for semi-axes length. If V is the volume of water in the depression, which resulted from the water concentration from the catchment area, evaporation, runoff and other processes of water balance formation, then the depth of the lower part filled with water, taking into account the equation for the area of the ellipse μ and proportionality of the semi-axes α и β (assumption 2) in the year j is equal to In that case, the side area covered with water is where u j is the length of the semi-axis α in the beginning of the year j.
Hence, the second assumption gives us the formula Fig. 2. A scheme of the model parameters Fig. 1. (a, b). A typical remote sensing image of emerged thermokarst depressions along a road: a) high-resolution remote sensing data, GeoEye image b) photo (1) (3) where Δuj is an increment the length of the axis α per year j, c is the specific heat capacity, t 0 is the average water temperature, u j 0 is a random variable for episodic factors. After the substitution of the area from the equation (5) and simplification, this equation takes the form Accordingly, the length of the semi-axis in successive years is described by the following relation: So, if we take the diameter of the initial depression equal to 1, the length of the semi-axis in the year j can be determined from the equation: or The function u j 0 in the right side of the equation is a random variable for the permafrost degradation over a certain year, which depends on summer and winter temperatures, snow thickness, rainout, soil temperature, rainfall characteristics, ice and ground content in the direction of the semiaxes. These parameters do not depend on each other from year to year but they all follow the same distribution.
Since, according to the central limit theorem, the sum of a large number of independent identically distributed random variables is normally distributed, then the logarithm of the semiaxis will be normally distributed as well. Hence the growth of the semi-axis of initiated thermokarst depressions with a large value of j we can approximately consider as a Markov random process with continuous-time and the transition function: where a, α are the distribution parameters, ν is the initial length of the semi-axis of the thermokarst pond, x is the length of the semi-axis of the thermokarst pond after time t.
If we suppose that the functions of the right side are normally distributed then the same conclusion will hold for any period of the development (even a little one). Let us simplify the model and assume that the primary thermokarst depressions at the moment of their occurrence have a unit axis length (this corresponds to normalization by the minimum value), then for any given time we should get the lognormal distribution of the axis length of the thermokarst ponds, i.e.
where a, α are the distribution parameters and t is time since the beginning of the process.
Since the first semi-axis and the proportionality coefficient ξ j follow the lognormal distribution and are independent, their product (the second semi-axis β) should then also follow the lognormal distribution, but with a slightly different value of the parameter α.
Finally, taking into account the quadratic relationship between the area and the axis length the area of the human-induced thermokarst pond should then also follow the lognormal distribution.

Initial thermokarst basic investigation
Situation at the study area in 1968 is shown in Fig 3. The studied part of the winter road was divided into 14 sections using the satellite image from 2016 based on the relative uniformity of the distribution of thermokarst ponds, which allowed to identify the sections with the most significant anthropogenic load (Fig. 4).  The winter road was absent in 1968, but by 2016 it has been functioning for many years, with many thermokarst ponds developed. Fig. 5 shows the rate of depressions development.
As we can see from Fig. 5, the winter road affected the emergence of the thermokarst ponds, and their number increased significantly. Due to the active use in 2012-2016, for the winter road sections 7, 9, 11 an apparent accelerated increase in the depressions development can be observed, while sections 3, 5, 8, 14 demonstrate a decrease in their number. The average annual number of new thermokarst ponds during the construction period and moderate use of the winter road in 1968-2012 is relatively uniform along the entire road. At the present stage, during the period from 2012 to 2016, we see high dynamics of changes in their quantity; this indicates the thermokarst intensification during the period of active exploitation of the road. Fig. 6 shows the change in the area of ponds over the considered periods.
As we see in Fig. 6, there is an essential decrease in the area of ponds over the period 2012-2018 along the entire winter road. The negative values of the area change indicate drying out of the thermokarst ponds or their separation into smaller ones. We propose an evaluation criterion to assess the impact on the winter road: the ratio of the total area of the ponds to the length of the road section. Table 1 contains the results of the estimation. We chose the following classes: 1-negligible (less than 0.8); 2 -low (0.8-2.6); 3 -medium (2.6-3.4); 4 -high (3.4-4.2); 5 -critical (4.2-5.0) as it shown in Fig. 7.
So, the technogenic impact on the thermokarst development is negligible within section 12 (Fig. 7). The winter road influence on the thermokarst within sections 4,5,7,9,11 can be classified as low. Sections 10 and 14 have a medium technogenic impact. The high technogenic impact is found in sections 2 and 8, while sections 1, 3, 6, 13 are characterized by the critical load. Thus, the essential part of the road (23% or 1.485 km) is under either the high or critical impact. About twelve percent of the road belongs to the medium hazard class. It means that more than one-third of the road (35%) has dramatically changed due to the technogenic impact and the area of the winter road has changed over the period of its functioning. Particular attention should be paid to the areas where many ponds have appeared to prevent further occurrence of the new thermokarst foci and the technology of construction and reconstruction of the winter road should be considered carefully.

DISCUSSION
We tested our stochastic model empirically. The empirical testing includes: -Identifying human-induced thermokarst depressions (thermokarst ponds) using RSD, -Measuring the length of the focus projections on the linear structure, -Detecting the centers of the focus projections on the axis perpendicular to the linear structure, -Estimating correlation coefficients for the length of the focus projections on the linear structure and the axis perpendicular to the linear structure, -Comparison of the empirical distribution of the projections with the theoretical lognormal distribution using Pearson's criterion.
The model expects that the distribution of length for the projections of the depressions on the linear structure and the perpendicular axis should follow the lognormal distributiontable 2 contains the results of the comparison between the empirical and theoretical (lognormal) length distributions for the projections.
We used Pearson's chi-squared test to analyze the correspondence between the empirical and theoretical distributions. It is a statistical test applied to categorical datasets to evaluate how likely it is that any observed difference between the two sets arose by chance. It tests a null hypothesis stating that the frequency distribution of certain events observed in a sample is consistent with a particular theoretical distribution. The statistical significance was checked with p-value or probability value or asymptotic significance, which is the probability for a given statistical model that, when the null hypothesis is correct, the statistical summary (such as the sample mean or the difference between two compared groups) would be greater than or equal to the actual observed results. The analysis of table 2 shows that all samples of the length of the projections correspond to the lognormal distribution. According to the analysis of table 2, the initial hypothesis that the distributions of the length of the projections on the linear structure and the axis perpendicular to the linear structure are lognormal is confirmed for all the samples at a significance level of 0.99.
We have also analyzed the change of the ratio between the projections of the depressions on the linear structure and the perpendicular axis. Table 3 contains the average ratio between the projections on the linear structure and its perpendicular axis (elongation coefficient), the correlation coefficient between the projections for each depression and the result of comparing the elongation coefficient distribution with the lognormal distribution.
For the study area, a significant positive correlation of  Table 2. Comparison between the empirical and theoretical lognormal distribution for the focus projections on the linear structure and the perpendicular axis р* -here and later p is additional to quantile value, corresponding the actual significance criterion; for example, the hypothesis is consistent with a probability of 0.99 if p is greater than or equal to 0.01.
the length of projections on the linear structure and the perpendicular axis is observed. It confirms the validity of the second assumption of the model about a certain, though stochastic, proportionality between the semi-axes length. Table 3 shows the correspondence between the distribution of the elongation coefficient and the lognormal distribution. In most cases, the testing results confirm the suggested model at the level of significance 0.99.
The obtained results seem to be useful as they show that we can study the patterns of thermokarst development even without advanced engineering geological descriptions. We plan the following steps for improving our approach. First, further empirical testing will be conducted for different environments. Second, we will search for relations between the field-measured ground and permafrost characteristics and the obtained statistical data. The further improvement of the approach can make it a good base for the thermokarst forecasting in the conditions of lacking field data. CONCLUSION 1. We developed and empirically tested the advanced model of the human-induced thermokarst in the vicinity of built linear structures. This model is based on the reasonable assumptions that the thermokarst depressions appear in a narrow strip along a road and develop independently of each other. Their growth depends on the water volume in each of them. 2. This model is expected to be more promising than the basic investigation approach. 3. The assumptions of the model were written as mathematical equations, from which, using strict transformations, the following main conclusions are derived: -the length of the projections on the road and the axis perpendicular to it must follow the lognormal distribution, -the distribution of the centers of the projections of the depressions must follow the exponential distribution, -the relation between the projections must be correlated and follow the lognormal distribution. 4. The empirical testing proved the validity of these laws for the study area.  Table 3. Elongation and correlation coefficients for the length of projections on the linear structure and its perpendicular axis and comparison between the elongation coefficient and the lognormal distribution