A nonparametric clustering method, the Bagging Voronoi K-Medoid Alignment algorithm, which simultaneously clusters and aligns spatially/temporally dependent curves, is applied to study various data series from the Elbrus region (Central Caucasus). We used the algorithm to cluster annual curves obtained by smoothing of the following synchronous data series: titanium concentrations in varved (annually laminated) bottom sediments of proglacial Lake Donguz-Orun; an oxygen-18 isotope record in an ice core from Mt. Elbrus; temperature and precipitation observations with a monthly resolution from Teberda and Terskol meteorological stations. The data of different types were clustered independently. Due to restrictions concerned with the availability of meteorological data, we have fulfilled the clustering procedure separately for two periods: 1926–2010 and 1951–2010. The study is aimed to determine whether the instrumental period could be reasonably divided (clustered) into several sub-periods using different climate and proxy time series; to examine the interpretability of the resulting borders of the clusters (resulting time periods); to study typical patterns of intra-annual variations of the data series. The results of clustering suggest that the precipitation and to a lesser degree titanium decadal-scale data may be reasonably grouped, while the temperature and oxygen-18 series are too short to form meaningful clusters; the intercluster boundaries show a notable degree of coherence between temperature and oxygen-18 data, and less between titanium and oxygen-18 as well as for precipitation series; the annual curves for titanium and partially precipitation data reveal much more pronounced intercluster variability than the annual patterns of temperature and oxygen-18 data.

In the last decade, new paleoclimate archives were obtained in the course of expeditionary work involving the Institute of Geography of the Russian Academy of Sciences. Among them are ice cores of the Western plateau of Elbrus (Mikhalenko et al. 2015; Kozachek et al. 2017), bottom sediments of the lakes Karakel, Donguz-Orun, Khuko (Solomina et al. 2013; Alexandrin et al. 2018), etc. The obtained cores were studied and dated by laboratory methods; their elemental and isotopic compositions were determined (Darin et al. 2015a; Darin et al. 2015b; Kozachek et al. 2015). Until now, among the existing statistical approaches, mostly the correlation-regression and component analysis have been applied to study the new data (Alexandrin et al. 2018). Among the applications of cluster analysis to these data, only works on studying the backward air mass trajectories and dust transfer are known (Kutuzov et al. 2017; Khairedinova et al. 2017).

Cluster analysis is used to split a certain set of objects into relatively homogeneous groups (clusters). In this work, the clustering procedure was applied independently to investigate several synchronous time series characterizing the dynamics of the natural environment in the Central Caucasus in the 20th century. As the result of this procedure, each time series is divided into intervals corresponding to different clusters. Thus, a time sequence of clusters or parts of them appears in each data series.

In our work a nonparametric method of clustering functional data, the Bagging Voronoi K-Medoid Alignment (BVKMA) algorithm, was applied to analyze the data. This clustering method and its application for studying bottom sediments are described in detail in (Abramowicz et al. 2017) and summarized in the Method section of our paper. A specific feature of this approach is that splitting annual data into clusters is based on the shape of intraannual variations of the parameter under investigation. In that way, the analysis is aimed at answering the question «which time periods are similar and which are different in terms of their intra-annual variability». This clustering method may be only applied to the data that have several measurements for each period.

Moreover, the advantage of BVKMA compared to previously developed methods of clustering functional data is the ability of the method to deal jointly with two effects, revealing by data, that can lead the clustering procedure to a tendency of considering substantially similar or interconnected data as completely different or independent, what may be regarded as misclassification.

The first issue is the effect of misalignment functional data that is typically manifested in time lags. For instance, if the functional data are represented as time-dependent curves, one can notice time lags of peaks of some curves compared to the others, despite a common shape and reasons of the observed variability (see, e.g., Sangalli et al. 2010). The Alignment procedure, implemented in BVKMA, is intended to prevent possible misclassification of the curves during clustering due to their misalignment. In climatic research, the misalignment may naturally occur in annually repeated seasonal patterns.

The second effect that is important to account for, is a possible time/spatial dependence of functional data. For example, the neighboring annual curves, derived from high resolution paleoclimatic records, may be regarded as «dependent» because they are supposed to reflect common processes and effects characterizing the natural environment of a given period. In BVKMA algorithm the account for data dependency, expressed by a tendency to attribute consecutive curves to the same cluster, is provided by the usage of the Voronoi tessellation (Abramowicz et al. 2017). Also, the type of dependency (spatial or temporal) does not impose restrictions on the applicability of the method. Moreover, the spatial dependence observed for the parameter of interest in annual layers of proxy data can be transformed into temporal dependence on the base of known dating of the proxy. In our study, this is the case for lake sediment and ice core data.

The main purpose of our investigation is to assess the degree of consistency of the resulting clusters for geochemical, isotopic and meteorological data series and to find out essential or close time boundaries in different series. This approach of combining various types of data in order to reveal their implicit interrelations may be a useful tool for creating paleoclimate reconstructions.

In many palaeoclimatic studies, researchers aim to find modern analogues for past climates, and thus reconstruct specific parameters or palaeoenvironments for specific time periods. The clustering method used in this study previously was applied to cluster millennia-long time- series, resulting in only one cluster covering the whole instrumental period. The results of such an approach could be hardly interpreted in terms of finding modern analogues for past climates. Here for the first time we apply this method for time-series which are several decades long and fully intersect with the instrumental period. The purpose of this approach is (i) to determine whether an instrumental period of usual length could be reasonably divided (clustered) into several sub-periods using this method and different climate and proxy time-series; (ii) to study the shape of medoids (which represent intra-annual variations of parameters) for different climate and proxy time-series and their associations; and (iii) to examine the resulting borders of the clusters, or resulting time periods, in terms of their interpretability.

The Greater Caucasus borders the Russian Plain from the south. It is located in the temperate and subtropical zones between the Black and Caspian Seas. Elbrus volcano (5642 m) - the highest peak of the Caucasus, supports extensive modern glaciation. The climate in the region is dominated by the westerlies. The continentality is increasing from the west to east: the mean June temperature at the foothills of Greater Caucasus is approximately +23-24 °C, while in the east it is higher (25-29 °C): the annual precipitation, on the contrary, decreases in the west-east direction from 4000 mm (Kodory valley) to 1000-1500 mm in the eastern Caucasus (Gvozdetsky and Golubchikov 1987). Precipitation maxima occur in July-September; the warmest month is July, the coldest one is January.

Meteorological data at the high elevation of the Caucasus are quite scarce. In this paper, we used the data from Terskol station located in the area where our other proxies (ice core and lake sediments) are situated and Teberda station with longer meteorological records. Shahgedanova et al. (2014) noticed positive trends in summer temperature and precipitation of the accumulation period (October-April) recorded at the high- elevation Terskol and Klukhorsky Pereval stations in the period between 1987 and 2010.The glaciers are however retreating since the early 20th century and the retreat rate is increasing.

The study area and the locations of the proxy records and meteorological stations used as the sources of data are marked on the map below (Fig. 1).

Fig. 1. The study area. The map of the Elbrus region (Central Caucasus) with the marked locations of data sources: Lake Donguz-Orun, Mt. Elbrus, the weather stations Terskol and Teberda. Space image from Google Earth

The following data were used in the work.Data on the elemental composition of the core of the annually laminated bottom sediments of Lake Donguz- Orun. The top core used (160 mm) contains annual layers formed during the period 1922-2010 (Alexandrin et al. 2018). Among the chemical elements present in the sample, the terrigenous element titanium (Ti) was selected for cluster analysis, because variations in its content correlate most strongly with the series of meteorological observations in the region (Alexandrin et al. 2018).The vertical profile of the oxygen isotope content (δ18O) in the ice cores of the Western plateau of Elbrus (depth - up to 182 m; dated part - from 1774 to 2013; see (Preunkert et al. 2019; Kutuzov et al. 2019)).Monthly data on average air temperature from observations at the Teberda (since 1926) and Terskol (since 1951) weather stations.Monthly data on precipitation totals from observations at the above mentioned meteorological stations for the same periods.For data (a) and (b), the vertical profiles were converted to a time distribution based on the known depth-age correspondence.Clustering of annual curves was carried out separately for the following two periods.1926-2010 - the maximum period of time provided simultaneously by geochemical, isotopic and meteorological data. The Teberda weather station was selected as providing the longest series of observations in the region.1951-2010 - the period of observations at the Terskol weather station and the simultaneous availability of lake sediment and ice core data. The weather station Terskol was selected as the closest to Lake Donguz-Orun and Mt. Elbrus.In accordance with the designations introduced, below we will indicate the data type with a letter (a - d), and the study period with a number (1 or 2). For example, (a1) will denote the titanium data for 1926-2010.MethodTo study the data, we have applied a recently developed nonparametric method of clustering functional data, the Bagging Voronoi K-Medoid Alignment, which simultaneously clusters and aligns by phase the data elements (annual curves), using the information about the dependence (sequence) of these curves (Abramowicz et al. 2017). The method is a generalization of the previous Bagging Voronoi Clustering (Secchi et al. 2013), which does not handle misalignment of the data. All computations and analysis of the data are performed in the R programming language (R Core Team 2020).Preprocessing. From the time series representing our raw data, the associated functional form was reconstructed via a smoothing procedure. In order to do that, a series of each parameter (Ti, δ18O, temperature, precipitation) was divided into sub-series of observations for individual years. The annual data were centered with respect to their mean value. Without loss of information the yearly time scale was converted to a reference one by uniformly distributing the time instances on the interval [0, 1] (such that for each year the first time instance is associated to 0 and the last one to 1). Next, the centered annual data were normalized with respect to the maximum absolute value of the whole time series. Finally, after all previously described normalizations and transformations, a continuous function was reconstructed from each annual series by smoothing via a sum of the first few Fourier harmonics. Typically, we used from 5 to 9 Fourier basis functions depending on the stability or oscillations of the initial data. Thus, a series of annual curves was obtained for each parameter. This allowed us to apply the BVKMA algorithm designed for clustering functional data.Let us set out at a qualitative level the main stages of the BVKMA algorithm, following (Abramowicz et al. 2017), and the procedure for tuning its input parameters.For the sake of clarity, let us describe an input dataset as a rectangular array of numbers (matrix), organized as follows. Each row of the array contains the sequential values of the parameter of interest, belonging to a particular year (annual curve). We will call any row of the array and the data contained in it as a site. For instance, in the case of the temperature data, the ith row contains the values of temperature during the ith year of the studied period, obtained by smoothing of 12 monthly observations at a meteorological station. Thanks to the Fourier smoothing, for each type of data we have increased the number of annual values up to 50. Therefore, each of our datasets contains 50 columns. The number of rows N in our datasets is either 85 or 60, depending on the number of years in the analyzed time period, starting either from 1926 or from 1951.The execution of the BVKMA proceeds in two phases - bootstrap phase and aggregation phase.Bootstrap phase. This three-step procedure is being applied to the same input data array a specified number of times B. The individual replications of this procedure are independent and their results are being saved.Step I. Generation of a random Voronoi tessellation. At this step the data array is randomly divided into a given number n of sub-arrays (Voronoi cells). Each cell is a set of several consecutive sites. The cells can vary in size, since being formed randomly. This step demands an input parameter L=N/n - the expected number of sites within a Voronoi cell. Varying L, we change the measure of supposed dependence of annual data as preliminary information given to the algorithm.Step 2. Identification of local representatives (medoids) for each cell of the tessellation. First, the Alignment procedure is applied to annual curves (sites) of each cell. Then medoids are chosen as the curves in each cell which are the most similar to all the other aligned curves in the same cell. The similarity of the curves is determined by a metric (see below). As a result of this procedure, for each Voronoi cell a new 1-dimensional array (an additional annual curve, also called local representative) is created, which summarizes the information carried by all sites of the cell. So, having completed the second step, we have a set of Voronoi cells and a set of their local representatives - one for each cell.Step 3. Clustering of the local representatives, formed at the previous step. The number K of clusters is assigned a priori, before executing the algorithm, and the clustering algorithm used on the local representatives is the K-medoid algorithm. All clusters are being labeled, and the label of each cluster refers also to all local representatives forming it. Next, all sites of each Voronoi cell get the same cluster label as the one that was assigned to the local representative of this cell. Thus, after completion of this step, the entire initial data array will be divided into clusters. In other words, for each site it will be indicated which one of K clusters it belongs to.Aggregation phase. Since the Bootstrap phase is repeated B times, and every time a Voronoi tessellation is created randomly, the resulting cluster distributions are expected to be different. Thus, for each site (year) a frequency distribution of cluster assignments along the B replicates is provided. At the Aggregation phase these frequencies are calculated for each site, and the cluster label, which was encountered more often than others, is finally assigned to a site. As the result of this majority vote procedure the final partitioning of the data array into clusters is formed.Parameter selection. The above mentioned variables B, L, and K are the input parameters of the BVKMA algorithm.In all runs of the algorithm we kept the number of bootstrap replications B equal to 1000. This value turned out to be sufficient to provide the robustness of the results.The expected length of aVoronoi cell L was varied significantly in order to encompass all possible numbers ofVoronoi cells n. The inevitable restriction imposed by the algorithm on n is K+1≤n≤N, expressing the fact that the number of Voronoi cells should be greater than the number of clusters, but cannot exceed the number of sites in the dataset. Hence, using the equality L=N/n, one can easily derive the restrictions on it: 1≤L≤N/(K+1). Thus, for each number of clusters K we executed the algorithm with various possible values of the expected length of a Voronoi cell L.To determine the most adequate value of L we used the average entropy estimator implemented in BVKMA. The mean entropy E is the measure of the misclassification of the data during clustering. Therefore, the optimal value of L is the one, providing the minimum of E.We restricted the number of clusters K to be equal to 2, 3 or 4. We have not enlarged this number because the amount of sites (years) is relatively small (maximum 85). To tune K, we applied another built-in estimator of the BVKMA algorithm - the so-called λ-criterion (for more details, see (Abramowicz et al. 2017; Sangalli et al. 2010)). Again, the optimal value of K is the one, providing the minimum of λ.Thus, the way to find the optimal values of the parameters for each dataset was the following. First, for each K the optimal value of L was determined with the help of the entropy criterion. After that, we found the optimal number of clusters K, using the λ-criterion, among the cases of optimal values of L.The optimal values of the parameters and corresponding values of the statistical indicators, resulting from our analysis, are presented in Table 1. Table 1. The cases of optimal parameter selection and their numerical characteristicsCaseKLEλ(a1)3200.710.48(a2)4120.680.37(b1)3100.610.93(b2)3150.550.86(c1)4120.610.81(c2)4100.590.81(d1)2150.600.45(d2)380.580.67In addition to the numeric input parameters discussed above, for running BVKMA one has to set a metric to quantify the similarity between annual curves, and a family of warping functions necessary for the Alignment procedure. Our choice of these two functional parameters is the same as in (Abramowicz et al. 2017). Namely, we used the normalized L2-based distance as a metric, and the group of positive slope affine transformations as a family of warping functions. More details and definitions can be found in (Abramowicz et al. 2017; Vantini 2012).RESULTS AND DISCUSSIONAs a result of applying the BVKMA algorithm to different types of data in the different studied cases, we have obtained either 2 or 3 clusters. Typically, the resulting cluster assignment led to one cluster less than the prescribed number K. This means that in the final year-by-year cluster assignment by majority vote one of the clusters never comes out as the modal one.The results of applying the algorithm are depicted in Fig. 2: the cluster distributions over time (left side), and the corresponding medoids of each cluster (right side). The medoids represent intra-annual variability of the data, thus the left end of each medoid corresponds to the beginning of the year, and the right end - to the end of the year. They may be shifted in the direction of the abscissa due to the Alignment procedure. The Alignment is essential in the process of clustering, thought it has no physical significance for representation of the resulting medoids. In fact, the medoid of the cluster is the most representative curve in the cluster transformed in abscissa as a result of the Alignment (shifted and stretched/compressed). Nevertheless, the overall shape of the curve, subjected to such transformation, is preserved. Fig. 2. The results of clustering of smoothed annual curves (left), and the corresponding medoids (right) for two study periods: 1926-2010 (1) and 1951-2010 (2). The clustered data are: titanium content in the bottom sediment core from Lake Donguz-Orun (a); oxygen-18 isotope content in the ice core from Elbrus (b); monthly average temperature at Teberda and Terskol weather stations (c); monthly sum of precipitation at the same weather stations (d). The colors of the medoids match the colors used in the diagrams of cluster distribution over time for each type of data The data on titanium concentrations in the Donguz- Orun sediment core show the most prominent intercluster diversity (Fig. 2 (a1, a2, right)). In the original study by Alexandrin et al. (2018) titanium was related to precipitation, having significant correlation (r = 0.44) with annual precipitation measured at Teberda weather station. However, clustering fulfilled for the two parameters showed different results (Fig. 2 (a1, a2, d1, d2, left)). It might be related to mild correlation strength, but also to different factors driving intra-annual patterns of variability of two parameters. Titanium is sometimes claimed to mimic terrigenous runoff, and thus reflecting precipitation. However, precipitation during the cold season may generate runoff only in spring during snow melt, hence the intra-winter distribution of precipitation would not be related to the spring peak of runoff, but the total amount of precipitation in the winter will matter. Hence, we underline that the results of the BVKMA clustering algorithm should be always interpreted keeping in mind possible disagreement in intra-annual variability of interconnected parameters.On the contrary, the temperature and especially oxygen-18 records reveal similar and stable seasonal patterns (Fig. 2 (b1, b2, c1, c2, right)), and therefore the differences among clusters are less significant for these types of data. Temperature is known to have larger correlation distance than precipitation. In this regard, we cannot find a realistic explanation for different clustering results for temperature measured on two weather stations having similar results for precipitation. Hence we interpret these results as follows. The instrumental period may be not long enough to obtain reasonable clusters for a parameter with stable intra-annual variability, such as temperature. Moreover, some inter-cluster time boundaries occur closely in timing for some series of data (Fig. 2): in the 1940s (b1, c1, left) and in the late 1960s (b2, c2, left).We also find very close inter-cluster boundaries in the titanium and oxygen-18 data (Fig. 2 (a1, b2, left)).The precipitation data series of meteorological stations Teberda and Terskol have a similar structure of cluster distributions over time: a large cluster encompassing most part of the studied period followed by a small cluster attributed to the latter period (Fig. 2 (d1, d2, left)).Intra-annual patterns revealed by the respective medoids (Fig. 2 (d1, d2, right)) also have a similarity: for both weather stations we can observe maximum values of precipitation in the middle of the season (summer) for the first period (green) and two local maxima (spring and autumn) with reduction in summer for the second period (magenta). These results show that, in contrast to very stable intra-annual variations of temperature, those of precipitation are variable enough to be reasonably clustered into several periods. The consistency of these periods for two remote weather stations may indicate that the results of the clustering catch common underlying forcing of changed precipitation seasonality in 2000s.Originally, the BVKMA algorithm was applied for a 6000- year long varved sedimentary sequence (Abramowicz et al. 2017). It had proven to be suitable for registering centennial to millennial scale variations in the distribution of the seasonal values of the selected parameters, thus providing important paleoclimatic implications. In this study, we apply the BVKMA algorithm for significantly shorter sequences (60 and 85 years long). The climatic variations (temperature and precipitation) as well are their proxies (sedimentary Ti-values and ice core δ18O) at such a short time scale were obviously incomparably smaller than those for the half Holocene time span.

The two-three clusters provided by the algorithm tend to represent minor fluctuations - especially clear with the curves of temperature and δ18O. A certain incoherence of the Ti-values can be attributed to the uncertainty of distinguishing the annual layers in varved sediments (done with the use of geochemical markers rather than direct visual observation in the case of Lake Donguz-Orun).

On the time scale of centuries to millennia the physical basis for cluster analysis of the paleoclimatic data is much more robust. Application of the BVKMA algorithm for shorter sequences provides a necessary basis for its application for the longer ones that are expected for lake sediments, ice core data and possibly other sources of paleoclimatic information in the Caucasus.

The seasonal patterns of four types of proxy and meteorological data series from the Elbrus region (Ti concentrations, δ18O, temperature, and precipitation) are derived by applying the clustering algorithm Bagging Voronoi K-Medoid Alignment, separately for two periods: 1926-2010 and 1951-2010.

The time dynamics of clusters and the corresponding cluster medoids are obtained.

The seasonal patterns of oxygen-18 and temperature data occurred to be relatively similar and unchangeable.

A notable degree of consistency of the resulting clusters and their time boundaries for oxygen-18 and temperature data is figured out. This result is encouraging for future attempts to define modern analogues of past climates using this clustering technique.

A less pronounced, but still the observable consistency of the cluster distributions is found for Ti and O-18 data, as well as for precipitation data of meteorological stations Teberda and Terskol.

Our results demonstrate that for parameters with relatively stable intra-annual patterns of variability (like temperature) the usual length of instrumental period may be not enough for its reasonable division into sub-periods. Highly variable parameters (like precipitation), on the contrary, may be reasonably clustered even inside several decades of data.

Also, we underline that the results of the BVKMA clustering algorithm should be always interpreted keeping in mind possible disagreement in intra-annual variability of interconnected parameters, as we demonstrated in titanium data from Donguz-Orun sediment core and precipitation from Teberda weather station.