This public lecture will be given by Dr. Marj TONINI as part of her application for habilitation.
Clustering is the task of grouping a set of objects so as to maximize the similarity within each group and the dissimilarity between groups. Cluster analysis is widely used in geo-environmental studies to discover potential patterns in large datasets.
In this talk I will introduce two clustering techniques. The first is based on a probabilistic approach that estimates how far a stochastic point process departs from a random distribution. More specifically, I will illustrate how the spatio-temporal permutation scan statistic (STPSS) performs in detecting clusters of different geo-environmental events, locating them in space and time, and assessing their statistical significance. The second is a data-driven technique based on machine learning, namely the Self-Organising Map (SOM). SOM is an unsupervised, competitive-learning neural network that represents a high-dimensional dataset as a two-dimensional discretized pattern; it performs particularly well at identifying clusters in multivariate datasets.
Through three case studies, I will demonstrate how these techniques work and how they can be used to identify significant clusters, characterize patterns, and discover trends in large geo-environmental datasets.
STPSS is applied to detect spatio-temporal clusters of damage-causing landslides in Switzerland and of flash floods in China. As a fundamental part of these investigations, the detected clusters are related to the surrounding meteorological conditions, which makes it possible to identify potential triggering factors. For flash floods in China, the large spatial and temporal scale (daily data from 1950 to 2015) allows the results to be considered in the context of climate change.
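To make the scan-statistic idea concrete, here is a minimal sketch in Python. It is not the analysis used in these case studies. It assumes the standard formulation of the space-time permutation scan statistic: the expected count in a space-time cylinder is (events in the zones) × (events in the time window) / (all events), the cylinder is scored with a Poisson log likelihood ratio, and significance is estimated by shuffling event dates among locations. The function names, the `(zone, day)` event encoding, and the toy data are hypothetical.

```python
import math
import random

def log_glr(events, zones, t_start, t_end):
    """Poisson log likelihood ratio for one space-time cylinder.

    events: list of (zone, day) tuples, one per event (hypothetical encoding).
    zones: set of zone ids (cylinder base); t_start..t_end: inclusive days (height).
    """
    total = len(events)
    in_zone = sum(1 for z, _ in events if z in zones)
    in_time = sum(1 for _, d in events if t_start <= d <= t_end)
    observed = sum(1 for z, d in events if z in zones and t_start <= d <= t_end)
    # Expected count if space and time were independent.
    expected = in_zone * in_time / total
    if observed <= expected:
        return 0.0  # only excesses of events count as clusters
    inside = observed * math.log(observed / expected)
    outside = (total - observed) * math.log((total - observed) / (total - expected))
    return inside + outside

def scan(events, cylinders, n_rep=199, seed=0):
    """Return (most likely cylinder, its statistic, Monte Carlo p-value)."""
    rng = random.Random(seed)

    def best(ev):
        return max((log_glr(ev, z, a, b), i) for i, (z, a, b) in enumerate(cylinders))

    obs_stat, obs_idx = best(events)
    locations = [z for z, _ in events]
    days = [d for _, d in events]
    exceed = 0
    for _ in range(n_rep):
        rng.shuffle(days)  # permute dates, keep locations fixed
        if best(list(zip(locations, days)))[0] >= obs_stat:
            exceed += 1
    return cylinders[obs_idx], obs_stat, (exceed + 1) / (n_rep + 1)

# Toy data: one event per zone per day, plus an excess in zone "A" on days 4-5.
events = [(z, d) for z in "ABC" for d in range(1, 11)]
events += [("A", 4)] * 4 + [("A", 5)] * 4
cylinders = [({"A"}, 4, 5), ({"B"}, 4, 5), ({"C"}, 1, 3)]
cluster, stat, p_value = scan(events, cylinders)
```

A real application would enumerate many candidate cylinders of varying radius and duration rather than a fixed list of three; the short list here only keeps the sketch readable.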
In the third case study, SOM is applied to agricultural census data at the municipality scale in Western Mediterranean areas. The codebooks resulting from SOM are then grouped using a hierarchical clustering method to form the final partition, in which similar units are aggregated into larger clusters. At the end of the process, the results are mapped in a GIS environment to visualize the revealed patterns in geographic space.
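The two-stage procedure described above (train a SOM, then group its codebook vectors hierarchically) can be sketched as follows. This is an illustration under stated assumptions, not the workflow of the case study. The grid size, the linearly decaying learning rate and neighbourhood width, the choice of average linkage, and the synthetic two-feature "municipality" data are all hypothetical; the abstract does not specify them.

```python
import math
import random

def bmu(codebook, x):
    """Grid position of the best-matching unit (closest codebook vector) for x."""
    return min(codebook, key=lambda k: math.dist(codebook[k], x))

def train_som(data, rows, cols, n_iter=500, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): codebook vector}."""
    rng = random.Random(seed)
    sigma0 = max(rows, cols) / 2
    # Initialise each unit with a copy of a randomly chosen sample.
    codebook = {(r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)}
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1 - frac)                    # learning rate decays to zero
        sigma = max(sigma0 * (1 - frac), 0.5)    # neighbourhood shrinks over time
        x = rng.choice(data)
        br, bc = bmu(codebook, x)
        for (r, c), w in codebook.items():
            grid_d2 = (r - br) ** 2 + (c - bc) ** 2
            h = math.exp(-grid_d2 / (2 * sigma ** 2))  # Gaussian neighbourhood
            for i in range(len(w)):
                w[i] += lr * h * (x[i] - w[i])   # pull the unit towards the sample
    return codebook

def agglomerate(vectors, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of vector indices."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        return sum(math.dist(vectors[i], vectors[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        p, q = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters.pop(q)   # q > p, so index p stays valid
        clusters[p].extend(merged)
    return clusters

# Synthetic stand-in for municipality feature vectors: two loose groups.
rng = random.Random(1)
data = [[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(20)]
data += [[rng.gauss(10, 0.5), rng.gauss(10, 0.5)] for _ in range(20)]

codebook = train_som(data, rows=3, cols=3)
units = list(codebook)
groups = agglomerate([codebook[k] for k in units], n_clusters=2)
unit_label = {units[i]: g for g, members in enumerate(groups) for i in members}
labels = [unit_label[bmu(codebook, x)] for x in data]  # one cluster label per sample
```

Each sample inherits the label of its best-matching unit, so `labels` is the per-municipality attribute that would be joined to the geographic layer for mapping.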