8 Jun 2017

09:10 to 10:00

Symposium

Cluster Trees, Near-Neighbor Graphs, and Continuum Percolation

Sanjoy Dasgupta

Geometry Understanding in Higher Dimensions

Abstract

What information does the clustering of a finite data set reveal about the underlying distribution from which the data were sampled? This basic question has proved elusive even for the most widely used clustering procedures. One natural criterion is to seek clusters that converge (as the data set grows) to regions of high density. When all possible density levels are considered, this is a hierarchical clustering problem where the sought limit is called the "cluster tree". We give a simple algorithm for estimating this tree that implicitly constructs a multiscale hierarchy of near-neighbor graphs on the data points. We show that the procedure is consistent, answering an open problem of Hartigan. We also obtain rates of convergence, using a percolation argument that gives insight into how near-neighbor graphs should be constructed.
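
The abstract does not spell out the procedure, so the following is only a minimal sketch of one plausible construction of this kind, not the speaker's stated method. At a given scale r it keeps the points whose k-th nearest neighbor lies within distance r (a rough proxy for high density), links kept points that are closer than alpha * r, and reports the connected components. The function names and the parameters `k` and `alpha` are illustrative assumptions.

```python
import math


def knn_radius(points, k):
    """Distance from each point to its k-th nearest neighbor (itself excluded)."""
    radii = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        radii.append(dists[k - 1])
    return radii


def clusters_at_scale(points, r, k=2, alpha=math.sqrt(2)):
    """Connected components of a near-neighbor graph at scale r.

    Keeps points whose k-NN radius is at most r, links kept points that are
    at most alpha * r apart, and returns the components as sorted lists of
    point indices. The values of k and alpha are assumptions for illustration.
    """
    radii = knn_radius(points, k)
    kept = [i for i, rk in enumerate(radii) if rk <= r]
    parent = {i: i for i in kept}

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a in kept:
        for b in kept:
            if a < b and math.dist(points[a], points[b]) <= alpha * r:
                parent[find(a)] = find(b)

    groups = {}
    for i in kept:
        groups.setdefault(find(i), []).append(i)
    return sorted(sorted(g) for g in groups.values())


# Two tight clumps plus one isolated point between them (hypothetical data).
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
    (2.5, 2.5),
]
for r in (0.05, 0.2, 4.0):
    print(r, clusters_at_scale(points, r))
```

Because both the set of kept points and the set of edges only grow as r increases, the components found at a smaller r are contained in those found at a larger r. Sweeping r from small to large therefore yields a nested hierarchy, which is the sense in which such a family of graphs can serve as an estimate of a tree.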

Speaker(s)

Sanjoy Dasgupta

UCSD

Events

Jean-Daniel Boissonnat

Welcome

Sanjoy Dasgupta

Cluster Trees, Near-Neighbor Graphs, and Continuum Percolation…

Bertrand Michel

A Statistical Approach to Topological Data Analysis

Topological Graphs for Data Analysis: Structure, Stability and Statist…

Quentin Mérigot

Computational Geometry, Optimal Transport and Applications

The Materials Genome in Action

Triangulating Manifolds

Algorithmic Aspects of Topological Data Analysis

Combinatorial Macbeath Regions for Semi-Algebraic Set Systems…

Valerio Pascucci

Interactive Visualization of High Dimensional Data: can we Deal with C…

See also

Jean-Daniel Boissonnat, chair Computer Sciences and Digital Technologies

Geometry Understanding in Higher Dimensions