Amphithéâtre Maurice Halbwachs, Site Marcelin Berthelot
Open to all
-

In computing, a data structure is a way of organizing data into an internal model that makes it easier to process. An important example is the nearest neighbor search. If P is a given finite set of points and x a query point, we want to determine the point in P closest to x. We can, of course, examine all the points of P one by one and select the one closest to x, but we can do much better. For example, if the points of P lie in the Euclidean plane, we can construct a Voronoi diagram of P. Finding x's nearest neighbor then boils down to locating x in the diagram, which we can do in logarithmic rather than linear time.

In dimensions greater than 2, the problem is much more difficult and, to be effective, nearest-neighbor search algorithms have to be content with providing approximate solutions. On the other hand, it may be sufficient to have the solution only with a good probability. In this case, probabilistic algorithms once again prove highly effective. The lecture presents different partitions of the ambient space and shows how to take advantage of the intrinsic dimensionality of the set P. We also present the LSH (local sensitive hashing) technique, which is currently the best technique for finding neighbors in high dimensions.