Amphithéâtre Guillaume Budé, Site Marcelin Berthelot
Open to all
-

Abstract

The seminar given by Philippe Flajolet, Director of Research at INRIA, was devoted to probabilistic algorithms on large masses of data. He illustrated another major algorithmic principle, the positive exploitation of randomness, based here on the use of pseudo-random hash functions. Such a function will associate information of a fixed size (e.g. 32 or 64 bits) with any input information (word, text, image, sound, etc.). Being pseudo-random, it will evenly distribute the input information among the hashed values, breaking their potential regularities. P. Flajolet has demonstrated the effectiveness of these algorithms in a large number of seemingly unrelated problems: the approximate counting of words in a book with very little memory and no dictionary, the classification of a large number of documents according to their similarity, the detection of intrusions and viruses in a network, etc. Although these algorithms are very simple, their analysis requires particularly sophisticated mathematics.

Speaker(s)

Philippe Flajolet