Amphithéâtre Marguerite de Navarre, Site Marcelin Berthelot
Open to all

Abstract

In this talk, I will review how the concepts of optimal transport can be applied to analyze different machine learning methods, in particular for sampling and for training neural networks. The focus will be on the use of optimal transport to study dynamic flows in the space of probability distributions. The first example will be flow matching, a sampling approach based on regressing an advection field. In its simplest case (diffusion models), this approach exhibits a gradient structure similar to that of optimal transport. I will then discuss Wasserstein gradient flows, in which the flow minimizes a functional in the geometry of optimal transport. This framework allows us to model and understand the training dynamics of the probability distribution of neurons in two-layer networks. Finally, the last example explores the modeling of the evolution of token probability distributions in deep transformer networks. This approach requires modifying the optimal transport structure to incorporate the softmax normalization specific to attention mechanisms.
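To make the flow-matching idea concrete, here is a minimal sketch (my own illustration, not material from the talk) of advection-field regression in one dimension: samples are linearly interpolated between a base Gaussian and a target Gaussian, a velocity field is fit by least squares on hypothetical features `[x, t, 1]`, and new samples are produced by integrating the resulting ODE.

```python
import numpy as np

# Illustrative flow-matching sketch: regress a velocity (advection) field
# v_theta(x, t) that transports a base Gaussian onto a target distribution.
# The feature map and linear model below are simplifying assumptions made
# for clarity; real flow matching uses a neural network for v_theta.

rng = np.random.default_rng(0)

def features(x, t):
    # hypothetical feature map: [x, t, 1] (affine velocity model)
    return np.stack([x, t, np.ones_like(x)], axis=1)

# Base: standard Gaussian; target: Gaussian with mean 3, std 0.5.
n = 4096
x0 = rng.standard_normal(n)
x1 = 3.0 + 0.5 * rng.standard_normal(n)
t = rng.uniform(0.0, 1.0, n)

# Linear interpolation path and its conditional velocity x1 - x0.
xt = (1.0 - t) * x0 + t * x1
v_target = x1 - x0

# Least-squares regression of the velocity on the features.
theta, *_ = np.linalg.lstsq(features(xt, t), v_target, rcond=None)

# Sample by integrating dx/dt = v_theta(x, t) with Euler steps.
steps = 100
x = rng.standard_normal(2000)
for k in range(steps):
    tk = np.full_like(x, k / steps)
    x = x + (1.0 / steps) * (features(x, tk) @ theta)

print(x.mean())  # samples concentrate near the target mean of 3
```

Even with this crude affine velocity model, the integrated flow pushes the base samples toward the target distribution, which is the essence of the advection-field-regression viewpoint discussed in the abstract.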

Speaker(s)

Gabriel Peyré

CNRS Research Director, École normale supérieure
