Salle 5, Site Marcelin Berthelot
Open to all

Abstract

In a Bayesian stochastic framework, the optimal estimate of a response y from data x is obtained by maximizing the conditional probability of y given x. However, estimating this conditional probability again suffers from the curse of dimensionality if the probability is only assumed to be locally regular. We therefore need to introduce much stronger regularity conditions.
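As a hedged restatement of this criterion in formulas (the notation p(y | x) for the conditional distribution is an assumption of mine, not taken from the lecture), the Bayesian estimator reads

$$
\hat{y}(x) \;=\; \arg\max_{y}\; p(y \mid x),
$$

and it is the estimation of p(y | x) in dimension d that requires regularity assumptions stronger than local regularity.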

Many learning algorithms linearize the estimation of y by performing a change of variable that transforms the d-dimensional vector x into a d'-dimensional vector Φ(x). The estimation of y is based on the scalar product ⟨w, Φ(x)⟩ + b, where the vector w and the bias b are optimized to minimize the empirical risk computed on the training data. The calculation of w as a function of the training data is obtained by inverting an affinity matrix that makes explicit the correlations between the training examples. For a quadratic risk, the representer theorem shows that the optimal w is a linear combination of the Φ(x'), where the x' are the training examples.
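The display below is a sketch of these quantities under standard assumptions (n training pairs (x_i, y_i), affinity matrix K with entries K_{ij} = ⟨Φ(x_i), Φ(x_j)⟩, bias b omitted for brevity); the notation is mine:

$$
w \;=\; \sum_{i=1}^{n} \alpha_i\, \Phi(x_i),
\qquad
\hat{y}(x) \;=\; \langle w, \Phi(x) \rangle \;=\; \sum_{i=1}^{n} \alpha_i\, \langle \Phi(x_i), \Phi(x) \rangle,
$$

where, for a quadratic risk, the coefficient vector α solves the linear system K α = y built from the training labels, which is the matrix inversion mentioned above.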

To control the generalization error, the empirical risk can be regularized by adding a Tikhonov penalty proportional to the squared norm of w. This regularization guarantees that the inversion of the affinity matrix is stable. More generally, we show that a stable estimate of y as a function of x necessarily has good generalization properties.
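A minimal numerical sketch of this regularized scheme, assuming a Gaussian affinity kernel and a quadratic risk; the kernel choice, parameter names, and synthetic data are illustrative assumptions, not taken from the lecture:

# Tikhonov-regularized linear estimation in feature space (kernel ridge form).
import numpy as np

def affinity_matrix(A, B, gamma=1.0):
    """Gaussian affinity K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def fit(X, y, lam=1e-2, gamma=1.0):
    """Solve (K + lam * I) alpha = y.

    The Tikhonov penalty lam * ||w||^2 adds lam * I to the affinity
    matrix, which keeps the inversion stable even when K is
    ill-conditioned.
    """
    K = affinity_matrix(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def predict(X_train, alpha, X_new, gamma=1.0):
    """Representer-theorem form: y_hat(x) = sum_i alpha_i * K(x_i, x)."""
    return affinity_matrix(X_new, X_train, gamma) @ alpha

# Usage on synthetic data.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(200)
alpha = fit(X, y, lam=1e-2, gamma=2.0)
y_hat = predict(X, alpha, X, gamma=2.0)
print("training RMSE:", np.sqrt(np.mean((y_hat - y) ** 2)))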