Abstract
Most supervised learning methods, including neural networks, are formalized as an optimization problem in which the mean of the errors on the observed data is minimized with respect to the parameters of the prediction model. Statistical learning thus gives rise to optimization problems with a specific structure, since the quantity being minimized is a mean, or more generally an expectation. This structure makes it natural and efficient to use so-called "stochastic gradient" methods, in which the model is updated very frequently, after only a few observations.
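To make this concrete, here is a minimal sketch (not part of the original abstract) of plain stochastic gradient descent on a least-squares problem, using NumPy; the synthetic data, step size, and function names are illustrative assumptions, not details of the talk.

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic convex problem: least-squares regression,
    # f(w) = mean over i of (x_i . w - y_i)^2 / 2.
    n, d = 1000, 10
    X = rng.standard_normal((n, d))
    w_true = rng.standard_normal(d)
    y = X @ w_true + 0.1 * rng.standard_normal(n)

    def sgd(X, y, n_epochs=20, step=0.01):
        """Plain stochastic gradient descent: one observation per update."""
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(n_epochs):
            for i in rng.permutation(n):
                grad_i = (X[i] @ w - y[i]) * X[i]  # gradient of a single-example loss
                w -= step * grad_i                 # update after a single observation
        return w

    w_hat = sgd(X, y)
    print(np.linalg.norm(w_hat - w_true))

Each update touches a single observation, so its cost is independent of the number of data points; with a constant step size the iterates only reach a neighborhood of the optimum, which is what variance reduction is meant to fix.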
This talk presents some recent advances in stochastic gradient optimization based on "variance reduction". For "convex" problems (corresponding to a neural network with no hidden layer), these advances make it possible, both in theory and in practice, to achieve an exponential rate of convergence (in the number of iterations) towards the global optimum. The presentation also introduces so-called "conditional gradient" methods, which enable incremental learning, in which neurons are added to the model one after the other.
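As a rough illustration of the variance-reduction idea (again not taken from the talk, and using SVRG as one well-known representative rather than the specific algorithms presented), here is a short sketch on the same kind of convex least-squares problem; all names and hyperparameters are assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    n, d = 1000, 10
    X = rng.standard_normal((n, d))
    w_true = rng.standard_normal(d)
    y = X @ w_true + 0.1 * rng.standard_normal(n)

    def grad_i(w, i):
        """Gradient of the i-th squared-error loss (x_i . w - y_i)^2 / 2."""
        return (X[i] @ w - y[i]) * X[i]

    def full_grad(w):
        """Full gradient, averaged over all n observations."""
        return X.T @ (X @ w - y) / n

    def svrg(n_outer=30, m=1000, step=0.01):
        """SVRG-style variance reduction: each stochastic step is corrected
        by a periodically recomputed full gradient, which shrinks the
        variance of the updates and allows a constant step size."""
        w_snapshot = np.zeros(d)
        for _ in range(n_outer):
            mu = full_grad(w_snapshot)          # full gradient at the snapshot
            w = w_snapshot.copy()
            for _ in range(m):
                i = rng.integers(n)
                # variance-reduced stochastic gradient
                g = grad_i(w, i) - grad_i(w_snapshot, i) + mu
                w -= step * g
            w_snapshot = w
        return w_snapshot

    w_hat = svrg()
    print(np.linalg.norm(w_hat - w_true))

Because the correction term vanishes at the optimum, the iterates can converge exponentially fast (in the number of iterations) on such convex problems, rather than stalling in a noise-dominated neighborhood as plain stochastic gradient does.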