Salle 5, Site Marcelin Berthelot
Open to all

Abstract

Kernel-based classification algorithms provide a relatively simple mathematical and algorithmic framework for developing learning algorithms. They separate two classes by adjusting a separating hyperplane, after performing a change of variable that associates a vector Φ(x) with each datum x. Support vector machines optimize the position of the hyperplane by minimizing the empirical risk regularized by a margin criterion. The margin measures the minimum distance between the points of each class and the hyperplane. This minimization can be rewritten as a convex optimization problem under linear constraints, which depends only on the scalar products k(x, x') = ⟨x, x'⟩ between the training data.
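
A minimal sketch of this point, assuming scikit-learn and synthetic data (not part of the course material): a linear SVM is trained, and its decision function is rebuilt from the dual solution, showing that the classifier depends only on scalar products with the support vectors.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Two synthetic classes in R^2 (illustrative data, not from the course)
X = np.vstack([rng.normal(-1.5, 1.0, (50, 2)), rng.normal(+1.5, 1.0, (50, 2))])
y = np.hstack([-np.ones(50), np.ones(50)])

clf = SVC(kernel="linear", C=1.0).fit(X, y)

# Decision function rebuilt from the dual variables:
# f(x) = sum_i alpha_i y_i <x_i, x> + b, summed over the support vectors x_i
G = X @ clf.support_vectors_.T                      # scalar products <x_i, x>
f_dual = G @ clf.dual_coef_.ravel() + clf.intercept_
assert np.allclose(f_dual, clf.decision_function(X))
```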

The same type of result can be obtained with a change of variable Φ(x) by replacing the kernel with k(x, x') = ⟨Φ(x), Φ(x')⟩. The optimization can be performed directly from the kernel values, by computing the dual variables of the Lagrangian associated with regularized risk minimization. Mercer's theorem proves that any positive definite kernel can be obtained from such a change of variable Φ(x). The main difficulty is to find a change of variables that reduces the generalization risk. We study the properties of polynomial kernels and Gaussian kernels.
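
The sketch below illustrates this second point under the same assumptions (scikit-learn, synthetic data): the optimization is run directly from precomputed kernel values k(x, x'), once with a polynomial kernel and once with a Gaussian kernel, on a problem that is not linearly separable in the original variables.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import polynomial_kernel, rbf_kernel

rng = np.random.default_rng(1)
# One class inside a ring of the other: not linearly separable in R^2
r = np.hstack([rng.uniform(0, 1, 60), rng.uniform(2, 3, 60)])
t = rng.uniform(0, 2 * np.pi, 120)
X = np.column_stack([r * np.cos(t), r * np.sin(t)])
y = np.hstack([-np.ones(60), np.ones(60)])

for name, K in [("polynomial (degree 3)", polynomial_kernel(X, X, degree=3)),
                ("Gaussian (gamma=0.5)", rbf_kernel(X, X, gamma=0.5))]:
    # The solver only sees the kernel values k(x, x'), never Phi(x) explicitly
    clf = SVC(kernel="precomputed", C=1.0).fit(K, y)
    print(name, "training accuracy:", clf.score(K, y))
```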