Amphithéâtre Marguerite de Navarre, Site Marcelin Berthelot
Open to all

Abstract

This lecture reviews the ideas behind neural networks, starting with the theory of cybernetics initiated by Wiener, the importance of hierarchical structures, and Rosenblatt's perceptron. Cybernetics provides a perspective on dynamical systems in which intelligence is defined as the ability to adapt over time, an adaptation that optimizes a trajectory towards a goal. In cybernetics, adaptation takes place through a feedback loop that adjusts control parameters to reduce a measure of the error relative to the goal. Unlike in an open-loop system, there is no need to model the environment: it suffices to react to the disturbances it introduces along the trajectory towards the goal. Gradient descent learning algorithms for neural networks follow this principle: they progressively adjust the weights of the network to reduce the prediction error.
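As a rough illustration of this feedback principle, the sketch below (my own example, not taken from the lecture) runs gradient descent on a least-squares prediction problem: at each step the prediction error is fed back to correct the weights, without any explicit model of how the data were generated.

```python
import numpy as np

# Minimal sketch: gradient descent viewed as a feedback loop.
# The error signal e = y_pred - y is fed back to adjust the weights w.

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                # inputs (disturbances on the trajectory)
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=200)  # targets (the goal)

w = np.zeros(3)                              # control parameters to adapt
lr = 0.1                                     # step size of the feedback correction
for step in range(100):
    y_pred = X @ w                           # current trajectory
    error = y_pred - y                       # deviation from the goal
    grad = X.T @ error / len(y)              # gradient of the mean squared error
    w -= lr * grad                           # feedback: adjust w to reduce the error

print("learned weights:", w)                 # close to w_true
```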

The article "The Architecture of Complexity" by H. Simon (1962) shows that hierarchical structure is another element that simplifies the analysis and control of dynamic systems. Such hierarchies are found in most systems studied in the natural sciences and the humanities, as well as in symbolic systems. They also appear in the architecture of deep convolutional neural networks.
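As a loose illustration of this last point, the sketch below (my own example, with arbitrary random filters rather than learned ones) stacks a few 1-D convolutional layers: each level combines local outputs of the level below, so features become increasingly high-level with depth.

```python
import numpy as np

# Minimal sketch: a hierarchy of convolutional layers, each one aggregating
# local outputs of the layer below, so the receptive field grows with depth.

rng = np.random.default_rng(0)

def conv_layer(x, n_filters, width=3):
    """One layer of 1-D convolutions (random filters) followed by a ReLU."""
    filters = rng.normal(size=(n_filters, x.shape[0], width))
    out = np.stack([
        np.sum([np.convolve(x[c], f[c], mode="valid") for c in range(x.shape[0])], axis=0)
        for f in filters
    ])
    return np.maximum(out, 0.0)              # ReLU non-linearity

signal = rng.normal(size=(1, 64))            # one input channel of length 64
h1 = conv_layer(signal, n_filters=4)         # low-level local features
h2 = conv_layer(h1, n_filters=8)             # features of features
h3 = conv_layer(h2, n_filters=16)            # high-level features
print(h1.shape, h2.shape, h3.shape)          # channels grow, spatial length shrinks
```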

Rosenblatt's perceptron, introduced in 1957, defines the first learning algorithm for a neural network. It has a single layer and a binary output that classifies data into two possible classes. Learning proceeds by a gradient descent that minimizes an average of the deviations of misclassified points from the decision boundary. We show that this gradient descent follows Hebb's rule, observed in biology, which states that the link between two neurons that are excited simultaneously is strengthened. We also show that Rosenblatt's algorithm converges to a solution that depends on the initial conditions if the training data are linearly separable, and does not converge if they are not.
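The sketch below is a toy reconstruction of the classical perceptron rule (not the lecture's exact formulation) on a separable two-class example; the update w += lr * y * x applied to misclassified points is the Hebbian correction, and it is also a stochastic gradient step on the loss max(0, -y * <w, x>).

```python
import numpy as np

# Minimal sketch of the perceptron rule on linearly separable data.

rng = np.random.default_rng(0)

# Two Gaussian clouds centred at (+3,+3) and (-3,-3), labels y in {-1, +1}.
X = np.vstack([rng.normal(loc=+3.0, size=(50, 2)),
               rng.normal(loc=-3.0, size=(50, 2))])
y = np.hstack([np.ones(50), -np.ones(50)])
X = np.hstack([X, np.ones((100, 1))])        # constant column for the bias term

w = np.zeros(3)                              # initial condition (affects the limit)
lr = 1.0
for epoch in range(20):
    errors = 0
    for xi, yi in zip(X, y):
        if yi * (w @ xi) <= 0:               # point on the wrong side of the boundary
            w += lr * yi * xi                # Hebbian correction
            errors += 1
    if errors == 0:                          # converged: the data are separated
        break

print("weights:", w, "epochs:", epoch + 1)
```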

To avoid these convergence problems, the cost function optimized by the perceptron must be regularized. Vapnik's support vector machines introduce a margin criterion that selects the boundary that best separates the points of the two classes; this guarantees the uniqueness of the solution and eliminates the non-convergence observed with non-separable data.
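As an illustration of this regularization, the sketch below trains a linear soft-margin SVM by sub-gradient descent on the regularized hinge loss, one common formulation of Vapnik's margin criterion; the data, step size and regularization constant lam are arbitrary choices of mine. The quadratic term makes the objective strictly convex, so the minimizer is unique and training remains well-behaved even when the two classes overlap.

```python
import numpy as np

# Minimal sketch: linear soft-margin SVM by sub-gradient descent on
#   (lam/2) * ||w||^2  +  mean_i max(0, 1 - y_i * <w, x_i>).

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=+1.0, size=(50, 2)),
               rng.normal(loc=-1.0, size=(50, 2))])   # overlapping clouds
y = np.hstack([np.ones(50), -np.ones(50)])

lam, lr = 0.1, 0.1
w = np.zeros(2)
for step in range(500):
    margins = y * (X @ w)
    active = margins < 1.0                   # points inside the margin or misclassified
    grad = lam * w - (y[active].reshape(-1, 1) * X[active]).sum(axis=0) / len(y)
    w -= lr * grad                           # sub-gradient step

print("weights:", w)
```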