Amphithéâtre Marguerite de Navarre, Site Marcelin Berthelot
Open to all

Abstract

As in many other fields, deep neural networks have enabled major advances in the processing of musical audio signals. This seminar presents the specificities of these signals and the adaptations required of deep neural networks for their modeling.

In the first part, we recall some elements of audio signal processing (the Fourier transform, the constant-Q transform (CQT), the harmonic sinusoidal model, the source-filter model). In the traditional machine-learning approach, these elements are used to build "hand-crafted features" that are fed as input to classification algorithms.
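As a minimal sketch of this traditional pipeline, the example below (numpy only; the frame size, hop, and the choice of spectral centroid as the feature are illustrative assumptions, not part of the seminar) computes a short-time Fourier transform and one classic hand-crafted feature from it:

```python
import numpy as np

def stft(signal, frame_size=1024, hop=512):
    """Short-time Fourier transform: Hann-windowed frames -> magnitude spectra."""
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_size] * window
                       for i in range(n_frames)])
    # shape: (n_frames, frame_size // 2 + 1)
    return np.abs(np.fft.rfft(frames, axis=1))

def spectral_centroid(mag, sr):
    """A classic hand-crafted feature: the spectral 'center of mass' per frame."""
    freqs = np.fft.rfftfreq(2 * (mag.shape[1] - 1), d=1.0 / sr)
    return (mag * freqs).sum(axis=1) / (mag.sum(axis=1) + 1e-12)

sr = 16000
t = np.arange(sr) / sr
sig = np.sin(2 * np.pi * 440 * t)   # one second of a 440 Hz sine
mag = stft(sig)
centroid = spectral_centroid(mag, sr)
# for a pure tone, the centroid stays close to the tone's frequency
```

Features such as this (or MFCCs, chroma, etc.) would then be stacked into a vector and passed to a conventional classifier.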

In the second part, we show how deep neural networks (in particular convolutional neural networks) can be used to perform "feature learning". We first recall the fundamental differences between 2D images and time-frequency representations. We then discuss the choice of input (spectrogram, CQT, or raw waveform), the choice of convolution filter shape, autoregressive neural models, and the different ways of injecting a priori knowledge (harmonicity, source-filter structure) into these networks.
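One way to see why filter shape matters: in images a small square filter (e.g. 3x3) can be translated in both directions, but on a spectrogram the frequency axis carries absolute meaning (pitch, timbre), so one common design spans the entire frequency axis and slides only along time. The numpy sketch below illustrates that design choice; the dimensions and random weights are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
spec = rng.standard_normal((513, 100))   # hypothetical (freq bins, time frames) input
kernel = rng.standard_normal((513, 5))   # full-frequency-height, 5-frame-wide filter

def conv_time(spec, kernel):
    """Valid correlation along the time axis only (stride 1).

    Because the filter covers all frequency bins, it is NOT translated in
    frequency -- unlike a square image filter, which slides in both axes.
    """
    f, t = spec.shape
    kf, kt = kernel.shape
    assert kf == f, "filter must span the whole frequency axis"
    return np.array([np.sum(spec[:, i:i + kt] * kernel)
                     for i in range(t - kt + 1)])   # one activation per time step

act = conv_time(spec, kernel)
```

A real network would learn many such filters (and intermediate shapes, e.g. tall-and-narrow or harmonic-stacked kernels, are also used); this only shows the geometry of the operation.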

In the third part, we present the different learning paradigms used in the music audio domain: classification, encoder-decoder (source separation, constraints on the latent space), metric learning (triplet loss), and semi-supervised learning.
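Of these paradigms, the triplet loss is compact enough to write out directly. A minimal numpy sketch (the margin value and toy 2-D embeddings are illustrative assumptions): the loss is zero when the anchor is already closer to the positive than to the negative by at least the margin, and positive otherwise:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on Euclidean distances:
    max(0, d(a, p) - d(a, n) + margin)."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

# toy embeddings: anchor and positive are similar, negative is far away
a = np.array([1.0, 0.0])
p = np.array([0.9, 0.1])
n = np.array([-1.0, 0.0])

satisfied = triplet_loss(a, p, n)   # triplet already well ordered -> loss is 0
violated = triplet_loss(a, n, p)    # roles swapped -> positive loss
```

Minimizing this loss over many triplets pulls same-class (or same-song, same-cover) embeddings together and pushes others apart, which is the basis of metric learning for music similarity.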

Speaker(s)

Geoffroy Peeters

Professor at Télécom ParisTech