We were interested in the mechanisms by which attention modulates auditory perception, and particularly in the involvement of cerebral electrical oscillations in this process. The lecture began with a presentation of these oscillations and their now well-established role in certain aspects of sensory perception and cognition. Work from Charles Schroeder's laboratory highlighting the role of these oscillations in the effects of attention on the response of the primary auditory cortex in the macaque was then discussed. In a study published in 2013 (Lakatos P., Musacchia G., O'Connell M.N., Falchier A.Y., Javitt D.C. and Schroeder C.E., Neuron, 2013), neuronal activity in layer 3 (L3) of the primary auditory cortex was analyzed using multi-electrode recordings while the animal's attention was, or was not, focused on sound sequences. The macaques had previously been trained to sustain their attention on these sequences in order to detect a deviant frequency. Each sequence consisted of a pure tone lasting 25 milliseconds, with a frequency of either 5.7 kHz or 16 kHz. This "sound burst" was repeated at regular intervals, at a rate of 1.6 Hz for the 5.7 kHz bursts and 1.8 Hz for the 16 kHz bursts. Delta-band electrical oscillations were observed, with the same rhythm as the sound repetitions. In the cortical region whose characteristic frequency is that of the sound burst (5.7 kHz or 16 kHz), these oscillations passed through their phase of maximum depolarization during the sound burst, thus promoting the generation of action potentials synchronized with the sound. In a cortical region whose characteristic frequency is not that of the sound burst, waves with the same rhythm as the sound stimulation were also detected, but the sound stimulation coincided with the phase corresponding to the hyperpolarization of neurons, thus inhibiting their discharge.
The electrical waves observed are therefore present throughout the auditory cortex, but they are in phase with the sound repetitions in the cortical region tuned in frequency to the sound burst, while they are in phase opposition in the untuned regions. Because these electrical oscillations continue for a few seconds after the sound has stopped, the origin of these rhythmic fluctuations, timed to the sound repetitions, has been attributed to a mechanism whereby neural networks are entrained by the sound over several cycles. All in all, these results suggest that attention to an expected rhythmic sound sequence entrains neuronal networks throughout the auditory cortex in such a way that their oscillatory activity becomes coherent and adjusted to the rhythm of the sound. These results have been extended to theta waves synchronized by sound repetitions at corresponding rates. Like delta oscillations, these are driven by a rhythmic auditory stimulus to which the monkeys pay attention. These findings were reinforced by experiments in which two sound streams were presented simultaneously and the monkey alternated its attention between them. The recordings revealed an attention-induced increase in neuronal discharges and current density in the cortical region tuned in frequency to the sound burst. More importantly, they demonstrate that the alignment of the depolarization maximum of the delta wave with the sound stimulation is an effect of attention. Attention is responsible for synchronizing the activity of neural networks in several cortical areas, in addition to the one tuned in frequency to the sound stimulus. When two sound stimuli are present simultaneously, only the one on which attention is focused drives delta oscillatory activity across a range of cortical regions.
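The phase relationship between the entrained delta oscillation and the sound bursts (maximum depolarization during the burst in the tuned region, hyperpolarization in an untuned region) can be illustrated with a minimal sketch. It is a toy model, not the analysis used in the study: the cosine excitability function, the exact phase offsets of 0 and π, and all names are assumptions made for illustration; only the 1.6 Hz repetition rate comes from the text.

```python
import math

def excitability(t, rate_hz, phase_offset):
    """Toy excitability: +1 is maximum depolarization, -1 maximum hyperpolarization."""
    return math.cos(2 * math.pi * rate_hz * t + phase_offset)

RATE_HZ = 1.6  # repetition rate of the 5.7 kHz bursts reported in the study
onsets = [k / RATE_HZ for k in range(5)]  # burst onset times in seconds

# Assumption: the tuned region is exactly in phase with the bursts,
# the untuned region exactly in phase opposition.
tuned = [excitability(t, RATE_HZ, 0.0) for t in onsets]
untuned = [excitability(t, RATE_HZ, math.pi) for t in onsets]

for t, a, b in zip(onsets, tuned, untuned):
    print(f"burst at {t:.3f} s: tuned {a:+.2f}, untuned {b:+.2f}")
```

In this sketch every burst falls on the excitability peak of the tuned region and on the trough of the untuned region, which is the configuration that favors discharges in the former and suppresses them in the latter.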
These findings lead to a prediction that has been validated experimentally: the amplitude of neuronal discharges in response to an ignored sound stimulus in the region tuned in frequency to that stimulus depends on its temporal relationship with the stimulus to which attention is being paid. Attention thus acts as a temporo-spectral filter for neural activity in the primary auditory cortex. These findings reinforce the idea that fluctuations in the excitability of distributed sets of neurons form the context in which specific sensory content is processed (Buzsaki G. and Chrobak J.J., Curr. Opin. Neurobiol., 1995). Attention modulates oscillatory activity in the supragranular L3 layer by top-down mechanisms that remain to be determined.
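The prediction about ignored stimuli can be sketched numerically under strong simplifying assumptions. In the toy model below, the region tuned to the ignored stream carries a delta oscillation entrained by the attended stream, in phase opposition at the attended onsets, and the response to an ignored burst is scaled by a gain that follows this oscillation. The gain function, the modulation depth, and the choice of which stream is attended are hypothetical; only the two repetition rates come from the text.

```python
import math

ATTENDED_RATE_HZ = 1.6  # assumed attended stream
IGNORED_RATE_HZ = 1.8   # assumed ignored stream
DEPTH = 0.5             # hypothetical modulation depth of the response gain

def gain_in_ignored_region(t):
    """Response gain at time t in the region tuned to the ignored stream.

    Assumption: excitability there is entrained by the attended stream and
    sits at its trough whenever an attended burst occurs.
    """
    excit = -math.cos(2 * math.pi * ATTENDED_RATE_HZ * t)
    return 1.0 + DEPTH * excit

# Onsets of the ignored bursts over one full beat cycle of the two rates.
ignored_onsets = [k / IGNORED_RATE_HZ for k in range(9)]
gains = [gain_in_ignored_region(t) for t in ignored_onsets]

for t, g in zip(ignored_onsets, gains):
    print(f"ignored burst at {t:.3f} s: gain {g:.2f}")
```

Because the two streams drift relative to one another, the ignored bursts sample every phase of the attended rhythm: the gain is lowest when an ignored burst coincides with an attended one and highest when it falls half a cycle away.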
The situation of sound competition naturally refers to the cocktail party effect, which led us to take an interest in another work, published in the journal Neuron in 2013 (Zion Golumbic E.M., Schroeder C., Neuron, 2013). Low-frequency oscillations are of particular interest because their period falls within the time scale of speech envelope fluctuations. The aim of the study was to examine how attention influences the neural representation of attended or ignored speech in a cocktail party situation. The question tested was whether attention entrains low-frequency neural oscillations whose phase is locked to that of the attended speech stream, thus forming an amplified internal representation of that stream. This hypothesis of selective entrainment is attractive for two reasons. On the one hand, the flow of natural speech is quasi-rhythmic at both the prosodic and syllabic levels; these rhythms produce temporal regularities that make cerebral entrainment possible. On the other hand, if entrainment aligns the high-excitability phases of low-frequency oscillations with the instants when salient events occur in the attended speech stream, the amplitude of the neuronal discharges that coincide with these events will increase. Cortical activity was analyzed by electrocorticography in six patients with severe epilepsy, in the pre-operative period. In humans, electrocorticography has a very good signal-to-noise ratio and very good spatial resolution (< 5 mm²). For each electrode, the frequency band of the neural signal that best represents the temporal structure of speech in its phase and/or amplitude fluctuations was determined. The results show that the percentage of brain sites for which there is phase coherence between brain oscillations and the speech stream to which the patient is paying attention is high for low-frequency oscillations (delta and theta), and lower for alpha oscillations.
The percentage of sites for which there is amplitude coherence is always lower, and as a general rule, this coherence is associated with higher-frequency oscillations. All in all, the phase of low-frequency oscillations and the amplitude of high-frequency oscillations follow the envelope of the attended speech. Phase coherence is not limited to the primary auditory cortex, but extends to the high-level integration regions involved, in particular, in language processing, multisensory processing and attention control. High-frequency amplitude coherence, on the other hand, is almost exclusively present in the primary auditory cortex. The attention paid to a speaker in a cocktail party situation leads to a cortical response whose characteristics and distribution over the cortex are very similar to those observed when a single speaker speaks. Conversely, phase and amplitude coherence with a given speaker's speech diminish when attention is switched to the other speaker. Some so-called "non-selective" brain sites show significant tracking of both attended and ignored speakers, although their responses are biased toward the attended speaker. Others have a response that appears selective for the attended speech: they are located almost exclusively outside the primary auditory cortex. Tracking of attended speech in these "selective" sites increases as the sentence progresses, indicating that these regions are able to use the spectro-temporal regularities of the attended speech to progressively refine their representation of it. This adaptation effect does not appear to exist at hierarchically lower levels of the auditory system. These results provide an empirical basis for the idea that selective attention in the cocktail party model relies on a top-down control of neuronal excitability over time (Lakatos P., Schroeder C., Neuron, 2009).
The product of this interaction is the formation of a dynamic neural representation of the temporal structure of the stream of speech being listened to, with associated amplifying and temporal filter effects. These results are in line with the idea of active sensing, whereby the brain dynamically models its internal representation of the stimulus, and particularly of natural, continuous stimuli, in response to environmental and contextual demands.
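The notion of phase coherence between a neural oscillation and the speech envelope can be made concrete with a phase-locking value, one common way of quantifying such coherence. The sketch below is not the measure used in the study, and it works on synthetic phases rather than recorded signals; the fixed lag of 0.5 rad, the jitter of 0.3 rad, the sample count and all names are assumptions chosen for illustration.

```python
import cmath
import math
import random

def phase_locking_value(phases_a, phases_b):
    """Length of the mean unit vector of the phase differences (0 = none, 1 = perfect)."""
    n = len(phases_a)
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b))
    return abs(total) / n

rng = random.Random(0)  # fixed seed so the sketch is reproducible
N = 2000

# Synthetic phase of the attended speech envelope at N time points.
envelope_phase = [rng.uniform(-math.pi, math.pi) for _ in range(N)]

# Hypothetical site tracking the attended speech: constant lag plus small jitter.
tracking_site = [p + 0.5 + rng.gauss(0.0, 0.3) for p in envelope_phase]

# Hypothetical site whose phase is unrelated to the envelope.
unrelated_site = [rng.uniform(-math.pi, math.pi) for _ in range(N)]

plv_tracking = phase_locking_value(tracking_site, envelope_phase)
plv_unrelated = phase_locking_value(unrelated_site, envelope_phase)
print(f"tracking site: {plv_tracking:.2f}, unrelated site: {plv_unrelated:.2f}")
```

A site whose phase follows the envelope with a stable lag yields a value close to 1, whereas a site with unrelated phase yields a value close to 0; with real recordings the phases would first have to be extracted from band-limited signals.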
The cellular basis of brain oscillations, which are the product of interactions between glutamatergic pyramidal neurons and GABAergic inhibitory interneurons, was then discussed. As György Buzsaki and James Chrobak demonstrated in 1995 (Buzsaki G. and Chrobak J.J., Curr. Opin. Neurobiol., 1995), inhibitory interneurons play an essential role in the structuring and spatio-temporal segregation of these oscillations. For example, the frequency of the oscillations depends on the duration of the inhibition exerted by the interneurons.
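The dependence of oscillation frequency on the duration of inhibition can be illustrated with a deliberately crude sketch. It assumes that one cycle lasts as long as the inhibition plus a fixed recovery delay before the pyramidal cells fire again; this rule, the function name and the numerical values are hypothetical and are not taken from the lecture.

```python
def oscillation_frequency_hz(inhibition_ms, recovery_ms=5.0):
    """Toy estimate: one cycle = inhibition duration + fixed recovery delay (both in ms)."""
    if inhibition_ms <= 0 or recovery_ms < 0:
        raise ValueError("durations must be positive")
    return 1000.0 / (inhibition_ms + recovery_ms)

# Illustrative values only: shorter inhibition gives a faster rhythm.
for inhibition_ms in (20.0, 95.0, 245.0):
    freq = oscillation_frequency_hz(inhibition_ms)
    print(f"inhibition {inhibition_ms:.0f} ms -> about {freq:.0f} Hz")
```

Under this assumption, lengthening the inhibition slows the rhythm, which is the qualitative relationship stated above; real networks involve many further factors that the sketch ignores.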