Value learning and perceptual learning have been an important focus over the past decade, attracting the concerted attention of experimental psychologists, neurobiologists and the machine learning community. Despite some formal connections (e.g., the role of prediction error in optimising some function of sensory states), both fields have developed their own rhetoric and postulates. In this work, we show that perceptual learning is, literally, an integral part of value learning, in the sense that perception is necessary to integrate out dependencies on the inferred causes of sensory information. This enables the value of sensory trajectories to be optimised through action. Furthermore, we show that acting to optimise value and perception are two aspects of exactly the same principle; namely, the minimisation of a quantity (free energy) that bounds the probability of sensory input, given a particular agent or phenotype. This principle can be derived, in a straightforward way, from the very existence of agents, by considering the probabilistic behaviour of an ensemble of agents belonging to the same class.
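As a schematic illustration of this bound (with notation that is ours rather than fixed by the text: $s$ for sensory input, $\vartheta$ for its hidden causes, $\mu$ for the internal states encoding a recognition density $q$, and $m$ for the agent or phenotype), the free energy in question can be written as
\begin{align}
F(s,\mu) &= \mathbb{E}_{q(\vartheta;\mu)}\big[-\ln p(s,\vartheta \mid m)\big] + \mathbb{E}_{q(\vartheta;\mu)}\big[\ln q(\vartheta;\mu)\big]\\
         &= -\ln p(s \mid m) + D_{\mathrm{KL}}\big[\,q(\vartheta;\mu)\,\|\,p(\vartheta \mid s,m)\,\big] \;\ge\; -\ln p(s \mid m).
\end{align}
Because the Kullback–Leibler divergence is non-negative, minimising $F$ with respect to the internal states $\mu$ (perception) tightens the bound, while minimising it through action reduces the surprise $-\ln p(s \mid m)$ itself; averaging over the hidden causes $\vartheta$ is what leaves a bound that depends only on the sensory input sampled by the agent.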
This treatment unifies value and perceptual learning and suggests that value is simply the probability of sensory input expected by an agent. This means that acting to maximise value is the same as acting to minimise surprise; in other words, sampling the environment so that it conforms to our expectations. In this way, exchanges or interactions with the environment are maintained within bounds that preserve the integrity of the agent. Clearly, the surprise of a sensory exchange depends on some representation or perceptual model of that exchange. We show that this model emerges naturally as the internal states of the agent optimise the free energy bound above.
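One way to make this identification explicit, again using the illustrative notation above, is to read value as the log probability of sensory input under the agent's model, so that maximising value and minimising surprise are the same operation and perception and action optimise the same bound with respect to different variables:
\begin{align}
V(s) &= \ln p(s \mid m) = -\,\mathrm{surprise},\\
\mu^{*} &= \arg\min_{\mu} F(s,\mu) \qquad \text{(perception: optimising the perceptual model)},\\
a^{*} &= \arg\min_{a} F\big(s(a),\mu\big) \qquad \text{(action: sampling expected, i.e.\ valuable, sensory input)}.
\end{align}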