Abstract
We introduce instantaneous codes defined over finite alphabets of symbols; these prefix codes can be represented by binary trees. Using Kraft's lemma, we prove Shannon's theorem, which shows that the minimum average length of such codes is bounded below by the entropy. The Shannon code is an instantaneous code whose average length approaches the entropy arbitrarily closely when it is computed over blocks of increasing length. The optimal prefix code is obtained with Huffman's algorithm.
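As a compact reminder of these statements, with notation assumed here rather than fixed by the abstract (codeword lengths $l_k$, source probabilities $p_k$, entropy $H(X)$, optimal average length $L$): a binary prefix code with lengths $l_1,\dots,l_N$ exists if and only if Kraft's inequality holds, and Shannon's theorem bounds the optimal average length by the entropy,
\[
\sum_{k=1}^{N} 2^{-l_k} \le 1,
\qquad
H(X) = -\sum_{k=1}^{N} p_k \log_2 p_k \;\le\; L \;<\; H(X) + 1 .
\]
Coding blocks of $n$ independent symbols with a Shannon code reduces the overhead to $1/n$ bits per symbol, $H(X) \le L_n / n < H(X) + 1/n$, which is why the entropy is approached arbitrarily closely.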
The notion of entropy is extended to real-valued random variables through the notion of differential entropy, which is not always positive. We prove an asymptotic equipartition result showing that the joint density of a large number of independent random variables is nearly constant over typical sets, whose volume depends on the entropy. The entropy thus defines the volume of the domain in which a random variable is concentrated.
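In the same assumed notation (density $p$, $n$ independent copies of $X$, differential entropy $h$), the differential entropy and the typical-set volume it controls can be written as
\[
h(X) = -\int p(x)\,\log_2 p(x)\,dx,
\qquad
\mathrm{Vol}\big(A_\epsilon^{(n)}\big) \approx 2^{\,n\,h(X)} ,
\]
where $A_\epsilon^{(n)}$ denotes the typical set, over which the joint density is close to $2^{-n\,h(X)}$.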
Shannon entropy is linked to Fisher information through maximum entropy probability models. A probability model is defined from observables, which are moments given by the expectations of functions of the data. Boltzmann's theorem shows that the maximum entropy distribution is an exponential distribution, whose parameters can also be computed by maximum likelihood.
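With moment constraints $\mathbb{E}[\phi_k(X)] = \mu_k$ on observables $\phi_k$ (notation assumed here, not fixed by the abstract), the maximum entropy distribution of Boltzmann's theorem takes the exponential form
\[
p_\theta(x) = \frac{1}{Z(\theta)}\,\exp\!\Big(\sum_{k} \theta_k\,\phi_k(x)\Big),
\qquad
Z(\theta) = \int \exp\!\Big(\sum_{k} \theta_k\,\phi_k(x)\Big)\,dx ,
\]
where the Lagrange parameters $\theta_k$ are adjusted so that the prescribed moments are matched, which coincides with their maximum likelihood estimation.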