Abstract
Under regularity assumptions, the maximum likelihood estimator is shown to be consistent. The Fisher information is defined as the variance of the score, the gradient of the log-likelihood; it is also shown to equal the expected Hessian of the negative log-likelihood. Fisher information is additive for independent random variables.
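As a worked restatement of these definitions (the symbols $p(x;\theta)$, $s$, and $I(\theta)$ are our notation, not fixed by the abstract, and we assume enough regularity to exchange differentiation and integration):
\[
s(\theta; X) = \nabla_\theta \log p(X;\theta), \qquad
I(\theta) = \operatorname{Var}_\theta\!\big[s(\theta;X)\big]
          = \mathbb{E}_\theta\!\big[\nabla_\theta^2 \big(-\log p(X;\theta)\big)\big],
\]
and for independent $X_1,\dots,X_n$ the joint Fisher information satisfies
\[
I_{1:n}(\theta) = \sum_{i=1}^{n} I_i(\theta),
\]
so $n$ i.i.d. observations carry information $n\,I(\theta)$.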
The main result is the Cramér-Rao bound, which gives a lower bound on the covariance of any unbiased estimator of the parameters of a probability distribution in terms of the inverse of the Fisher information. Finding a good parameterization of a probability distribution therefore comes down to making this inverse small. More precisely, we show that, under regularity assumptions, the maximum likelihood estimator is asymptotically Gaussian, with asymptotic covariance equal to the inverse of the Fisher information.
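Concretely, and under the same regularity assumptions (again in our notation, with $\hat\theta$ any unbiased estimator and $\hat\theta_n$ the maximum likelihood estimator from $n$ i.i.d. samples), the two results read
\[
\operatorname{Cov}_\theta(\hat\theta) \succeq I(\theta)^{-1},
\qquad
\sqrt{n}\,\big(\hat\theta_n - \theta\big) \xrightarrow{\;d\;} \mathcal{N}\!\big(0,\, I(\theta)^{-1}\big),
\]
where $\succeq$ denotes the positive semidefinite order, so the maximum likelihood estimator asymptotically attains the Cramér-Rao bound.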