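Calculus of variations can be used to show why the Gaussian is interesting. It is the limiting distribution of several families (the sum of many IID variables with finite variance is approximately Gaussian), but it is also the distribution that conveys our ignorance of the data. Given a distribution $p(x)$ with a fixed mean and variance, the Gaussian is the one with the largest differential entropy $H[p] = -\int p(x) \ln p(x)\,dx$.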
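The way to show it involves calculus of variations and Lagrange multipliers. We encode the 3 conditions on the distribution function $p(x)$ (it integrates to 1, and its mean $\mu$ and variance $\sigma^2$ are given), and combine them with the entropy in the Lagrangian:

$$L[p] = -\int p(x) \ln p(x)\,dx + \lambda_0 \left( \int p(x)\,dx - 1 \right) + \lambda_1 \left( \int x\,p(x)\,dx - \mu \right) + \lambda_2 \left( \int (x-\mu)^2 p(x)\,dx - \sigma^2 \right)$$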
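Now differentiating with respect to $p(x)$ and setting the result to zero gives $-\ln p(x) - 1 + \lambda_0 + \lambda_1 x + \lambda_2 (x-\mu)^2 = 0$, that is, $p(x) = \exp\left(\lambda_0 - 1 + \lambda_1 x + \lambda_2 (x-\mu)^2\right)$, where $\lambda_0$, $\lambda_1$ and $\lambda_2$ are the Lagrange multipliers for the normalization, mean and variance constraints.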
Completing the square and solving for the Lagrange multipliers using the constraints for mean and variance, we arrive at the Gaussian distribution. This holds similarly for the multivariate case.
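For a quick sanity check of the claim, here is a minimal numerical sketch (not part of the derivation). It estimates the differential entropy of three densities that share the same mean and variance: a Gaussian, a Laplace and a uniform. The comparison set, the helper name `entropy`, the integration range and the grid size are all my own assumptions, picked for illustration only.

```python
import math

SIGMA = 1.0  # common standard deviation for all three densities (assumed value)

def entropy(pdf, lo=-20.0, hi=20.0, n=200_000):
    """Midpoint-rule estimate of -integral of p(x) ln p(x) over [lo, hi], in nats."""
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        p = pdf(lo + (i + 0.5) * dx)
        if p > 0.0:  # p ln p tends to 0 as p tends to 0, so skip zero density
            total -= p * math.log(p) * dx
    return total

def gaussian(x):
    return math.exp(-0.5 * (x / SIGMA) ** 2) / (SIGMA * math.sqrt(2 * math.pi))

def laplace(x):
    b = SIGMA / math.sqrt(2)  # Laplace variance is 2 b^2
    return math.exp(-abs(x) / b) / (2 * b)

def uniform(x):
    w = SIGMA * math.sqrt(12)  # uniform variance is w^2 / 12
    return 1.0 / w if abs(x) <= w / 2 else 0.0

entropies = {
    "gaussian": entropy(gaussian),
    "laplace": entropy(laplace),
    "uniform": entropy(uniform),
}

for name, h in entropies.items():
    print(f"{name:8s} entropy = {h:.4f} nats")
```

With unit variance the Gaussian should come out on top, matching the closed form $\frac{1}{2}\ln(2\pi e \sigma^2)$. Three densities are of course not a proof, only an illustration of what the variational argument guarantees for every density with that mean and variance.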
So having a Gaussian as a prior distribution for observed data is equivalent to saying that we know nothing about the data except its mean and variance. Once again - cool :)
2 comments:
That is assuming the data variation occurs in a continuous spectrum. As experimentalists, we know that is not always the case ;)
Gaussian- The Darwinian theory of selection springs to mind. Although it is tempting to believe the monk who loved his peas and apply natural selection to discrete genetic entities instead.
I will leave the actual maths to the experts..
For some reason I'm not notified of comments - and am surprised to see them here :)
Gaussian is nice if you really know nothing of the underlying processes generating the data - assuming the Gaussian form is then the weakest assumption you can make. If you actually know how your experiments are screwed up (of course yours are not - unlike mine, as you well know :), other choices can be better.