We covered a paper by Schadt and others last week that dealt with
selecting the best model for the data. We want to select the simplest model that explains the most data, and there is a tradeoff between model fit and complexity.
In practice, people use various information criteria to formalize the tradeoff, usually of the form

$IC = -\ln p(D \mid M) + f(M)$

where $IC$ is the information criterion, $p(D \mid M)$ is the probability of the data given the model, and $f(M)$ is a penalty for model complexity.
Matti asked me last week what the rationale behind choosing $f(M)$ is. I'll write about the two widely used forms (the Akaike Information Criterion (AIC) this time, and the Bayesian Information Criterion (BIC) next), but I don't really know the answer about the relative weighting of $-\ln p(D \mid M)$ and $f(M)$ either :)
Here's the intuition for AIC (the idea is presented in both MacKay's and Bishop's books): we have a set of models $M_i$, and want to select the one with the highest probability $p(M_i \mid D)$ after seeing the data $D$. This is given by

$p(M_i \mid D) = \frac{p(D \mid M_i)\, p(M_i)}{p(D)}$

If we take the prior over models to be uniform, we just need to evaluate the evidence for each model.
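In code this step is just renormalizing the evidences (a small sketch with made-up numbers):

```python
import numpy as np

# With a uniform prior p(M_i) = 1/3 over three candidate models, the posterior
# p(M_i | D) is the evidence p(D | M_i) renormalized (made-up evidence values).
log_evidence = np.array([-105.2, -101.7, -103.4])               # hypothetical ln p(D | M_i)
log_posterior = log_evidence - np.logaddexp.reduce(log_evidence)
print(np.exp(log_posterior))                  # sums to 1; ranking models = ranking by evidence
```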
Pick one model $M$, and say it has $K$ tunable parameters. Select one of them, $w$, and let's assume its prior distribution is flat with width $\delta_{prior}$, i.e.

$p(w \mid M) = \frac{1}{\delta_{prior}}$

We have

$p(D \mid M) = \int p(D \mid M, w)\, p(w \mid M)\, dw$
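As a sanity check, the integral can be evaluated numerically for a toy case (my own example, not from the post: a single-parameter Gaussian likelihood and a flat prior of width $\delta_{prior}$):

```python
import numpy as np

# Toy evidence integral: data ~ N(w, 1), flat prior on w over [-5, 5],
# so p(w | M) = 1 / delta_prior on that interval.
def likelihood(w, data, noise_sd=1.0):
    return np.prod(np.exp(-0.5 * ((data - w) / noise_sd) ** 2)
                   / (noise_sd * np.sqrt(2 * np.pi)))           # p(D | M, w)

rng = np.random.default_rng(1)
data = rng.normal(loc=0.3, scale=1.0, size=20)

delta_prior = 10.0
w_grid = np.linspace(-5, 5, 2001)
like = np.array([likelihood(w, data) for w in w_grid])
evidence = np.trapz(like / delta_prior, w_grid)                 # ∫ p(D|M,w) p(w|M) dw
print(np.log(evidence))                                         # ln p(D | M)
```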
Suppose $p(D \mid M, w)$ is sharply peaked around $w_{MAP}$, and the width of the peak is $\delta_{MAP}$. Then the integral picks up essentially all of its mass from that region, and the likelihood contributes approximately $p(D \mid M, w_{MAP})\, \delta_{MAP}$, since the integrand drops to 0 outside the peak. Combining that with the prior, and taking the log, we get

$\ln p(D \mid M) \simeq \ln p(D \mid M, w_{MAP}) - \ln \frac{\delta_{prior}}{\delta_{MAP}}$
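We can check this against the numerical integral from the sketch above. For a Gaussian peak a natural choice of width is $\delta_{MAP} = \sqrt{2\pi}\,\sigma/\sqrt{n}$ (the peak's area divided by its height), which is my own choice here, not something from the post:

```python
import numpy as np

# Compare the exact (numerically integrated) log evidence to the approximation
# ln p(D|M, w_MAP) - ln(delta_prior / delta_MAP), same toy Gaussian setup as above.
def log_likelihood(w, data, noise_sd=1.0):
    return np.sum(-0.5 * ((data - w) / noise_sd) ** 2
                  - np.log(noise_sd * np.sqrt(2 * np.pi)))

rng = np.random.default_rng(1)
data = rng.normal(loc=0.3, scale=1.0, size=20)
delta_prior = 10.0
w_grid = np.linspace(-5, 5, 2001)

log_like = np.array([log_likelihood(w, data) for w in w_grid])
exact = np.log(np.trapz(np.exp(log_like) / delta_prior, w_grid))

w_map = data.mean()                                  # MAP = MLE under the flat prior
delta_map = np.sqrt(2 * np.pi) / np.sqrt(len(data))  # width of the likelihood peak in w
approx = log_likelihood(w_map, data) - np.log(delta_prior / delta_map)

print(exact, approx)                                 # the two should agree closely
```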
Now, repeating the same argument over all $K$ parameters (taking $K$ integrals), we get

$\ln p(D \mid M) \simeq \ln p(D \mid M, \underline{w}_{MAP}) - K \ln \frac{\delta_{prior}}{\delta_{MAP}}$
The first part of the expression is the fit of the model to the data, and the second part is a penalty that is linear in the number of parameters, scaled by the log of the fold-difference between the sizes of the prior and posterior parameter spaces.
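So, roughly, each extra parameter costs another $\ln \frac{\delta_{prior}}{\delta_{MAP}}$ of log evidence. With some made-up widths:

```python
import numpy as np

# Made-up widths, just to show the Occam penalty grows linearly with K.
delta_prior, delta_map = 10.0, 0.5
for K in (1, 2, 5, 10):
    print(K, K * np.log(delta_prior / delta_map))    # K * ln(delta_prior / delta_MAP)
```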