Sunday, April 13, 2008

Least squares intuition

The ideas for these posts will probably keep coming from Chris Bishop's book (Pattern Recognition and Machine Learning). Today, I liked the intuition behind the least-squares solution to the (slightly generalized) linear regression problem.

The general problem is this: given a set of data points $\{(x_n, t_n)\}_{n=1}^N$, we want to find a predictor function of the form

$$y(x, \mathbf{w}) = \sum_{j=1}^{M} w_j \phi_j(x)$$

that minimizes the mean-squared error

$$E(\mathbf{w}) = \frac{1}{N} \sum_{n=1}^{N} \big( t_n - y(x_n, \mathbf{w}) \big)^2.$$

The $\phi_j$ are basis functions that allow for richer models, and the $w_j$ are the weights of the basis. For ordinary linear regression we can take $\phi_j(x) = x_j$ (the $j$-th component of the input), but in general we can try to match the output with any basis functions - Gaussians, sinusoids, sigmoids, etc.
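To make this concrete, here is a minimal sketch in NumPy (my example, not from the post; the noisy-sine data and the Gaussian basis centers and width are arbitrary choices) that fits the weights by least squares on the matrix of basis function values:

```python
import numpy as np

# Toy data: a noisy sine curve (arbitrary choice, just for illustration).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 25)           # N = 25 data points
t = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(x.size)

# Gaussian basis: phi_j(x) = exp(-(x - mu_j)^2 / (2 s^2)).
mu = np.linspace(0.0, 1.0, 6)           # M = 6 basis centers (assumed)
s = 0.15                                # shared width (assumed)
Phi = np.exp(-((x[:, None] - mu[None, :]) ** 2) / (2 * s**2))  # N x M

# Weights minimizing ||t - Phi w||^2.
w, *_ = np.linalg.lstsq(Phi, t, rcond=None)
y = Phi @ w                             # predictions at the data points
print("mean-squared error:", np.mean((t - y) ** 2))
```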

Now, consider an $N$-dimensional space whose axes are given by the regression targets $t_1, \dots, t_N$, so the target vector $\mathbf{t} = (t_1, \dots, t_N)^T$ is a single point in this space. Then any basis function evaluated at the $N$ data points is also a point in this space:

$$\boldsymbol{\varphi}_j = \big( \phi_j(x_1), \dots, \phi_j(x_N) \big)^T.$$

If the number of basis functions $M$ is less than the number of data points $N$, then the linear combinations of the basis function values define an $M$-dimensional linear subspace $S$ of $\mathbb{R}^N$.
In particular, the vector of predictions $\mathbf{y} = \big( y(x_1, \mathbf{w}), \dots, y(x_N, \mathbf{w}) \big)^T = \sum_{j=1}^{M} w_j \boldsymbol{\varphi}_j$ is a point in this subspace for any choice of $\mathbf{w}$.
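In matrix form (my notation here, following the usual design matrix convention), stacking the $\boldsymbol{\varphi}_j$ as the columns of an $N \times M$ design matrix $\Phi$, this subspace is

$$S = \{ \Phi \mathbf{w} : \mathbf{w} \in \mathbb{R}^M \} = \operatorname{span}\{ \boldsymbol{\varphi}_1, \dots, \boldsymbol{\varphi}_M \},$$

so the prediction vector is simply $\mathbf{y} = \Phi \mathbf{w}$.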

Now for the cherry on the cake - the choice of weights $\mathbf{w}^*$ that minimizes the error corresponds to the choice of $\mathbf{y}$ that is the orthogonal projection of the target vector $\mathbf{t}$ onto the subspace $S$ spanned by the basis function vectors $\boldsymbol{\varphi}_j$.

Perhaps obvious (and proof omitted), but I thought it was nice that the world is consistent :)
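And since the proof is omitted, here is a quick numerical sanity check (again my sketch, assuming the design matrix $\Phi$ has full column rank) that the least-squares fit coincides with the orthogonal projection of $\mathbf{t}$ onto the column space of $\Phi$:

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 20, 5                            # M basis functions < N data points
Phi = rng.standard_normal((N, M))       # arbitrary design matrix of basis values
t = rng.standard_normal(N)              # arbitrary target vector

# Least-squares fit: y_ls = Phi w*, where w* minimizes ||t - Phi w||^2.
w, *_ = np.linalg.lstsq(Phi, t, rcond=None)
y_ls = Phi @ w

# Orthogonal projection of t onto the column space of Phi:
# P = Phi (Phi^T Phi)^{-1} Phi^T, valid when Phi has full column rank.
P = Phi @ np.linalg.inv(Phi.T @ Phi) @ Phi.T
y_proj = P @ t

print(np.allclose(y_ls, y_proj))        # True: the fit is the projection
```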
