The data set
X consists of 10 points on a plane. The model
states that the points have been generated by a sixth-order
polynomial whose weights are drawn from a Gaussian distribution with
zero mean and standard deviation (std) 2, and that Gaussian noise with
std 0.1 is added. The problem is to find these weights.
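
The generative model described here can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the text does not say where the inputs lie, so the range [-1, 1], the random seed, and the names `generate_data`, `x_low` and `x_high` are hypothetical choices.

```python
import numpy as np

ORDER = 6        # sixth-order polynomial: 7 weights, including the constant term
N_POINTS = 10    # size of the data set X
PRIOR_STD = 2.0  # std of the zero-mean Gaussian the weights are drawn from
NOISE_STD = 0.1  # std of the additive Gaussian noise

def generate_data(rng, x_low=-1.0, x_high=1.0):
    """Draw weights, inputs and noisy outputs from the stated model.

    The input range [x_low, x_high] is an assumption; the text does not give it.
    """
    w = rng.normal(0.0, PRIOR_STD, size=ORDER + 1)
    x = rng.uniform(x_low, x_high, size=N_POINTS)
    # Columns of Phi are 1, x, x^2, ..., x^ORDER.
    Phi = np.vander(x, ORDER + 1, increasing=True)
    y = Phi @ w + rng.normal(0.0, NOISE_STD, size=N_POINTS)
    return x, y, w

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
x, y, w_true = generate_data(rng)
```

The learning task is then to recover something like `w_true` from `x` and `y` alone.
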
The figure shows the results. Many different
polynomials fit the data quite well. The ML solution fits the data
best, but the weights of the polynomial are large and the
polynomial has a complicated form. The MAP solution takes the prior
distribution into account, and the result is smoother. Bayesian
learning takes all polynomials into account and weights them by
their posterior probability, which resolves the tradeoff between under- and
overfitting. Note that the error fractiles are closer together in the regions
where there are data points.
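
For this particular model the three approaches have closed forms, which the following sketch compares. It is not necessarily how the results in the figure were computed. It assumes that the noise std (0.1) and prior std (2) are known, that the inputs lie in [-1, 1], and that the 5% and 95% fractiles are the ones of interest; the names `design`, `predictive` and `z` are hypothetical.

```python
import numpy as np

ORDER, N_POINTS = 6, 10
PRIOR_STD, NOISE_STD = 2.0, 0.1

def design(x):
    """Polynomial features 1, x, ..., x^ORDER for each input."""
    return np.vander(np.atleast_1d(x), ORDER + 1, increasing=True)

# Synthetic data from the stated model; the input range [-1, 1] is an assumption.
rng = np.random.default_rng(0)
w_true = rng.normal(0.0, PRIOR_STD, ORDER + 1)
x = rng.uniform(-1.0, 1.0, N_POINTS)
Phi = design(x)
y = Phi @ w_true + rng.normal(0.0, NOISE_STD, N_POINTS)

# ML: ordinary least squares; the prior is ignored.
w_ml, *_ = np.linalg.lstsq(Phi, y, rcond=None)

# MAP: the Gaussian prior turns least squares into ridge regression
# with penalty (noise variance) / (prior variance).
lam = (NOISE_STD / PRIOR_STD) ** 2
w_map = np.linalg.solve(Phi.T @ Phi + lam * np.eye(ORDER + 1), Phi.T @ y)

# Bayesian learning: the full Gaussian posterior N(post_mean, post_cov)
# over the weights, not just a single best weight vector.
post_cov = np.linalg.inv(
    Phi.T @ Phi / NOISE_STD**2 + np.eye(ORDER + 1) / PRIOR_STD**2
)
post_mean = post_cov @ Phi.T @ y / NOISE_STD**2

def predictive(x_new, z=1.6449):
    """Posterior mean of the curve and its 5% / 95% fractiles.

    z is the 95% quantile of the standard normal; the level is an assumption.
    """
    P = design(x_new)
    mean = P @ post_mean
    # Diagonal of P @ post_cov @ P.T: the variance of the curve at each input.
    std = np.sqrt(np.einsum("ij,jk,ik->i", P, post_cov, P))
    return mean, mean - z * std, mean + z * std
```

Under these assumptions the posterior is Gaussian, so its mean coincides with the MAP weights; what the Bayesian treatment adds is the spread around that mean. Calling `predictive(x)` at the training inputs gives a curve std no larger than the noise std, while inputs far from the data typically give much wider fractiles, in line with the remark about the fractiles above.
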