Variational Bayesian (VB) learning provides even stronger tools
against overfitting.
The VB version of PCA by [13] approximates the joint
posterior of the unknown quantities using a simple multivariate distribution:
each model parameter is described a posteriori
by an independent Gaussian distribution.
The means can then be used as point estimates
of the parameters, while the variances
give at least a crude estimate of the reliability of these point estimates.
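Concretely, writing $\boldsymbol{\theta}$ for the collection of unknown parameters, the approximation is of the fully factorised form
$$
q(\boldsymbol{\theta}) = \prod_i \mathcal{N}\!\left(\theta_i \,;\, \bar{\theta}_i, \tilde{\theta}_i\right),
$$
where the posterior means $\bar{\theta}_i$ provide the point estimates and the posterior variances $\tilde{\theta}_i$ the reliability measures just mentioned. (The bar-and-tilde notation is used here only for illustration; [13] parametrises the approximation in its own terms.)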
The method in [13] does not extend easily to missing values,
but the subspace learning algorithm (Section 3)
can be extended to VB; a simplified sketch is given at the end of this section.
The derivation is somewhat lengthy, and it is omitted here, together
with the variational Bayesian learning rules, because of space limitations;
see [12] for details. The per-iteration computational complexity of
this method is the same as that of the original subspace learning algorithm,
but in practice the VB version is about 2–3 times slower.
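As an illustration of the general approach, the following is a minimal sketch of fully factorised VB learning for the PCA model with missing values. It assumes unit Gaussian priors on both factor matrices and a fixed noise variance v, and it uses plain mean-field coordinate updates; these are simplifying assumptions made here for brevity, not the learning rules derived in [12].

import numpy as np

def vb_pca(Y, M, c, v=0.1, n_iter=50, seed=0):
    """Mean-field VB for Y ~ W @ X with missing entries.

    Y : (d, n) data matrix; M : (d, n) boolean mask of observed entries;
    c : number of components; v : noise variance, assumed known and fixed.
    Returns posterior means and variances of W (d, c) and X (c, n).
    """
    rng = np.random.default_rng(seed)
    d, n = Y.shape
    Mf = M.astype(float)                   # float mask for matrix products
    Y0 = np.where(M, Y, 0.0)               # zero out the missing entries
    Wm = 0.1 * rng.standard_normal((d, c))
    Wv = np.ones((d, c))
    Xm = 0.1 * rng.standard_normal((c, n))
    Xv = np.ones((c, n))
    for _ in range(n_iter):
        for k in range(c):                 # update column k of W
            # residual excluding component k, zero at missing entries
            R = Y0 - Mf * (Wm @ Xm - np.outer(Wm[:, k], Xm[k]))
            Ex2 = Xm[k] ** 2 + Xv[k]       # E[x_kj^2] under q
            Wv[:, k] = 1.0 / (1.0 + (Mf @ Ex2) / v)   # unit prior precision
            Wm[:, k] = Wv[:, k] * (R @ Xm[k]) / v
        for k in range(c):                 # symmetric update for row k of X
            R = Y0 - Mf * (Wm @ Xm - np.outer(Wm[:, k], Xm[k]))
            Ew2 = Wm[:, k] ** 2 + Wv[:, k]
            Xv[k] = 1.0 / (1.0 + (Ew2 @ Mf) / v)
            Xm[k] = Xv[k] * (Wm[:, k] @ R) / v
    return Wm, Wv, Xm, Xv

After convergence, Wm @ Xm gives the reconstruction from the point estimates, while the square roots of Wv and Xv act as crude error bars on the individual parameters, in line with the reliability interpretation above. Since the updates are block coordinate ascent on the variational lower bound (with v fixed), each sweep is guaranteed not to decrease the bound.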