
Linear computational complexity

Each variable in the model yields one multiplicative term in $p(\mathbf{X}, \boldsymbol{\theta})$. The terms can be expressed in the form $p(\text{variable} \mid \text{parents})$. The parents can be either computational nodes, such as addition or switch nodes, or other variables. The difficult part of the cost function is $\left< \ln p(\mathbf{X}, \boldsymbol{\theta}) \right>$, the expectation being taken over $q(\boldsymbol{\theta})$. The logarithm splits the product of simple terms into a sum. If each of the simple terms can be evaluated in constant time, the overall computational complexity is linear in the number of variables.
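To make the bookkeeping concrete, the following is a minimal Python sketch under simplifying assumptions not made in the text: every variable $x$ is Gaussian given a single parent mean $m$ with a known noise variance $s^2$, and the factorial $q$ stores a posterior mean and variance for each variable and each parent. Under these assumptions $\left< -\ln p(x \mid m) \right> = \frac{1}{2}\ln(2\pi s^2) + \left[ (\bar{x}-\bar{m})^2 + \mathrm{Var}(x) + \mathrm{Var}(m) \right] / (2 s^2)$, which is available in constant time, so the full expectation is a plain sum over variables. The node list and field names are hypothetical, for illustration only.

    import numpy as np

    def expected_neg_log_gauss(xbar, xvar, mbar, mvar, s2):
        """Constant-time term <-ln p(x | m)> for one Gaussian variable,
        valid when x and m are independent under q."""
        return 0.5 * np.log(2 * np.pi * s2) + \
            ((xbar - mbar) ** 2 + xvar + mvar) / (2 * s2)

    def expected_neg_log_likelihood(nodes):
        """One constant-time term per variable, so the sum is O(N)."""
        return sum(expected_neg_log_gauss(n["xbar"], n["xvar"],
                                          n["mbar"], n["mvar"], n["s2"])
                   for n in nodes)

    # Example: three variables, each summarized by its posterior mean and
    # variance under the factorial q, plus its parent's mean and variance.
    nodes = [
        {"xbar": 0.3,  "xvar": 0.10, "mbar": 0.0, "mvar": 0.05, "s2": 1.0},
        {"xbar": 1.1,  "xvar": 0.20, "mbar": 1.0, "mvar": 0.02, "s2": 0.5},
        {"xbar": -0.4, "xvar": 0.05, "mbar": 0.0, "mvar": 0.01, "s2": 2.0},
    ]
    print(expected_neg_log_likelihood(nodes))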

In general, a term can be evaluated in constant time if the parents are independent according to $q(\boldsymbol{\theta})$. This independence is violated if any variable receives input from a latent variable through multiple paths, or from two latent variables which are dependent according to $q(\boldsymbol{\theta})$.
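To see where the independence enters, consider the quantity $\left< (x - m)^2 \right>$ that arises from a Gaussian term $p(x \mid m)$ (the Gaussian form is assumed here for illustration):

$\left< (x - m)^2 \right> = \left< x^2 \right> - 2 \left< x m \right> + \left< m^2 \right>.$

If $x$ and $m$ are independent according to $q(\boldsymbol{\theta})$, then $\left< x m \right> = \left< x \right> \left< m \right>$ and the whole expression reduces to $(\bar{x} - \bar{m})^2 + \mathrm{Var}(x) + \mathrm{Var}(m)$, built from the posterior means and variances already stored by the factorial $q(\boldsymbol{\theta})$. If $m$ instead receives input from $x$ through another path, the cross term $\left< x m \right>$ requires the joint distribution of $x$ and $m$, and the constant-time evaluation is lost.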

According to our experience, an almost maximally factorial $q(\boldsymbol{\theta})$ suffices for latent variable models. A good model structure usually seems to be more important than a good approximation of the posterior probability of the model. Density estimates of continuous-valued latent variables offer a large advantage over point estimates: they are robust against over-fitting and provide a cost function suitable for learning model structures. With ensemble learning, the density estimates are computationally almost as efficient as point estimates.

