We have presented a novel method to speed up learning algorithms that optimize an objective function defined over a probability distribution, such as variational Bayesian learning. By taking into account the Riemannian structure of the space of variational parameters, the natural conjugate gradient algorithm can efficiently update both latent variables and model parameters at the same time. A simple form of the approximate distribution translates into a simple metric and a low computational overhead, but even for more complicated approximating distributions a simple approximation of the metric can provide significant speedups at very low extra computational cost.
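To recapitulate the core update in generic notation (the symbols $\mathcal{F}$, $\boldsymbol{\xi}$, $\mathbf{G}$ and $\beta_k$ are used here schematically rather than as defined in the preceding sections): writing $\mathcal{F}(\boldsymbol{\xi})$ for the objective over the variational parameters $\boldsymbol{\xi}$ and $\mathbf{G}(\boldsymbol{\xi})$ for the Fisher information metric of the approximation $q(\boldsymbol{\theta} \mid \boldsymbol{\xi})$, the natural gradient is
\[
\tilde{\nabla} \mathcal{F}(\boldsymbol{\xi}) = \mathbf{G}^{-1}(\boldsymbol{\xi}) \, \nabla \mathcal{F}(\boldsymbol{\xi}),
\]
and the natural conjugate gradient search direction at iteration $k$ is
\[
\mathbf{p}_k = \tilde{\nabla} \mathcal{F}(\boldsymbol{\xi}_k) + \beta_k \, \mathbf{p}_{k-1},
\]
where $\beta_k$ is a standard conjugacy coefficient such as that of Polak and Ribi\`ere. When the approximating distribution factorizes so that $\mathbf{G}$ is (block) diagonal, computing $\tilde{\nabla} \mathcal{F}$ adds essentially no cost over the plain gradient, which is the source of the low overhead noted above.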