For better performance it can be useful to combine natural gradient learning with a standard superlinear optimization algorithm. One such algorithm is the nonlinear conjugate gradient (CG) method [13], a standard tool for solving high-dimensional nonlinear optimization problems. During each iteration of the conjugate gradient method, a new search direction is generated by conjugation of the residuals from previous iterations. With this choice the search directions span a Krylov subspace, and only the previous search direction and the current gradient are required for the conjugation step, making the algorithm efficient in both time and memory [13].
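To make the structure of such an iteration concrete, the following is a minimal Euclidean sketch of nonlinear CG. The Polak-Ribière conjugation coefficient and the crude backtracking line search are illustrative assumptions, not necessarily the exact variant used in [13]; the point is that only the current gradient and the previous search direction need to be stored.

```python
import numpy as np

def nonlinear_cg(f, grad_f, x0, n_iter=100, tol=1e-8):
    """Sketch of nonlinear CG (Polak-Ribiere variant) in Euclidean space."""
    x = x0.copy()
    g = grad_f(x)
    d = -g                                   # first direction: steepest descent
    for _ in range(n_iter):
        # crude backtracking line search along d (illustrative choice)
        alpha, fx = 1.0, f(x)
        while f(x + alpha * d) > fx and alpha > 1e-12:
            alpha *= 0.5
        x_new = x + alpha * d
        g_new = grad_f(x_new)
        if np.linalg.norm(g_new) < tol:
            return x_new
        # conjugation coefficient from the current and previous gradients only
        beta = max(0.0, g_new @ (g_new - g) / (g @ g))
        d = -g_new + beta * d                # conjugate new residual with old direction
        x, g = x_new, g_new
    return x

# usage: minimize a simple quadratic bowl
if __name__ == "__main__":
    A = np.diag([1.0, 10.0])
    f = lambda x: 0.5 * x @ A @ x
    grad_f = lambda x: A @ x
    print(nonlinear_cg(f, grad_f, np.array([3.0, -2.0])))
```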
The conjugate gradient algorithm is extended to Riemannian manifolds by replacing the gradient with the natural gradient; the resulting algorithm is known as the Riemannian conjugate gradient method [14,15]. In principle the extension is relatively simple: it suffices that all vector operations take the Riemannian nature of the problem space into account. In practice this means that the line searches are performed along geodesic curves, and that old gradient vectors, which live in a different tangent space, are first mapped by parallel transport along a geodesic into the tangent space at the base point of the new gradient [14].
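As an illustration of where the Riemannian operations enter, the sketch below runs the same CG loop on a toy manifold, the unit sphere with its induced metric. The sphere, the Polak-Ribière coefficient, and the backtracking search are stand-ins chosen only for the example; in the setting of this paper the metric would instead be the Fisher information metric that defines the natural gradient.

```python
import numpy as np

# Toy manifold: the unit sphere S^2 embedded in R^3 (illustration only).
def rgrad(x, egrad):
    """Riemannian gradient: project the Euclidean gradient onto the tangent space at x."""
    return egrad - (x @ egrad) * x

def exp_map(x, v):
    """Follow the geodesic leaving x with velocity v for unit time."""
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return x
    return np.cos(theta) * x + np.sin(theta) * (v / theta)

def transport(x, v, w):
    """Parallel-transport w along the geodesic exp_map(x, t*v) from t=0 to t=1."""
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return w
    u = v / theta
    return w + ((np.cos(theta) - 1.0) * u - np.sin(theta) * x) * (u @ w)

def riemannian_cg(x0, f, egrad, n_iter=50):
    """Structural sketch of Riemannian CG: geodesic line searches plus
    parallel transport of the old gradient and direction before conjugation."""
    x = x0 / np.linalg.norm(x0)
    g = rgrad(x, egrad(x))
    d = -g
    for _ in range(n_iter):
        # crude backtracking line search along the geodesic t -> exp_map(x, t*d)
        t, fx = 1.0, f(x)
        while f(exp_map(x, t * d)) > fx and t > 1e-12:
            t *= 0.5
        x_new = exp_map(x, t * d)
        g_new = rgrad(x_new, egrad(x_new))
        # transport the old gradient and direction into the tangent space at x_new
        g_t = transport(x, t * d, g)
        d_t = transport(x, t * d, d)
        # Polak-Ribiere coefficient in the (here Euclidean-induced) inner product
        beta = max(0.0, g_new @ (g_new - g_t) / (g @ g))
        d = -g_new + beta * d_t
        x, g = x_new, g_new
    return x

# usage: minimize f(x) = x^T C x on the sphere; the minimizer is the
# eigenvector of C with the smallest eigenvalue.
if __name__ == "__main__":
    C = np.diag([3.0, 2.0, 1.0])
    f = lambda x: x @ C @ x
    egrad = lambda x: 2.0 * C @ x
    print(riemannian_cg(np.array([1.0, 1.0, 1.0]), f, egrad))  # ~ [0, 0, +-1]
```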