For better performance, it can be useful to combine natural gradient learning with a standard superlinear optimization algorithm. One such algorithm is the nonlinear conjugate gradient (CG) method [13], a standard tool for solving high-dimensional nonlinear optimization problems. During each iteration of the CG method, a new search direction is generated by conjugation of the residuals from previous iterations. With this choice the search directions span a Krylov subspace, and only the previous search direction and the current gradient are required for the conjugation, making the algorithm efficient in both time and space [13]. A minimal sketch of such a CG iteration is given below.
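The following sketch illustrates one way the conjugation step can be realised, using the Polak–Ribière formula; it is not the specific variant prescribed by [13], and the functions `f`, `grad`, and the backtracking line search are illustrative stand-ins for the problem-specific objective, gradient, and line-search routine. Note how each iteration needs only the previous search direction and the current and previous gradients.

```python
import numpy as np

def nonlinear_cg(x0, f, grad, n_iter=100, tol=1e-8):
    """Nonlinear conjugate gradient with a Polak-Ribiere update (illustrative)."""
    x = x0
    g = grad(x)
    d = -g                              # first direction: steepest descent
    for _ in range(n_iter):
        alpha = backtracking_line_search(f, grad, x, d)
        x_new = x + alpha * d
        g_new = grad(x_new)
        if np.linalg.norm(g_new) < tol:
            return x_new
        # Conjugation: only the previous direction and the current and
        # previous gradients are required.
        beta = max(0.0, g_new @ (g_new - g) / (g @ g))
        d = -g_new + beta * d
        x, g = x_new, g_new
    return x

def backtracking_line_search(f, grad, x, d, alpha=1.0, rho=0.5, c=1e-4):
    """Backtracking line search satisfying the Armijo condition."""
    fx, gx = f(x), grad(x)
    while f(x + alpha * d) > fx + c * alpha * (gx @ d):
        alpha *= rho
    return alpha
```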
The conjugate gradient algorithm is extended to Riemannian manifolds by replacing the gradient with the natural gradient. The resulting algorithm is known as the Riemannian conjugate gradient method [14,15]. In principle this extension is relatively simple: it suffices that all vector operations take the Riemannian geometry of the problem space into account. Line searches are therefore performed along geodesic curves, and the old gradient vectors, which are defined in a different tangent space, are transported to the tangent space at the current point by parallel transport along a geodesic [14].
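The schematic sketch below shows how these ingredients fit together in one Riemannian CG iteration. All manifold-specific operations are hypothetical callables introduced here for illustration: `natural_gradient(x)` returns the natural gradient at `x`, `geodesic(x, d, t)` follows the geodesic from `x` in direction `d` for step length `t`, `parallel_transport(v, x, d, t)` carries a tangent vector `v` along that geodesic, and `inner(x, u, v)` is the Riemannian inner product at `x`. This is a sketch under those assumptions, not the exact algorithm of [14,15].

```python
def riemannian_cg(x0, cost, natural_gradient, geodesic, parallel_transport,
                  inner, line_search, n_iter=100, tol=1e-8):
    """Schematic Riemannian conjugate gradient iteration (illustrative)."""
    x = x0
    g = natural_gradient(x)
    d = -g
    for _ in range(n_iter):
        # Line search is performed along the geodesic, not a straight line.
        t = line_search(cost, x, d, geodesic)
        x_new = geodesic(x, d, t)
        # Transport the old gradient and search direction to the new tangent space.
        g_old = parallel_transport(g, x, d, t)
        d_old = parallel_transport(d, x, d, t)
        g_new = natural_gradient(x_new)
        if inner(x_new, g_new, g_new) ** 0.5 < tol:
            return x_new
        # Polak-Ribiere conjugation between vectors now lying in the same
        # tangent space at x_new.
        beta = max(0.0, inner(x_new, g_new, g_new - g_old)
                        / inner(x_new, g_old, g_old))
        d = -g_new + beta * d_old
        x, g = x_new, g_new
    return x
```

In the Euclidean special case, `geodesic` reduces to `x + t * d`, `parallel_transport` to the identity, and `inner` to the ordinary dot product, recovering the nonlinear CG sketch above.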