
   
Special States

There are some special states in the learning algorithm. The states are layer-specific, and several of them can be active at the same time. For example, after rebooting, only the sources are updated for a while while the weights are kept constant. When learning an MLP-like network, the uppermost sources can be kept constant while the rest of the network is updated. After adding a new layer to the network, the old parts can be kept constant and only the new part is learned, as sketched below.
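The following is a minimal sketch (with hypothetical names, not the thesis software) of how such layer-specific states can be realised as update masks: each state simply selects which parameter groups receive updates during a sweep.

\begin{verbatim}
import numpy as np

def update_step(params, grads, active, learning_rate=0.01):
    """Apply a gradient step only to the parameter groups marked active."""
    for name, grad in grads.items():
        if active.get(name, True):
            params[name] -= learning_rate * grad
    return params

# Hypothetical parameter groups and gradients.
params = {"sources": np.zeros(10),
          "weights": np.zeros((10, 5)),
          "new_layer": np.zeros(5)}
grads = {k: np.random.randn(*v.shape) for k, v in params.items()}

# After rebooting: update only the sources, keep the weights constant.
after_reboot = {"sources": True, "weights": False, "new_layer": False}
# After adding a new layer: keep the old parts constant, learn only the new part.
new_layer_only = {"sources": False, "weights": False, "new_layer": True}

params = update_step(params, grads, active=after_reboot)
\end{verbatim}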

New neurons can be ``kept alive'', encouraging them to be used instead of being dampened away. This is achieved by modifying the propagated derivatives of the cost function in the multiplication node. The last term of the derivative in equation ([*]),

\begin{displaymath}
2 \mathrm{Var}\left\{s_2\right\} \frac{\partial C}{\partial \mathrm{Var}\left\{s_1 s_2\right\}} \left< s_1 \right> ,
\end{displaymath} (6.8)

is ignored. This effectively means that the mean of $s_1$ is adjusted as if $\mathrm{Var}\left\{s_2\right\}$ were zero, in other words as if there were no uncertainty about $s_2$. The cost function may increase at first because of these overoptimistic adjustments, but it can pay off later by escaping early pruning.
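A minimal sketch of the modified propagated derivative follows, assuming the standard product-node moments $\left< s_1 s_2 \right> = \left< s_1 \right>\left< s_2 \right>$ and $\mathrm{Var}\left\{s_1 s_2\right\} = \left< s_1 \right>^2 \mathrm{Var}\left\{s_2\right\} + \mathrm{Var}\left\{s_1\right\} \left< s_2 \right>^2 + \mathrm{Var}\left\{s_1\right\} \mathrm{Var}\left\{s_2\right\}$; the function and argument names are hypothetical.

\begin{verbatim}
def dcost_dmean_s1(mean_s1, mean_s2, var_s2,
                   dC_dmean_s, dC_dvar_s, keep_alive=False):
    """Derivative of the cost w.r.t. the mean of s1, propagated through
    the multiplication node s = s1 * s2.

    With keep_alive=True, the last term 2 Var{s2} dC/dVar{s1 s2} <s1>
    (eq. 6.8) is ignored, i.e. the mean of s1 is adjusted as if Var{s2}
    were zero.
    """
    grad = dC_dmean_s * mean_s2
    if not keep_alive:
        grad += 2.0 * var_s2 * dC_dvar_s * mean_s1
    return grad
\end{verbatim}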

