
   
Form of the Cost Function

This section shows that the part of the cost function affected by the output of a neuron has the form $a \left< \cdot \right> + b [(\left< \cdot \right>-\left< \cdot \right>_{\text{current}})^2 + \mathrm{Var}\left\{\cdot\right\}] + c\left< \exp \cdot \right> + d$. If the output is connected directly to another variable, this can be seen directly from Equation ([*]). When the output is connected to multiple variables, the affected costs add up, and their sum is of the same form. It remains to prove that the form is preserved when the signals are fed through addition and multiplication nodes.
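The claim about multiple connections can be illustrated with a minimal Python sketch. The function names `cost` and `combine`, and all numeric values, are hypothetical and chosen only for illustration. Each affected cost is represented by its coefficients $(a, b, c, d)$; because every term depends on the same statistics of the output, the sum of the costs is again of the same form, with the coefficients added.

```python
import math

def cost(coeffs, mean, mean_current, var, exp_mean):
    """Cost of the predefined form for coefficients (a, b, c, d)."""
    a, b, c, d = coeffs
    return a * mean + b * ((mean - mean_current) ** 2 + var) + c * exp_mean + d

def combine(coeff_list):
    """Add the coefficients of several costs that share the same output."""
    return tuple(sum(values) for values in zip(*coeff_list))

# Hypothetical coefficients of three costs affected by the same output.
costs = [(0.7, 1.3, 0.4, -2.0), (-0.2, 0.5, 0.0, 1.0), (1.1, 0.0, 0.9, 0.3)]

# Hypothetical statistics of the output: <s>, <s>_current, Var{s}, <exp s>.
stats = (0.5, 0.3, 0.2, 1.8)

total_of_costs = sum(cost(coeffs, *stats) for coeffs in costs)
cost_of_total = cost(combine(costs), *stats)
print(math.isclose(total_of_costs, cost_of_total))
```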

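If the cost function has the predefined form for $s_1 + s_2$, it has the same form for $s_1$ when $s_2$ is regarded as constant, that is, when $\left< s_2 \right>$, $\mathrm{Var}\left\{s_2\right\}$ and $\left< \exp s_2 \right>$ are held fixed. This can be shown using ([*]) and ([*]):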

 
\begin{displaymath}
\begin{array}{rcl}
C & = & a\left< s_1+s_2 \right> + b\left[\left(\left< s_1+s_2 \right>-\left< s_1+s_2 \right>_{\text{current}}\right)^2+\mathrm{Var}\left\{s_1+s_2\right\}\right] \\
  &   & + c\left< \exp (s_1 + s_2) \right> + d \\
  & = & a\left< s_1 \right> + b\left[\left(\left< s_1 \right>-\left< s_1 \right>_{\text{current}}\right)^2+\mathrm{Var}\left\{s_1\right\}\right] \\
  &   & + (c\left< \exp s_2 \right>)\left< \exp s_1 \right> + \left(d + a\left< s_2 \right> + b \mathrm{Var}\left\{s_2\right\}\right).
\end{array}
\end{displaymath} (4.37)
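The addition-node identity can be checked numerically. The following Python sketch is an illustration under stated assumptions, not the thesis implementation: it assumes $s_1$ and $s_2$ are independent, and it assumes Gaussian signals only to obtain concrete values of $\left< \exp s \right>$. The function names and numeric values are hypothetical.

```python
import math

def cost(a, b, c, d, mean, mean_current, var, exp_mean):
    """Cost of the predefined form for a signal with the given statistics."""
    return a * mean + b * ((mean - mean_current) ** 2 + var) + c * exp_mean + d

def gaussian_exp_mean(mean, var):
    """<exp s> for a Gaussian s (an assumption of this sketch)."""
    return math.exp(mean + 0.5 * var)

def coefficients_through_addition(a, b, c, d, m2, v2, e2):
    """Coefficients seen by s1 when (a, b, c, d) are given for s1 + s2."""
    return a, b, c * e2, d + a * m2 + b * v2

# Hypothetical coefficients of the cost for the sum s1 + s2.
a, b, c, d = 0.7, 1.3, 0.4, -2.0
# Hypothetical statistics; s2 is held fixed, so its current mean equals m2.
m1, v1, m1_cur = 0.5, 0.2, 0.3
m2, v2 = -1.1, 0.6
e1, e2 = gaussian_exp_mean(m1, v1), gaussian_exp_mean(m2, v2)

# First expression: the cost written in terms of the statistics of s1 + s2.
# Independence gives Var{s1+s2} = v1 + v2 and <exp(s1+s2)> = e1 * e2.
cost_sum = cost(a, b, c, d, m1 + m2, m1_cur + m2, v1 + v2, e1 * e2)

# Second expression: the same cost written in terms of the statistics of s1.
a1, b1, c1, d1 = coefficients_through_addition(a, b, c, d, m2, v2, e2)
cost_s1 = cost(a1, b1, c1, d1, m1, m1_cur, v1, e1)

print(math.isclose(cost_sum, cost_s1))
```

Note that the new exponential coefficient is `c * e2`, so a cost with $c=0$ for the sum also has $c=0$ for the addend.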

It can also be seen from ([*]) that when $c=0$ for the sum $s_1 + s_2$, it is also zero for the addend $s_1$. This means that the outputs of product and nonlinear nodes can be fed through addition nodes.

If the cost function is of the predefined form with $c=0$ for the product $s_1 s_2$, it has the same form for $s_1$ when $s_2$ is regarded as constant. This can be shown using ([*], [*]):

 
\begin{displaymath}
\begin{array}{rcl}
C & = & a\left< s_1s_2 \right> + b\left[\left(\left< s_1s_2 \right>-\left< s_1s_2 \right>_{\text{current}}\right)^2+\mathrm{Var}\left\{s_1s_2\right\}\right] + d \\
  & = & \left(a\left< s_2 \right>+2b\mathrm{Var}\left\{s_2\right\}\left< s_1 \right>_{\text{current}}\right)\left< s_1 \right> \\
  &   & + \left[b\left(\left< s_2 \right>^2+\mathrm{Var}\left\{s_2\right\}\right)\right]\left[\left(\left< s_1 \right>-\left< s_1 \right>_{\text{current}}\right)^2+\mathrm{Var}\left\{s_1\right\}\right] \\
  &   & + \left(d-b\mathrm{Var}\left\{s_2\right\}\left< s_1 \right>_{\text{current}}^2\right).
\end{array}
\end{displaymath} (4.38)
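The product-node identity can be verified in the same way. This Python sketch assumes that $s_1$ and $s_2$ are independent, so that $\left< s_1 s_2 \right> = \left< s_1 \right>\left< s_2 \right>$ and $\mathrm{Var}\left\{s_1 s_2\right\} = (\left< s_1 \right>^2 + \mathrm{Var}\left\{s_1\right\})(\left< s_2 \right>^2 + \mathrm{Var}\left\{s_2\right\}) - \left< s_1 \right>^2\left< s_2 \right>^2$. The function names and numeric values are hypothetical.

```python
import math

def cost_c0(a, b, d, mean, mean_current, var):
    """Cost of the predefined form with c = 0."""
    return a * mean + b * ((mean - mean_current) ** 2 + var) + d

def product_stats(m1, v1, m2, v2):
    """Mean and variance of s1 * s2 for independent s1, s2 (assumption)."""
    mean = m1 * m2
    var = (m1 ** 2 + v1) * (m2 ** 2 + v2) - mean ** 2
    return mean, var

def coefficients_through_product(a, b, d, m2, v2, m1_cur):
    """Coefficients seen by s1 when (a, b, d) are given for s1 * s2."""
    a1 = a * m2 + 2.0 * b * v2 * m1_cur
    b1 = b * (m2 ** 2 + v2)
    d1 = d - b * v2 * m1_cur ** 2
    return a1, b1, d1

# Hypothetical coefficients of the cost for the product s1 * s2.
a, b, d = 0.7, 1.3, -2.0
# Hypothetical statistics; s2 is held fixed.
m1, v1, m1_cur = 0.5, 0.2, 0.3
m2, v2 = -1.1, 0.6

# First expression: the cost in terms of the statistics of the product.
mean_prod, var_prod = product_stats(m1, v1, m2, v2)
cost_product = cost_c0(a, b, d, mean_prod, m1_cur * m2, var_prod)

# Second expression: the same cost in terms of the statistics of s1.
a1, b1, d1 = coefficients_through_product(a, b, d, m2, v2, m1_cur)
cost_s1 = cost_c0(a1, b1, d1, m1, m1_cur, v1)

print(math.isclose(cost_product, cost_s1))
```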

When one calculates the partial derivatives of a cost of this form

\begin{displaymath}C_p = a \overline{s} + b [(\overline{s}-\overline{s}_{\text{current}})^2 + \widetilde{s}] +
c\left< \exp s \right> + d,
\end{displaymath} (4.39)

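where $\overline{s} = \left< s \right>$ and $\widetilde{s} = \mathrm{Var}\left\{s\right\}$, one finds that, evaluated at the current mean $\overline{s} = \overline{s}_{\text{current}}$ where the derivative of the quadratic term vanishes, they are simply

\begin{displaymath}\frac{\partial C_p}{\partial \overline{s}} = a
\end{displaymath} (4.40)

\begin{displaymath}\frac{\partial C_p}{\partial \widetilde{s}} = b
\end{displaymath} (4.41)

\begin{displaymath}\frac{\partial C_p}{\partial \left< \exp s \right>} = c.
\end{displaymath} (4.42)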
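These derivatives can be checked with finite differences. The Python sketch below treats the mean, the variance and $\left< \exp s \right>$ as three independent arguments of the cost, which is an assumption of the sketch. The helper `central_difference` and all numeric values are hypothetical.

```python
import math

def cost_p(mean, var, exp_mean, a, b, c, d, mean_current):
    """C_p as a function of the mean, the variance and <exp s>."""
    return a * mean + b * ((mean - mean_current) ** 2 + var) + c * exp_mean + d

def central_difference(f, x, h=1e-4):
    """Symmetric finite-difference approximation of df/dx at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Hypothetical coefficients and statistics.
a, b, c, d = 0.7, 1.3, 0.4, -2.0
mean_cur, var, exp_mean = 0.3, 0.2, 1.5

# Derivatives evaluated at the current mean.
d_mean = central_difference(
    lambda m: cost_p(m, var, exp_mean, a, b, c, d, mean_cur), mean_cur)
d_var = central_difference(
    lambda v: cost_p(mean_cur, v, exp_mean, a, b, c, d, mean_cur), var)
d_exp = central_difference(
    lambda e: cost_p(mean_cur, var, e, a, b, c, d, mean_cur), exp_mean)

# Away from the current mean the quadratic term contributes as well.
offset = 0.5
d_mean_off = central_difference(
    lambda m: cost_p(m, var, exp_mean, a, b, c, d, mean_cur), mean_cur + offset)

print(math.isclose(d_mean, a, rel_tol=1e-6))
print(math.isclose(d_var, b, rel_tol=1e-6))
print(math.isclose(d_exp, c, rel_tol=1e-6))
print(math.isclose(d_mean_off, a + 2.0 * b * offset, rel_tol=1e-6))
```

The last comparison shows why the derivative with respect to the mean equals $a$ only at the current value: at any other point it is $a + 2b(\overline{s} - \overline{s}_{\text{current}})$.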


Tapani Raiko
2001-12-10