
   
1. The VQ model and the cost function

To demonstrate the approach, we describe ${\rm IVGA_{VQ}}$, in which each variable group is modeled using a separate VQ. From now on we refer to the cost function of a single variable group, leaving out the group index g.

The vector quantization model used for a single variable group consists of codebook vectors ${\boldsymbol \mu}(i), \; i = 1, \ldots, C$ (described by their means and a common variance) and indices of the winners w(t) for each data vector ${\boldsymbol x}(t), \; t = 1,
\ldots, N$.
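
To make the notation concrete, the following short Python sketch (our own illustration, not taken from the paper; all array names are hypothetical) sets up the quantities of a single group's VQ and evaluates the Gaussian log-likelihood implied by a fixed winner assignment and the common diagonal variance.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

C, d, N = 8, 3, 100          # codebook size, group dimension, number of samples
x = rng.normal(size=(N, d))  # data vectors x(t) restricted to this variable group

mu = rng.normal(size=(C, d))     # codebook means mu(i)
sigma2_x = np.ones(d)            # common diagonal noise variances sigma^2_x
w = rng.integers(0, C, size=N)   # winner index w(t) for each data vector

# Under the model, x(t) is Gaussian with mean mu(w(t)) and variance sigma^2_x:
log_lik = -0.5 * np.sum((x - mu[w]) ** 2 / sigma2_x
                        + np.log(2 * np.pi * sigma2_x))
\end{verbatim}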

For finding $-\ln p({\boldsymbol x}\vert H)$ we use the variational EM algorithm with $\theta=\{{\boldsymbol \mu},w\}$ as the missing observations. In the E-phase, an upper bound of the cost is minimized [2,5]. The rest of the parameters are included in H, i.e. we use ML estimates for the following: c, the hyperparameter governing the prior probability for a codebook vector to be a winner; ${\boldsymbol \sigma}^2_{\boldsymbol x}$, the diagonal elements of the common covariance matrix; and ${\boldsymbol \mu_\mu}$ and ${\boldsymbol \sigma}^2_{\boldsymbol \mu}$, the hyperparameters governing the mean and variance of the prior distribution of the codebook vectors.
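
The overall control flow of this alternation can be sketched as follows. This is only a schematic: update_q and update_hyperparameters are hypothetical placeholders standing in for the actual E- and M-phase update formulas, which are not spelled out here.

\begin{verbatim}
def update_q(q, H, x):
    # E-phase (placeholder): refine the factorized posterior
    # q(mu, w) = q(mu) q(w) so that the upper bound decreases.
    return q

def update_hyperparameters(q, H, x):
    # M-phase (placeholder): ML re-estimation of the hyperparameters
    # H = {c, sigma^2_x, mu_mu, sigma^2_mu} given the current q.
    return H

def fit_vq(x, q, H, n_iter=50):
    # Alternate the two phases; each step lowers (or keeps unchanged)
    # the upper bound on -ln p(x | H).
    for _ in range(n_iter):
        q = update_q(q, H, x)
        H = update_hyperparameters(q, H, x)
    return q, H
\end{verbatim}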

The upper bound for $-\ln p({\boldsymbol x}\vert H) = -\ln \int p({\boldsymbol x}, \theta \vert H) d\theta$ is obtained from

\begin{displaymath}\textstyle
-\ln \int p({\boldsymbol x}, \theta \vert H) d\theta \leq
-\int q(\theta) \ln \frac{p({\boldsymbol x}, \theta\vert H)}{q(\theta)} d\theta \, ,
\end{displaymath}

where the inequality follows from the convexity of $-\ln x$ by Jensen's inequality. H and $q(\theta)$ are alternately optimized, but instead of finding the intractable global optimum $q(\theta) = p(\theta \vert {\boldsymbol x}, H)$, we restrict the search to the tractable family of distributions of the form $q({\boldsymbol \mu},w) = q({\boldsymbol \mu})q(w)$. Substituting $p({\boldsymbol x}, \theta \vert H) = p({\boldsymbol x}\vert w, {\boldsymbol \mu}, {\boldsymbol \sigma}^2_{\boldsymbol x}) \, p(w \vert c) \, p({\boldsymbol \mu}\vert {\boldsymbol \mu_\mu}, {\boldsymbol \sigma}^2_{\boldsymbol \mu})$, denoting the integral over $q(\theta)$ by $E\{\}$, and arranging the terms then yields

\begin{eqnarray*}
E\left\{ -\ln \frac{p({\boldsymbol x}, \theta\vert H)}{q(\theta)} \right\}
& = & \overbrace{E\left\{ -\ln p({\boldsymbol x}\vert w, {\boldsymbol \mu}, {\boldsymbol \sigma}^2_{\boldsymbol x}) \right\}}^{\rm errors}
+ \overbrace{E\left\{ \ln \frac{q(w)}{p(w \vert c)} \right\}}^{\rm winners} \\
& & + \overbrace{E\left\{ \ln \frac{q({\boldsymbol \mu})}{p({\boldsymbol \mu}\vert {\boldsymbol \mu_\mu}, {\boldsymbol \sigma}^2_{\boldsymbol \mu})} \right\}}^{\rm codebook} \, .
\end{eqnarray*}
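
For completeness, the quality of the bound can be read off from the standard identity (restated here for clarity, not taken from the text above) relating it to the Kullback-Leibler divergence between $q(\theta)$ and the true posterior:

\begin{displaymath}\textstyle
-\int q(\theta) \ln \frac{p({\boldsymbol x}, \theta\vert H)}{q(\theta)} d\theta
= -\ln p({\boldsymbol x}\vert H) + \int q(\theta) \ln \frac{q(\theta)}{p(\theta \vert {\boldsymbol x}, H)} d\theta \, ,
\end{displaymath}

so the bound is tight exactly when $q(\theta) = p(\theta \vert {\boldsymbol x}, H)$, and minimizing it over the factorized family drives $q(\theta)$ as close to the posterior as the factorization allows.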


Substituting the Gaussian model for the codebook vectors into the decomposition above, the cost function becomes


  \begin{eqnarray*}
L & = & \sum_{j=1}^d \sum_{t=1}^N \left[ \frac{\tilde{\mu}_j(w(t)) + (x_j(t) - \bar{\mu}_j(w(t)))^2}{2 {\sigma^2_x}_j} + \frac{1}{2} \ln 2 \pi {\sigma^2_x}_j \right]
- \sum_{t=1}^N \ln c_{w(t)} \\
& & + \sum_{j=1}^d \sum_{i=1}^C \left[ \frac{1}{2} \ln \frac{{\sigma^2_\mu}_j(i)}{\tilde{\mu}_j(i)} - \frac{1}{2}
+ \frac{\tilde{\mu}_j(i) + (\bar{\mu}_j(i) - {\mu_\mu}_j(i))^2}{2 {\sigma^2_\mu}_j(i)} \right] ,
\end{eqnarray*}


where $\bar{\boldsymbol \mu}$ and $\tilde{\boldsymbol \mu}$ are the mean and variance of the posterior distribution of the codebook vectors, and d is the dimensionality of the subspace, i.e. the size of the variable group.
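
The cost L is straightforward to evaluate numerically. The sketch below (our own code; it assumes the form of L written above, and every function and array name is hypothetical) computes the three terms for one variable group.

\begin{verbatim}
import numpy as np

def vq_cost(x, w, mu_bar, mu_tilde, sigma2_x, c, mu_mu, sigma2_mu):
    """Evaluate the cost L for one variable group.

    x        : (N, d) data vectors of the group
    w        : (N,)   winner index w(t) for each data vector
    mu_bar   : (C, d) posterior means of the codebook vectors
    mu_tilde : (C, d) posterior variances of the codebook vectors
    sigma2_x : (d,)   common noise variances
    c        : (C,)   prior winner probabilities
    mu_mu    : (C, d) prior means of the codebook vectors
    sigma2_mu: (C, d) prior variances of the codebook vectors
    """
    # Reconstruction errors: expected negative log-likelihood of the data.
    errors = np.sum((mu_tilde[w] + (x - mu_bar[w]) ** 2) / (2 * sigma2_x)
                    + 0.5 * np.log(2 * np.pi * sigma2_x))
    # Cost of the winner indices under the prior probabilities c.
    winners = -np.sum(np.log(c[w]))
    # KL divergence of the codebook posterior from its Gaussian prior.
    codebook = np.sum(0.5 * np.log(sigma2_mu / mu_tilde) - 0.5
                      + (mu_tilde + (mu_bar - mu_mu) ** 2) / (2 * sigma2_mu))
    return errors + winners + codebook
\end{verbatim}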



 
Krista Lagus
2001-08-28