In the approximation of the means and variances, the inputs of each function are assumed to be uncorrelated. If there exist i and j such that the value of the i-th parameter is propagated along more than one separate route to the inputs of the function fj, then those inputs are correlated and the computed mean and variance may be inaccurate. This should not be a very severe restriction; in an MLP with one hidden layer, for example, no parameter (weight) affects any output neuron along more than one separate route. Should the approximation turn out to be too inaccurate, some of the cross terms may be taken into account.
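The effect of a neglected cross term can be illustrated with a minimal sketch. It assumes a linear function w_a*a + w_b*b whose two inputs both carry the same quantity x, so that x reaches the function along two routes. The function names and the numbers are hypothetical and chosen only for illustration; they are not taken from the method described here.

```python
def variance_assuming_uncorrelated(w_a, w_b, var_a, var_b):
    """Variance of w_a*a + w_b*b when a and b are treated as uncorrelated."""
    return w_a ** 2 * var_a + w_b ** 2 * var_b


def variance_with_cross_term(w_a, w_b, var_a, var_b, cov_ab):
    """Exact variance of w_a*a + w_b*b, including the covariance cross term."""
    return w_a ** 2 * var_a + w_b ** 2 * var_b + 2.0 * w_a * w_b * cov_ab


# Illustrative assumption: both inputs are copies of the same quantity x,
# so var(a) = var(b) = cov(a, b) = var(x).
var_x = 0.25

# Routes that reinforce each other (a + b = 2x): the approximation is too small.
approx_sum = variance_assuming_uncorrelated(1.0, 1.0, var_x, var_x)
exact_sum = variance_with_cross_term(1.0, 1.0, var_x, var_x, var_x)

# Routes that cancel (a - b = 0): the approximation is too large.
approx_diff = variance_assuming_uncorrelated(1.0, -1.0, var_x, var_x)
exact_diff = variance_with_cross_term(1.0, -1.0, var_x, var_x, var_x)

print(approx_sum, exact_sum)
print(approx_diff, exact_diff)
```

In this toy case the approximation can err in either direction, depending on whether the two routes reinforce or cancel each other. Including the covariance term, one of the cross terms, restores the exact value.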
We have assumed that the errors made in the discretisation of the parameters are mutually uncorrelated. It can be argued that the estimate of the description length would be more accurate if we took into account the dependence between parameters: a change in the value of one parameter might be partially compensated by a suitable change in the values of the others. Assuming uncorrelated errors effectively penalises parametrisations with strong dependencies between parameters, since the description length could be made shorter by using a parametrisation which removes the dependencies. Usually it is desirable to favour parametrisations with weak dependencies, and thus the assumption of uncorrelated discretisation errors is reasonable.
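The size of this penalty can be sketched under assumptions that are not stated in this section. Suppose the cost is locally quadratic in two parameters, with Hessian entries h11, h22 and h12. Suppose further that coding a parameter with accuracy sigma costs about -log(sigma) nats, plus an expected cost increase of 0.5 * h_ii * sigma^2. With independent errors the optimum is sigma_i^2 = 1 / h_ii, so the accuracy-dependent part of the code length is 0.5 * (log h11 + log h22). A rotated parametrisation that removes the dependence would give 0.5 * log(det H) instead. The function name below is hypothetical.

```python
import math


def extra_nats_from_ignoring_dependence(h11, h22, h12):
    """Extra code length (in nats) of a dependent two-parameter model.

    Compares the assumed accuracy cost under independent discretisation
    errors, 0.5 * log(h11 * h22), with the cost in a rotated parametrisation
    without dependencies, 0.5 * log(det H). Both are assumptions of this
    sketch, not formulas taken from the text.
    """
    det = h11 * h22 - h12 ** 2
    if h11 <= 0.0 or h22 <= 0.0 or det <= 0.0:
        raise ValueError("the Hessian must be positive definite")
    return 0.5 * math.log(h11 * h22 / det)


# No dependence between the parameters: no penalty.
no_dependence = extra_nats_from_ignoring_dependence(1.0, 1.0, 0.0)

# Strong dependence: the uncorrelated assumption costs extra nats.
strong_dependence = extra_nats_from_ignoring_dependence(1.0, 1.0, 0.9)

print(no_dependence, strong_dependence)
```

Under these assumptions the difference is never negative, because the product of the diagonal entries of a positive definite matrix is at least its determinant (Hadamard's inequality). It grows as the dependence between the parameters becomes stronger.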
In order to have a decodable message, the accuracies of the parameter values should be coded and sent before sending the truncated parameters. Wallace and Freeman [11] argue that the values and the accuracies of the parameters are not independent, and that one can construct a decodable message with almost the same code length as the one we have used.