In the experiments, the task is to reconstruct the missing values, and the mean square error of the reconstructions is used for the comparison. Two data sets are used: speech data and Boston housing data. Ignorability of the data collection mechanism [3] is assumed here. The collection mechanism is nonignorable, for instance, when out-of-scale measurements are marked as missing.
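The comparison criterion can be illustrated with a minimal sketch. It assumes that the error is averaged only over the entries that were marked as missing, which is one natural reading of the setting above; the function name `masked_mse` and the toy arrays are hypothetical and not taken from the experiments.

```python
import numpy as np


def masked_mse(x_true, x_recon, missing_mask):
    """Mean square error over the entries flagged as missing.

    Assumption: only reconstructed (missing) entries contribute to the error.
    """
    x_true = np.asarray(x_true, dtype=float)
    x_recon = np.asarray(x_recon, dtype=float)
    missing_mask = np.asarray(missing_mask, dtype=bool)
    if not missing_mask.any():
        raise ValueError("no missing entries to evaluate")
    # Boolean indexing keeps only the positions that had to be reconstructed.
    diff = x_recon[missing_mask] - x_true[missing_mask]
    return float(np.mean(diff ** 2))


# Toy example (made-up numbers): two of the four entries were missing.
x_true = np.array([[1.0, 2.0], [3.0, 4.0]])
x_recon = np.array([[1.0, 2.5], [2.0, 4.0]])
mask = np.array([[False, True], [True, False]])
example_mse = masked_mse(x_true, x_recon, mask)  # (0.5**2 + 1.0**2) / 2 = 0.625
```

Restricting the average to the masked positions keeps observed entries, which a model can simply copy, from diluting the score.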
The first data set consists of spectrograms of real-world Finnish speech from several speakers. Short-term spectra are windowed to 30 dimensions with a standard preprocessing procedure for speech recognition. A dynamic model [12] would clearly give better reconstructions, but here the temporal information is left out to ease the comparison of the models. Half of the roughly 5000 samples are used as test data with some missing values. Missing values are set in four different ways to measure different properties of the algorithms (Figure 3):
The second data set is the Boston housing data, which is publicly available at [2]. It concerns housing values in suburbs of Boston. The data set contains 506 vectors of 13 dimensions, excluding one binary attribute. Four of the 13 values are common to each town, and each town consists of 1 to 50 suburbs. 70% of the data vectors are used as training data and the rest as test data, in which 10% of the values are randomly set missing.
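The split described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify how the split or the masking was drawn, so the sketch shuffles the rows, takes 70% for training, and marks each test entry missing independently with probability 0.1, which gives about 10% missing values rather than exactly 10%. The function `split_and_mask` is hypothetical, and the array `placeholder` is synthetic data with the same shape as the housing data, not the data itself.

```python
import numpy as np


def split_and_mask(data, train_fraction=0.7, missing_fraction=0.1, seed=0):
    """Split rows into training and test parts and hide some test entries.

    Assumptions: rows are shuffled uniformly, and each test entry is hidden
    independently with probability `missing_fraction`.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n = data.shape[0]
    order = rng.permutation(n)
    n_train = int(round(train_fraction * n))
    train = data[order[:n_train]]
    test_full = data[order[n_train:]]
    # True marks an entry that is hidden from the model.
    missing_mask = rng.random(test_full.shape) < missing_fraction
    test_observed = test_full.copy()
    test_observed[missing_mask] = np.nan
    return train, test_full, test_observed, missing_mask


# Synthetic stand-in with the same shape as the housing data (506 x 13).
placeholder = np.random.default_rng(1).normal(size=(506, 13))
train, test_full, test_observed, mask = split_and_mask(placeholder)
```

Keeping `test_full` next to `test_observed` makes it possible to score reconstructions at the masked positions afterwards.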