Interpretation.

The interpretation of the findings of exploratory data analysis depends, of course, on the application. There exist, however, some general methods that may aid in the interpretation process.

Some caution is due at the very beginning of the interpretation: although the map metaphor may be useful for intuitively understanding what kinds of applications would be worthwhile, it does not necessarily hold all the way through to the interpretation of the map. A road map, for example, essentially only rescales distances, whereas the SOM may transform the locations of the data items in a highly nonlinear manner. It is therefore not sensible, in general, to try to interpret the vertical and horizontal axes of the map, although in some special cases there may exist straightforward interpretations. If desired, simple interpretations may be sought by displaying auxiliary information on the map display and by inspecting its distribution. In Publication 2, for instance, the longer axis of the map seems to correlate with overall economic welfare as measured by the gross national product (GNP) per capita.
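Displaying auxiliary information on the map can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used in the publications: the codebook is assumed to be a NumPy array of shape (rows, cols, dim), the winner is found by Euclidean distance, and the function names are hypothetical.

```python
import numpy as np

def best_matching_units(codebook, data):
    """Grid coordinates (row, col) of the closest reference vector for each sample."""
    rows, cols, dim = codebook.shape
    flat = codebook.reshape(rows * cols, dim)
    # Squared Euclidean distance from every sample to every reference vector.
    d2 = ((data[:, None, :] - flat[None, :, :]) ** 2).sum(axis=2)
    winners = d2.argmin(axis=1)
    return np.stack(np.unravel_index(winners, (rows, cols)), axis=1)

def auxiliary_component_plane(codebook, data, aux):
    """Mean of an auxiliary variable over the samples mapped to each unit.

    Units that receive no samples are marked with NaN.
    """
    rows, cols, _ = codebook.shape
    bmus = best_matching_units(codebook, data)
    total = np.zeros((rows, cols))
    count = np.zeros((rows, cols))
    # np.add.at accumulates correctly when several samples share a unit.
    np.add.at(total, (bmus[:, 0], bmus[:, 1]), aux)
    np.add.at(count, (bmus[:, 0], bmus[:, 1]), 1)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(count > 0, total / count, np.nan)
```

The resulting array can be shown as a gray-level image over the map grid. A gradual change of the values along one axis would suggest that the axis correlates with the auxiliary variable.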

Since the SOM tries above all to preserve local structures, the interpretation of the map should predominantly be done locally, based on the local relations of the data items on the map. The global structure is often useful as well, however. Different properties of the reference vectors and of the data items can be visualized on the map display to aid in the interpretation, as was discussed in Section 6.4.2.

If the data items come from a time-varying process, it is possible to visualize the trajectory of the successive samples on the map, and thereby to monitor the state of the process on an easily understandable visual display (cf., e.g., Alander et al., 1991; Kangas, 1994; Kasslin et al., 1992; Kohonen, 1995c; Tryba and Goser, 1991). Such trajectory displays are used in Publication 1.
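A trajectory of this kind amounts to the sequence of winning map units for the successive samples. The following is a minimal sketch, assuming a codebook array of shape (rows, cols, dim) and Euclidean winner selection; the function name is hypothetical.

```python
import numpy as np

def map_trajectory(codebook, samples):
    """Sequence of (row, col) winner coordinates for time-ordered samples."""
    rows, cols, dim = codebook.shape
    flat = codebook.reshape(rows * cols, dim)
    # Squared Euclidean distance from every sample to every reference vector.
    d2 = ((samples[:, None, :] - flat[None, :, :]) ** 2).sum(axis=2)
    winners = d2.argmin(axis=1)
    r, c = np.unravel_index(winners, (rows, cols))
    # Row k holds the map location of the k-th sample in time order.
    return np.stack([r, c], axis=1)
```

Connecting the returned coordinates with line segments on the map display gives the trajectory; the latest point indicates the current state of the process.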

Yet another method that aids in the interpretation of the maps, provided that some external information like class labels is available, is to plot the labels on the organized map. The distribution of the samples of each class, plotted on the map as a density histogram, may also help in the interpretation process. Distributions of samples containing different types of background EEG activity have been displayed in Publication 1, and different discussion topics (Usenet newsgroups) in Publication 4.
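The class-wise density histograms can be computed from the winner coordinates of the labeled samples. The sketch below assumes that the winners have already been determined and are given as an array of (row, col) pairs; the function name and the normalization to unit sum are illustrative choices.

```python
import numpy as np

def class_hit_histograms(bmus, labels, grid_shape):
    """Per-class distribution of samples over the map units.

    bmus       -- integer array of shape (n_samples, 2) with (row, col) winners
    labels     -- one class label per sample
    grid_shape -- (rows, cols) of the map
    Returns a dict mapping each label to an array that sums to one.
    """
    bmus = np.asarray(bmus)
    labels = np.asarray(labels)
    histograms = {}
    for label in np.unique(labels):
        counts = np.zeros(grid_shape)
        members = bmus[labels == label]
        # Count how many samples of this class hit each unit.
        np.add.at(counts, (members[:, 0], members[:, 1]), 1)
        histograms[label.item()] = counts / counts.sum()
    return histograms
```

Each histogram can be drawn as a separate gray-level image over the map grid, one per class.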

If the distributions of the known classes overlap, such displays can even be used to explore the degree of overlap between different types of samples. This may give insight into whether the classes actually co-exist or whether new kinds of features should be added to the data items to make the classes more easily separable.
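If a numerical summary is wanted in addition to the visual inspection, one simple possibility is the histogram intersection of two class distributions on the map. This measure is an assumption made here for illustration and is not taken from the publications; the function name is hypothetical.

```python
import numpy as np

def histogram_overlap(hits_a, hits_b):
    """Histogram intersection of two hit distributions on the same map grid.

    Both inputs are arrays of non-negative hit counts (or densities) with the
    same shape. The result lies between 0 (disjoint) and 1 (identical).
    """
    p = np.asarray(hits_a, dtype=float)
    q = np.asarray(hits_b, dtype=float)
    # Normalize so that each distribution sums to one.
    p = p / p.sum()
    q = q / q.sum()
    return float(np.minimum(p, q).sum())
```

A value near one means that the two classes occupy essentially the same map units, whereas a value near zero means that they are mapped to separate areas.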

Displays of the reference vectors may also be useful. The methods for visualizing high-dimensional data discussed in Section 6.1 could be used, as well as some application-specific visualization methods like the head-shaped displays in Publication 1.

Sami Kaski
Mon Mar 31 23:43:35 EET DST 1997