Abstract:
It has been shown earlier that the Self-Organizing Map (SOM) can be applied to the analysis and visualization of similarities between words, based on their usage in short contexts formed of adjacent words. These SOMs of words, called word category maps, have many potential applications. One application area is information retrieval and data mining of textual document collections, where word category maps can be used in document encoding, as has been demonstrated in the WEBSOM project. This paper concentrates on the question of how to create good word category maps and, specifically, how to compare different map instances. A general map comparison method that takes into account both the map topology and the nonlinearity of the SOM is used. The paper presents the results of comparisons in two experiments: the first concerns the number of words on a map, and the second the neighborhood type used in the SOM algorithm.