Markus Koskela: TRECVID 2007 abstract

PicSOM Experiments in TRECVID 2007

Markus Koskela, Mats Sjöberg, Ville Viitaniemi, Jorma Laaksonen, and Philip Prentis. Online Proceedings of the TRECVID 2007 Workshop. Gaithersburg, MD, USA. November 2007.


Our experiments in TRECVID 2007 include participation in the high-level feature extraction, search, and video summarization tasks, using a common system framework based on multiple parallel Self-Organizing Maps (SOMs).

In the high-level feature extraction task, we applied a method of representing semantic concepts as class models on parallel SOMs, combined with external text search results. This year, we introduced a further post-processing stage that analyzes the temporal and inter-concept co-occurrences of the concepts. We submitted the following six runs:

A_PicSOM_1_6: Required visual baseline
A_PicSOM_2_5: Visual features and text search
A_PicSOM_3_3: Visual features using variable convolution and text search
A_PicSOM_4_4: Visual features using variable convolution
A_PicSOM_5_2: Visual features, text search, and temporal context based on training set
A_PicSOM_6_1: Visual features, text search, and temporal context based on validation set
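
As a rough illustration of representing a concept as a class model on parallel SOMs, the following minimal sketch marks the map units hit by a concept's training examples, spreads those hits to neighbouring units, and scores an item by summing the model values at its units over several maps. It is a sketch under stated assumptions, not the PicSOM implementation: the function names, the triangular kernel, the `radius` parameter, and the premise that best-matching units are already computed are all hypothetical.

```python
def class_model(positive_bmus, rows, cols, radius=1):
    """Build a smoothed hit map for one concept on one SOM grid.

    positive_bmus: (row, col) best-matching units of the concept's training
    examples (assumed to be computed elsewhere).
    """
    hits = [[0.0] * cols for _ in range(rows)]
    for r, c in positive_bmus:
        hits[r][c] += 1.0

    model = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        # Triangular weight (an assumption): 1 at the centre,
                        # decreasing linearly with grid distance.
                        dist = max(abs(dr), abs(dc))
                        total += hits[rr][cc] * (1.0 - dist / (radius + 1))
            model[r][c] = total
    return model


def concept_score(models, bmus):
    """Sum an item's class-model values over several parallel SOMs."""
    return sum(m[r][c] for m, (r, c) in zip(models, bmus))
```

Here each parallel SOM would correspond to one feature type, so `models` and `bmus` hold one entry per map. Varying `radius` changes how far the positive examples spread their influence on the map.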

The results show that the temporal and inter-concept co-occurrence analysis improved performance considerably. Including the text search, on the other hand, worsened the results, and this degradation carried over to the subsequent runs as well. For this reason, we later executed additional runs in which the co-occurrence post-processing stage was employed without the text search.
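
To make the idea of temporal context concrete, here is a minimal sketch in which each shot's concept score is blended with the mean score of its temporally neighbouring shots. This is only an illustration under stated assumptions: the function name, the fixed `window`, and the linear blending `weight` are hypothetical, and it does not reproduce the co-occurrence analysis used in the submitted runs.

```python
def temporal_smooth(scores, window=1, weight=0.5):
    """Blend each shot's score with the mean score of its temporal neighbours.

    scores: per-shot scores for one concept, in temporal order.
    window: number of neighbouring shots considered on each side (assumption).
    weight: share of the final score taken from the neighbours (assumption).
    """
    n = len(scores)
    smoothed = []
    for i, score in enumerate(scores):
        lo = max(0, i - window)
        hi = min(n, i + window + 1)
        neighbours = [scores[j] for j in range(lo, hi) if j != i]
        if neighbours:
            context = sum(neighbours) / len(neighbours)
            smoothed.append((1 - weight) * score + weight * context)
        else:
            # An isolated shot has no context, so its score is kept as-is.
            smoothed.append(score)
    return smoothed
```

In such a scheme, parameters like `window` and `weight` would have to be estimated from annotated data, for example a training or validation set.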

In the search task, we submitted a total of six fully automatic runs. This year, we augmented the baseline ASR/MT search and content-based retrieval runs with high-level semantic concepts and pseudo relevance feedback. The overall settings for the six runs were as follows:

F_A_1_PicSOM_1_6: Required text search baseline
F_A_1_PicSOM_2_5: Required visual baseline
F_A_2_PicSOM_3_4: Text search and visual features
F_A_2_PicSOM_4_3: Text search and visual features, with pseudo relevance feedback
F_A_2_PicSOM_5_2: Text search and visual features, semantic concepts
F_A_2_PicSOM_6_1: Text search and visual features, semantic concepts, with pseudo relevance feedback

In this year's experiments, retrieval based on the visual features performed very poorly; consequently, the text baseline also outperformed the combined run using both visual and text features. In the further experiments, including both the semantic concepts and pseudo relevance feedback improved performance.
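
The general idea of pseudo relevance feedback can be sketched as follows: the top-ranked shots of an initial run are treated as if they were relevant, and all shots are then re-scored by their similarity to those pseudo-positive examples. The sketch below swaps in a simple centroid-and-cosine rescoring for illustration; it is not the SOM-based mechanism of the PicSOM system, and the function names and the parameters `k` and `alpha` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pseudo_relevance_feedback(initial_scores, features, k=2, alpha=0.5):
    """Re-score shots using the top-k initial results as pseudo-positive examples.

    initial_scores: dict mapping shot id -> score from the initial run.
    features: dict mapping shot id -> feature vector (list of floats).
    k, alpha: number of pseudo-positives and blending weight (assumptions).
    """
    ranked = sorted(initial_scores, key=initial_scores.get, reverse=True)
    top = ranked[:k]
    if not top:
        return dict(initial_scores)

    # Centroid of the pseudo-positive feature vectors.
    dim = len(features[top[0]])
    centroid = [sum(features[s][d] for s in top) / len(top) for d in range(dim)]

    return {
        shot: (1 - alpha) * score + alpha * cosine(features[shot], centroid)
        for shot, score in initial_scores.items()
    }
```

Because the pseudo-positives are not verified, the outcome depends on how good the initial ranking is: a poor initial run feeds mostly irrelevant examples back into the second pass.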