Markus Koskela: TRECVID 2005 abstract

PicSOM Experiments in TRECVID 2005

Markus Koskela, Jorma Laaksonen, Mats Sjöberg, and Hannes Muurinen. Online Proceedings of the TRECVID 2005 Workshop. Gaithersburg, MD, USA. November 2005.


Our experiments in TRECVID 2005 include participation in the high-level feature extraction and search tasks. In the high-level feature extraction task, we applied a method of representing semantic concepts as class models on a set of parallel Self-Organizing Maps (SOMs). We submitted one run, A_PicSOM_1, in which we applied a feature selection scheme for each concept separately. The results showed that the SOM-based class models can be used for representing semantic concepts on multimodal feature indices and that the proposed method is suitable for detecting video shots with specific semantic content.
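The abstract does not spell out how a class model is built on a SOM, so the following is only a minimal sketch under stated assumptions: each positive example of a concept is mapped to its best-matching unit, the hits are accumulated into a histogram, the histogram is smoothed over neighbouring map units, and a shot is scored by summing the class-model values over the parallel SOMs. The function names, the 4-neighbour smoothing with weight 0.5, the feature names, and the toy 2x2 codebook are all hypothetical and not taken from the paper.

```python
from math import dist

def best_matching_unit(codebook, x):
    """Return the grid position of the unit whose weight vector is closest to x."""
    return min(codebook, key=lambda rc: dist(codebook[rc], x))

def class_model(codebook, positives):
    """Build a smoothed, normalised hit histogram of a concept's positive examples.

    Assumption: smoothing adds half of the hits of the 4 grid neighbours.
    """
    hits = {rc: 0.0 for rc in codebook}
    for x in positives:
        hits[best_matching_unit(codebook, x)] += 1.0
    total = sum(hits.values()) or 1.0
    model = {}
    for (r, c) in codebook:
        neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
        near = sum(hits[n] for n in neighbours if n in hits)
        model[(r, c)] = (hits[(r, c)] + 0.5 * near) / total
    return model

def score(shot_features, soms, models):
    """Sum the class-model value at the shot's BMU over all parallel SOMs."""
    return sum(
        models[name][best_matching_unit(soms[name], shot_features[name])]
        for name in soms
    )

# Toy 2x2 map shared by two hypothetical feature types.
grid = {(0, 0): (0.0, 0.0), (0, 1): (0.0, 1.0),
        (1, 0): (1.0, 0.0), (1, 1): (1.0, 1.0)}
soms = {"color": grid, "motion": grid}
positives = [(0.1, 0.1), (0.0, 0.2), (0.9, 0.1)]
models = {name: class_model(soms[name], positives) for name in soms}

likely = {"color": (0.1, 0.0), "motion": (0.2, 0.1)}
unlikely = {"color": (0.9, 0.9), "motion": (1.0, 0.8)}
print(score(likely, soms, models) > score(unlikely, soms, models))
```

Keeping one class model per feature map lets each feature type contribute its own evidence, which is one simple way to read "parallel" here; a per-concept feature selection step would amount to choosing which entries of `soms` to sum over.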

In the search task, we submitted a total of seven runs (three automatic, three manual, and one interactive run). Our main motivation was to study the utilization of parallel multimodal features and class models compared to using only text-based queries. The overall settings for the runs were as follows:

F_A_1_SOM-F1_7: a baseline automatic run using only ASR/MT output
F_A_2_SOM-F2_3: an automatic run using ASR/MT output, multimodal features, and class models
F_A_2_SOM-F3_5: an automatic run using multimodal features and class models
M_A_1_SOM-M1_6: a baseline manual run using only ASR/MT output
M_A_2_SOM-M2_4: a manual run using ASR/MT output and multimodal features
M_A_2_SOM-M3_2: a manual run using ASR/MT output, multimodal features, and class models
I_A_2_SOM-I_1: an interactive run

In both the automatic and manual experiments, we observed that the proposed method successfully combines the text query, multimodal features, and class models. In both cases, the best overall results were obtained using all three information sources, with the mean average precision (MAP) nearly double that of text-only search. Our small-scale interactive search experiments were performed with our prototype retrieval interface, which supports only relevance-feedback-based retrieval. Still, the experiments demonstrate that the proposed method can also be used in an interactive setting, where the search is guided by iterative feedback from the user.
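For readers unfamiliar with the MAP measure used above, the following is a minimal sketch of non-interpolated average precision and its mean over topics. It is not the official evaluation tool, and the shot identifiers and relevance judgments are made up for illustration.

```python
def average_precision(ranked, relevant):
    """Non-interpolated average precision of one ranked result list."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, shot in enumerate(ranked, start=1):
        if shot in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant shot
    # Relevant shots that were never retrieved contribute zero.
    return precision_sum / len(relevant)

def mean_average_precision(topics):
    """Mean of average precision over (ranked, relevant) pairs, one per topic."""
    return sum(average_precision(r, rel) for r, rel in topics) / len(topics)

# Hypothetical results for two search topics.
topics = [
    (["shot_a", "shot_b", "shot_c", "shot_d"], {"shot_a", "shot_c"}),
    (["shot_b", "shot_a"], {"shot_a"}),
]
print(round(mean_average_precision(topics), 4))
```

Because each relevant shot contributes the precision at its rank, moving relevant shots toward the top of the list raises the score, which is why adding further information sources to a text-only query can change MAP substantially.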