Markus Koskela, Jorma Laaksonen, Mats Sjöberg, and Hannes Muurinen. Online Proceedings of the TRECVID 2005 Workshop. Gaithersburg, MD, USA. November 2005.
Our experiments in TRECVID 2005 include participation in the
high-level feature extraction and search tasks.
In the high-level feature extraction task, we applied a method of
representing semantic concepts as class models on a set of parallel
Self-Organizing Maps (SOMs). We submitted one run, A_PicSOM_1, in
which we applied a feature selection scheme for each concept separately.
The results showed that the SOM-based class models can be used for
representing semantic concepts on multimodal feature indices and that
the proposed method is suitable for detecting video shots with
specific semantic content.
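The abstract does not spell out how the class models are built, so the following is only a loose sketch of one way such a model could work, not the authors' implementation. It assumes that positive example shots of a concept are mapped to their best-matching units on each SOM, that the hits are spread to neighbouring units, and that a new shot is scored by summing the model values at its best-matching unit over the parallel maps. All names (best_matching_unit, class_model, score_shot), the neighbourhood weighting, and the toy data are hypothetical.

```python
def best_matching_unit(codebook, vector):
    """Return the index of the SOM unit whose weight vector is closest."""
    def sq_dist(weights):
        return sum((a - b) ** 2 for a, b in zip(weights, vector))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def class_model(codebook, grid_w, grid_h, positives, radius=1):
    """Accumulate hits of positive examples, spread over nearby units.

    The 1 / (1 + distance) weighting is an assumed smoothing kernel.
    """
    model = [0.0] * (grid_w * grid_h)
    for vec in positives:
        bmu = best_matching_unit(codebook, vec)
        bx, by = bmu % grid_w, bmu // grid_w
        for y in range(max(0, by - radius), min(grid_h, by + radius + 1)):
            for x in range(max(0, bx - radius), min(grid_w, bx + radius + 1)):
                dist = max(abs(x - bx), abs(y - by))
                model[y * grid_w + x] += 1.0 / (1 + dist)
    total = sum(model)
    return [v / total for v in model] if total else model


def score_shot(maps, shot_features):
    """Sum class-model values at the shot's BMU over all parallel SOMs."""
    return sum(
        m["model"][best_matching_unit(m["codebook"], shot_features[name])]
        for name, m in maps.items()
    )


# Toy example: a single 2x2 SOM over a one-dimensional feature.
codebook = [[0.0], [1.0], [2.0], [3.0]]
maps = {
    "toy_feature": {
        "codebook": codebook,
        "model": class_model(codebook, 2, 2, positives=[[0.1]]),
    }
}
near = score_shot(maps, {"toy_feature": [0.0]})  # close to the positive example
far = score_shot(maps, {"toy_feature": [3.0]})   # far from the positive example
```

In this toy setting the shot near the positive example receives the higher score, which is the behaviour a concept detector built this way would rely on.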
In the search task, we submitted a total of seven runs (three automatic, three manual, and one interactive run). Our main motivation was to study the utilization of parallel multimodal features and class models compared to using only text-based queries. The overall settings for the runs were as follows:
F_A_1_SOM-F1_7: a baseline automatic run using only ASR/MT output
F_A_2_SOM-F2_3: an automatic run using ASR/MT output, multimodal features, and class models
F_A_2_SOM-F3_5: an automatic run using multimodal features and class models
M_A_1_SOM-M1_6: a baseline manual run using only ASR/MT output
M_A_2_SOM-M2_4: a manual run using ASR/MT output and multimodal features
M_A_2_SOM-M3_2: a manual run using ASR/MT output, multimodal features, and class models
I_A_2_SOM-I_1: an interactive run
In both the automatic and manual experiments, we observed that the proposed method successfully combines the text query, multimodal features, and class models. In both cases, the best overall results were obtained using all three information sources, with the mean average precision (MAP) nearly double that of text-only search. Our small-scale interactive search experiments were performed with our prototype retrieval interface, which supports only relevance-feedback-based retrieval. Still, the experiments demonstrate that the proposed method can also be used in an interactive setting, where the search is guided by iterative feedback from the user.
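For reference, MAP is the mean, over search topics, of the non-interpolated average precision of each ranked result list. A minimal sketch of that computation follows. The function names and the toy shot identifiers are illustrative assumptions, and this is not the official TRECVID evaluation tooling.

```python
def average_precision(ranked_ids, relevant_ids):
    """Non-interpolated average precision of one ranked result list."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, shot_id in enumerate(ranked_ids, start=1):
        if shot_id in relevant:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    # Normalise by all relevant shots, so missed shots lower the score.
    return precision_sum / len(relevant)


def mean_average_precision(topics):
    """topics: list of (ranked_ids, relevant_ids) pairs, one per topic."""
    if not topics:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in topics) / len(topics)


# Toy data: two topics with made-up shot identifiers.
topics = [
    (["s1", "s2", "s3", "s4"], {"s1", "s3"}),  # AP = (1/1 + 2/3) / 2
    (["a", "b"], {"b"}),                       # AP = (1/2) / 1
]
map_score = mean_average_precision(topics)
```

Because each topic's average precision is normalised by the number of relevant shots, doubling MAP means relevant shots are both found more often and ranked noticeably higher across topics.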