Convolutional Deep Belief Networks on CIFAR-10 [49]

Original Abstract

We describe how to train a two-layer convolutional Deep Belief Network (DBN) on the 1.6 million tiny images dataset. When training a convolutional DBN, one must decide what to do with the edge pixels of the images. As the pixels near the edge of an image contribute to the fewest convolutional filter outputs, the model may see it fit to tailor its few convolutional filters to better model the edge pixels. This is undesirable because it usually comes at the expense of a good model for the interior parts of the image. We investigate several ways of dealing with the edge pixels when training a convolutional DBN. Using a combination of locally-connected convolutional units and globally-connected units, as well as a few tricks to reduce the effects of overfitting, we achieve state-of-the-art performance in the classification task of the CIFAR-10 subset of the tiny images dataset.
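
The abstract's remark that edge pixels contribute to the fewest convolutional filter outputs can be checked with a small counting sketch. The following is only an illustration, not the paper's code: the function name coverage_counts is hypothetical, the 32x32 image size matches CIFAR-10, and the 8x8 filter size is an assumed value chosen for the example. It counts, for every pixel, how many filter placements cover it when the filter is only placed fully inside the image (a "valid" convolution).

```python
def coverage_counts(image_size, filter_size):
    """For each pixel, count the 'valid' filter placements that cover it.

    Hypothetical helper for illustration; assumes a square image, a square
    filter, stride 1, and no padding.
    """
    n_out = image_size - filter_size + 1  # filter positions along each axis
    counts = [[0] * image_size for _ in range(image_size)]
    for i in range(n_out):            # top row of the filter placement
        for j in range(n_out):        # left column of the filter placement
            for di in range(filter_size):
                for dj in range(filter_size):
                    counts[i + di][j + dj] += 1
    return counts


# 32x32 is the CIFAR-10 image size; the 8x8 filter is an assumption.
counts = coverage_counts(image_size=32, filter_size=8)
print("corner pixel:", counts[0][0])      # covered by a single placement
print("edge pixel:", counts[0][16])       # covered by 8 placements
print("interior pixel:", counts[16][16])  # covered by 64 placements
```

Under these assumptions a corner pixel appears in one filter output while an interior pixel appears in 64, which is the imbalance the abstract refers to when it discusses how to treat edge pixels.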

Main points

Detectors
- Harris3D
- Cuboid
- Hessian
- Dense sampling
Descriptors
- HOG/HOF
- HOG3D
- ESURF (extended SURF)
Datasets
- KTH actions
  - 6 human action classes
  - walking, jogging, running, boxing, waving and clapping
  - 25 subjects
  - 4 scenarios
  - 2391 video samples
- UCF sport actions
  - 10 human action classes
  - swinging (on the pommel horse and on the floor), diving, kicking, weight-lifting, horse-riding, running, skateboarding, swinging (at the high bar), golf swinging and walking
  - 150 video samples
- Hollywood2 actions
  - 12 action classes
  - answering the phone, driving car, eating, fighting, getting out of the car, hand shaking, hugging, kissing, running, sitting down, sitting up, and standing up
  - 69 Hollywood movies
  - 1707 video samples

Miquel Perello Nieto 2014-11-28