Next: Towards Real-Time Image Understanding
Up: Summary of References Related
Previous: On the saddle point
Contents
Face representation is a crucial step of face recognition systems. An optimal face representation should be discriminative, robust, compact, and very easy to implement. While numerous hand-crafted and learning-based representations have been proposed, considerable room for improvement is still present. In this paper, we present a very easy-to-implement deep learning framework for face representation. Our method is based on a new deep network structure (called Pyramid CNN). The proposed Pyramid CNN adopts a greedy-filter-and-down-sample operation, which enables the training procedure to be very fast and computation-efficient. In addition, the structure of Pyramid CNN can naturally incorporate feature sharing across multi-scale face representations, increasing the discriminative ability of the resulting representation. Our basic network is capable of achieving high recognition accuracy (85.8
- New deep structure: Pyramid CNN
- Labeled Faces in the Wild (LFW)
- faces
- 1680 of the people have two or more distinct photos
- Detected by Viola-Jones detector
- http://vis-www.cs.umass.edu/lfw/
- State-of-the-art performance on the LFW benchmark
- Good face representation
- Identity-preserving: Same person pictures close in feature space
- Abstract and Compact: from high to low dimensionality
- Uniform and Automatic: no hand-crafted or hard-wired parts
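The identity-preserving property (pictures of the same person lie close together in feature space) can be illustrated with a pairwise loss on feature distances. The sketch below is an assumption made for illustration and is not claimed to be the paper's exact objective: it applies a logistic loss to the squared Euclidean distance, and the names `pair_loss`, `alpha`, and `beta` are hypothetical.

```python
import numpy as np

def pair_loss(f1, f2, same, alpha=1.0, beta=1.0):
    """Logistic loss on the squared distance between two feature vectors.

    same=True  -> the loss is small when the features are close.
    same=False -> the loss is small when the features are far apart.
    alpha (scale) and beta (distance threshold) are illustrative values.
    """
    d = np.sum((np.asarray(f1, float) - np.asarray(f2, float)) ** 2)
    sign = 1.0 if same else -1.0
    # log(1 + exp(x)), computed in a numerically stable way
    return float(np.logaddexp(0.0, sign * alpha * (d - beta)))

# Toy features: a and b are close, c is far from both
a, b, c = [0.0, 0.0], [0.1, 0.0], [3.0, 0.0]
close_same = pair_loss(a, b, same=True)    # low: same identity, close features
far_same = pair_loss(a, c, same=True)      # high: same identity, distant features
far_diff = pair_loss(a, c, same=False)     # low: different identity, distant features
```

Minimizing such a loss over many labeled pairs pulls same-identity features together and pushes different-identity features apart, which is one way to obtain the property listed above.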
- Pyramid CNN
- ID-preserving representation learning: the loss function measures
distance in the output feature space
- Convolutions and Down-sampling
- Deeper networks give the best results, but training time increases rapidly
- Each CNN has its own private output layer and takes its input from the previous
shared layer
- Only the output of the last-level network is used for the representation
- The remaining outputs are used only for training
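The level structure described above (shared filter-and-down-sample layers, one private output layer per level, only the last output kept) can be sketched as follows. This is a minimal single-channel illustration under stated assumptions, not the paper's implementation: weights are random, no training is performed, and `PyramidLevel`, `build_pyramid`, `represent`, the kernel size, and the output dimension are all hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv_and_downsample(image, kernel):
    """Valid 2-D filtering (cross-correlation), ReLU, then 2x2 max-pooling."""
    windows = sliding_window_view(image, kernel.shape)  # (H-k+1, W-k+1, k, k)
    fmap = np.maximum(np.einsum("ijkl,kl->ij", windows, kernel), 0.0)
    h, w = (fmap.shape[0] // 2) * 2, (fmap.shape[1] // 2) * 2  # crop to even size
    return fmap[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

class PyramidLevel:
    """One level: a shared filtering layer plus a private output layer."""

    def __init__(self, rng, kernel_size, in_size, out_dim):
        self.kernel = rng.standard_normal((kernel_size, kernel_size))  # shared
        self.out_size = (in_size - kernel_size + 1) // 2
        self.head = rng.standard_normal((out_dim, self.out_size ** 2))  # private

    def shared(self, x):
        return conv_and_downsample(x, self.kernel)

    def output(self, x):
        return self.head @ self.shared(x).ravel()

def build_pyramid(rng, in_size, n_levels, kernel_size=3, out_dim=8):
    levels, size = [], in_size
    for _ in range(n_levels):
        level = PyramidLevel(rng, kernel_size, size, out_dim)
        # A real system would train `level` here (using its private output),
        # keeping the shared layers of earlier levels fixed.
        levels.append(level)
        size = level.out_size
    return levels

def represent(levels, image):
    """Pass through the earlier shared layers; use only the last level's output."""
    x = image
    for level in levels[:-1]:
        x = level.shared(x)  # earlier private outputs are skipped at inference
    return levels[-1].output(x)

rng = np.random.default_rng(0)
levels = build_pyramid(rng, in_size=32, n_levels=2)
features = represent(levels, rng.standard_normal((32, 32)))
```

Because each new level only receives the already filtered and down-sampled output of the earlier shared layers, the earlier levels do not need to be retrained when the pyramid grows, which is the intuition behind the fast greedy training.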
- Results
- 164 incorrect predictions
- Some of them are incorrectly labeled
- Others are very difficult even for humans because of age or pose differences
- On the LFW benchmark, achieves state-of-the-art performance, close to human performance on cropped images
- With the ROC curve as a measure, there is an improvement of 0.07-0.12 over the baseline
- Face recognition does not contemplate affine transformations or perspective changes
- Can be difficult to apply to tasks such as ImageNet, where the object can be
in any place and position
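The ROC-based comparison mentioned in the results can be sketched for face verification as follows. This is a generic illustration, not the evaluation code behind the numbers above: a pair is accepted as "same person" when its feature distance is at most a threshold, and `roc_points` and `area_under_curve` are hypothetical helper names. The toy distances are made up for the example.

```python
import numpy as np

def roc_points(distances, same):
    """Return (false_positive_rate, true_positive_rate) arrays.

    A pair is predicted "same person" when distance <= threshold;
    every observed distance is tried as a threshold.
    Both classes must be present in `same`.
    """
    distances = np.asarray(distances, dtype=float)
    same = np.asarray(same, dtype=bool)
    thresholds = np.unique(distances)  # sorted ascending
    pred = distances[None, :] <= thresholds[:, None]  # (n_thresholds, n_pairs)
    tpr = (pred & same).sum(axis=1) / same.sum()
    fpr = (pred & ~same).sum(axis=1) / (~same).sum()
    # Prepend the (0, 0) corner, reached when no pair is accepted
    return np.concatenate(([0.0], fpr)), np.concatenate(([0.0], tpr))

def area_under_curve(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Toy pairs: same-person pairs have small distances, so they separate perfectly
fpr, tpr = roc_points([0.1, 0.2, 0.8, 0.9], [True, True, False, False])
auc = area_under_curve(fpr, tpr)
```

Comparing two representations then amounts to computing both curves on the same pairs and checking which one reaches a higher true-positive rate at each false-positive rate.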
Miquel Perello Nieto
2014-11-28