Siamese Neural Networks for One-shot Image Recognition (2015)
Contents
- Abstract
- Approach
- Deep Siamese Networks for Image Verification
- Model
- Learning
0. Abstract
Siamese NN : learns to rank similarity between inputs
- generalizes not just to new data
- but also to entirely “new classes”
1. Approach
learn image representations, via supervised “metric-based approach”
with “SIAMESE neural network”
& reuse network’s features for one-shot learning
Employ “Siamese CNN”
- 1) capable of learning generic “image feature”
- 2) easily trained using standard optimization techniques
- 3) does NOT rely upon domain-specific knowledge
The pairing with the HIGHEST score ( according to the verification network )
\(\rightarrow\) is assigned the HIGHEST probability for the one-shot task
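The selection rule above can be sketched as an argmax over pairings. This is only an illustration: `one_shot_predict` and `toy_score` are hypothetical names, and `toy_score` is a hand-written stand-in for the trained verification network, not the model from the paper.

```python
def one_shot_predict(test_item, support_set, score):
    """Return the index of the support example whose pairing with
    test_item gets the highest verification score."""
    scores = [score(test_item, s) for s in support_set]
    return max(range(len(scores)), key=scores.__getitem__)


def toy_score(a, b):
    # Assumption: stand-in for the verification network.
    # Higher score = more similar (inverse of the L1 distance).
    return 1.0 / (1.0 + sum(abs(x - y) for x, y in zip(a, b)))


# One example per class (the "one shot" for each class).
support = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
pred = one_shot_predict((0.9, 1.2), support, toy_score)
print(pred)  # index of the closest support example
```

In a real setup, `score` would be the trained Siamese network applied to the image pair.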
2. Deep Siamese Networks for Image Verification
Siamese Net : first introduced in the 1990s
- consists of twin networks
This paper
- use weighted \(L_1\) distance between twin feature vectors \(\mathbf{h_1}\) & \(\mathbf{h_2}\)
- uses a sigmoid activation
- cross-entropy objective
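A minimal sketch of the prediction step described above, i.e. \(p=\sigma\left(\sum_j \alpha_j \mid h_1^{(j)}-h_2^{(j)} \mid \right)\), in plain Python. The function name `similarity` and the example values are assumptions for illustration. In the actual model, the weights \(\alpha_j\) are learned and \(\mathbf{h_1}\), \(\mathbf{h_2}\) come from the twin networks.

```python
import math


def similarity(h1, h2, alpha):
    """Sigmoid of the weighted L1 distance between two feature vectors."""
    z = sum(a * abs(u - v) for a, u, v in zip(alpha, h1, h2))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical values: negative weights make larger distances score lower.
alpha = [-1.0, -0.5]
p_same = similarity([1.0, 2.0], [1.0, 2.0], alpha)  # identical features
p_diff = similarity([0.0, 0.0], [1.0, 3.0], alpha)  # distant features
print(p_same, p_diff)
```

This sketch has no bias term, so identical feature vectors always score exactly 0.5.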
2-1. Model
2-2. Learning
(1) Loss Function
Notation
- \(M\) : minibatch size
- \(\mathbf{y}\left(x_{1}^{(i)}, x_{2}^{(i)}\right)\) : length-\(M\) vector which contains the labels for the minibatch
- \(y\left(x_{1}^{(i)}, x_{2}^{(i)}\right)=1\) : same class
- \(y\left(x_{1}^{(i)}, x_{2}^{(i)}\right)=0\) : different class
Regularized CE
- \(\mathcal{L}\left(x_{1}^{(i)}, x_{2}^{(i)}\right)=\mathbf{y}\left(x_{1}^{(i)}, x_{2}^{(i)}\right) \log \mathbf{p}\left(x_{1}^{(i)}, x_{2}^{(i)}\right)+\left(1-\mathbf{y}\left(x_{1}^{(i)}, x_{2}^{(i)}\right)\right) \log \left(1-\mathbf{p}\left(x_{1}^{(i)}, x_{2}^{(i)}\right)\right)+\boldsymbol{\lambda}^{T}\lvert\mathbf{w}\rvert^{2}\)
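A rough sketch of the regularized cross-entropy over a minibatch. It makes three assumptions that the notes do not state:
- the log-likelihood terms are negated, so the value is minimized (the usual binary cross-entropy convention)
- the data term is averaged over the \(M\) pairs
- \(\boldsymbol{\lambda}^{T} \lvert \mathbf{w} \rvert^{2}\) is read as one coefficient per layer times that layer's squared weight norm

The name `regularized_bce` and all numeric values are hypothetical.

```python
import math


def regularized_bce(y, p, layer_weights, lam, eps=1e-12):
    """Mean binary cross-entropy over the minibatch plus a per-layer L2 penalty."""
    m = len(y)
    nll = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1.0 - eps)  # clamp to avoid log(0)
        nll -= yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
    # Assumption: lam[k] scales the squared L2 norm of layer k's weights.
    penalty = sum(l * sum(w * w for w in ws)
                  for l, ws in zip(lam, layer_weights))
    return nll / m + penalty


y = [1, 0]       # 1 : same class, 0 : different class
p = [0.9, 0.1]   # predicted similarity for each pair
loss = regularized_bce(y, p, layer_weights=[[1.0, 2.0]], lam=[0.1])
print(loss)
```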