S\(^4\)L : Self-Supervised Semi-Supervised Learning
Contents
- Abstract
- Introduction
- Related Work
- Semi-supervised Learning
- Self-supervised Learning
- Methods
- Self-supervised Semi-supervised Learning
- Semi-supervised Baselines
0. Abstract
propose the framework of \(S^4L\) ( = Self-Supervised Semi-Supervised Learning )
\(\rightarrow\) use it to derive 2 novel semi-supervised image classification methods
1. Introduction
- Hypothesize that self-supervised learning techniques could dramatically benefit from a small amount of labeled examples
- Bridge self-supervised & semi-supervised learning
2. Related Work
(1) Semi-supervised Learning
- use both labeled & unlabeled datasets
- standard protocol for evaluating semi-supervised algorithms :
- (1) start with LABELED dataset
- (2) keep only portion of labels
- (3) treat the rest as UNLABELED
- add consistency regularization losses on the unlabeled data
- measure the discrepancy between predictions made on perturbed unlabeled data points ( see the sketch below )
- result :
- by minimizing this loss, models implicitly push the decision boundary away from high-density regions of the unlabeled data
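A minimal sketch of such a consistency loss, assuming a PyTorch classifier `model` that returns logits and a stochastic perturbation `augment` (both names are placeholders, not from the paper):

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, x_u, augment):
    # prediction on the clean unlabeled batch, used as the fixed target
    with torch.no_grad():
        p_clean = F.softmax(model(x_u), dim=1)
    # prediction on a randomly perturbed copy of the same batch
    log_p_pert = F.log_softmax(model(augment(x_u)), dim=1)
    # KL(p_clean || p_pert) : minimizing it makes predictions stable under
    # perturbation, i.e. pushes the boundary out of dense unlabeled regions
    return F.kl_div(log_p_pert, p_clean, reduction="batchmean")
```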
2 additional approaches for semi-supervised learning :
- (1) Pseudo-Labeling
- imputes approximate classes on unlabeled data
- model = trained on ONLY the LABELED dataset
- (2) conditional entropy minimization
- UNLABELED data : model is encouraged to make confident predictions on some class
(2) Self-supervised Learning
- various pretext ( surrogate ) tasks
- use only unlabeled data
3. Methods
focus on the semi-supervised image classification problem
Notation
- assume an (unknown) data distribution : \(p(X, Y)\)
- labeled training set : \(D_l\) ……. sampled from \(p(X, Y)\)
- unlabeled training set : \(D_u\) ……. sampled from \(p(X)\)
Objective Function :
- \(\min _\theta \mathcal{L}_l\left(D_l, \theta\right)+w \mathcal{L}_u\left(D_u, \theta\right)\).
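A sketch of one training step on this joint objective; it assumes \(\mathcal{L}_l\) is the usual cross-entropy on labeled data and `unsup_loss_fn` stands in for whichever \(\mathcal{L}_u\) is chosen below (names are illustrative):

```python
import torch.nn.functional as F

def joint_loss(model, x_l, y_l, x_u, w, unsup_loss_fn):
    loss_l = F.cross_entropy(model(x_l), y_l)   # supervised term L_l
    loss_u = unsup_loss_fn(model, x_u)          # unsupervised term L_u
    return loss_l + w * loss_u                  # min_theta L_l + w * L_u
```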
(1) Self-supervised Semi-supervised Learning
2 prominent self-supervised techniques :
- (1) predicting image rotation
- (2) exemplar
a) \(S^4L\)-Rotation
rotation angles : ( 0°, 90°, 180°, 270° ) \(\rightarrow\) 4-class classification
( also apply it to LABELED datasets )
Loss function ( sketched below ) : \(\mathcal{L}_{\mathrm{rot}}=\frac{1}{\lvert \mathcal{R} \rvert} \sum_{r \in \mathcal{R}} \sum_{x \in D_u} \mathcal{L}\left(f_\theta\left(x^r\right), r\right)\)
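A minimal sketch of \(\mathcal{L}_{\mathrm{rot}}\), assuming a PyTorch `model` ending in the 4-way rotation head and NCHW batches (a sketch under these assumptions, not the paper's exact implementation):

```python
import torch
import torch.nn.functional as F

def rotation_loss(model, x):
    # x : (N, C, H, W) batch; target class r encodes a rotation of r * 90 degrees
    losses = []
    for r in range(4):
        x_rot = torch.rot90(x, k=r, dims=[2, 3])   # rotate in the H,W plane
        target = torch.full((x.size(0),), r, dtype=torch.long, device=x.device)
        losses.append(F.cross_entropy(model(x_rot), target))
    return sum(losses) / len(losses)               # the 1/|R| average
```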
b) \(S^4L\)-Exemplar
Cropping … produces 8 different instances of each image
implement \(\mathcal{L}_u\) as the batch hard triplet loss with a soft margin
- applied to all 8 instances of each image ( see the sketch below )
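A sketch of the batch-hard triplet loss with soft margin, assuming the batch is built so that every source image contributes its 8 augmented crops (`emb`, `image_ids` are illustrative names):

```python
import torch
import torch.nn.functional as F

def batch_hard_triplet_soft_margin(emb, image_ids):
    # emb : (N, D) embeddings of all crops; image_ids : (N,) source-image ids,
    # so the 8 crops of one image share an id
    dist = torch.cdist(emb, emb)                   # pairwise Euclidean distances
    same = image_ids.unsqueeze(0) == image_ids.unsqueeze(1)
    self_mask = torch.eye(len(emb), dtype=torch.bool, device=emb.device)
    pos = same & ~self_mask                        # other crops of the same image
    neg = ~same                                    # crops of different images
    d_ap = dist.masked_fill(~pos, float("-inf")).max(dim=1).values  # hardest positive
    d_an = dist.masked_fill(~neg, float("inf")).min(dim=1).values   # hardest negative
    # soft margin : log(1 + exp(d_ap - d_an)) instead of a fixed hinge margin
    return F.softplus(d_ap - d_an).mean()
```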
(2) Semi-supervised Baselines
a) Virtual Adversarial Training (VAT)
- idea ) make the predicted label ROBUST against local perturbations around each input data point
- VAT loss for model \(f_\theta\) :
\[\mathcal{L}_{\mathrm{vat}}=\frac{1}{\lvert \mathcal{D}_u \rvert} \sum_{x \in \mathcal{D}_u} \mathrm{KL}\left(f_\theta(x) \,\|\, f_\theta(x+\Delta x)\right)\]
where \(\Delta x=\underset{\|\delta\| \leq \epsilon}{\arg \max }\, \mathrm{KL}\left(f_\theta(x) \,\|\, f_\theta(x+\delta)\right)\)
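A sketch of \(\mathcal{L}_{\mathrm{vat}}\) using the standard one-step power-iteration approximation of \(\Delta x\) from the original VAT paper; NCHW batches are assumed, and `xi` / `eps` are illustrative hyperparameters, not tuned settings:

```python
import torch
import torch.nn.functional as F

def vat_loss(model, x, xi=1e-6, eps=8.0):
    with torch.no_grad():
        p = F.softmax(model(x), dim=1)             # f_theta(x), held fixed
    # one power-iteration step to approximate the worst-case direction delta x
    d = torch.randn_like(x)
    d = xi * d / d.flatten(1).norm(dim=1).view(-1, 1, 1, 1)
    d.requires_grad_(True)
    kl = F.kl_div(F.log_softmax(model(x + d), dim=1), p, reduction="batchmean")
    grad = torch.autograd.grad(kl, d)[0]
    with torch.no_grad():
        delta = eps * grad / (grad.flatten(1).norm(dim=1).view(-1, 1, 1, 1) + 1e-12)
    # KL( f_theta(x) || f_theta(x + delta x) ), the VAT consistency term
    return F.kl_div(F.log_softmax(model(x + delta), dim=1), p, reduction="batchmean")
```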
b) Conditional Entropy Minimization (EntMin)
- assumption ) unlabeled data indeed belongs to one of the classes that we are training on
- adds a loss for unlabeled data that, when minimized,
\(\rightarrow\) encourages the model to make CONFIDENT predictions on the unlabeled dataset
- \(\mathcal{L}_{\text {entmin }}=\frac{1}{\lvert \mathcal{D}_u \rvert} \sum_{x \in \mathcal{D}_u} \sum_{y \in Y}-f_\theta(y \mid x) \log f_\theta(y \mid x)\)
\(\rightarrow\) consider both losses ( sketched below ) : \(\mathcal{L}_u=w_{\mathrm{vat}} \mathcal{L}_{\mathrm{vat}}+w_{\text {entmin }} \mathcal{L}_{\text {entmin }}\)
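A minimal sketch of \(\mathcal{L}_{\text{entmin}}\) and the weighted combination (the weights `w_vat`, `w_entmin` are placeholders):

```python
import torch
import torch.nn.functional as F

def entmin_loss(model, x):
    # conditional entropy of the predicted class distribution;
    # minimizing it rewards confident predictions on unlabeled data
    log_p = F.log_softmax(model(x), dim=1)
    return -(log_p.exp() * log_p).sum(dim=1).mean()

# combined unsupervised term, reusing the vat_loss sketch above :
# loss_u = w_vat * vat_loss(model, x_u) + w_entmin * entmin_loss(model, x_u)
```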
c) Pseudo-Label
1) train the model only on LABELED data
2) then, make predictions on UNLABELED data
3) predictions whose confidence is above a certain threshold
\(\rightarrow\) add them ( with their predicted classes ) to the training data
4) retrain the model ( see the sketch below )
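A sketch of the labeling step, assuming a PyTorch classifier already trained on \(D_l\); the confidence threshold is an illustrative choice:

```python
import torch

@torch.no_grad()
def pseudo_label(model, x_u, threshold=0.95):
    probs = torch.softmax(model(x_u), dim=1)
    conf, labels = probs.max(dim=1)        # confidence & predicted class
    keep = conf >= threshold               # keep only confident predictions
    # selected inputs + imputed labels, to be added to the
    # training set before retraining the model
    return x_u[keep], labels[keep]
```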