Improving Transferability of Representations via Augmentation-Aware Self-Supervision
Contents
- Abstract
- Auxiliary augmentation-aware self-supervision
- Summary
- Details
0. Abstract
Learning representations invariant to data augmentation (DA)
\(\leftrightarrow\) However, this can be harmful to certain downstream tasks
( if they rely on information that the augmentations discard, ex. color )
Solution : propose AugSelf
- optimize an auxiliary self-supervised loss
- learns to predict the difference between the augmentation params of the 2 views
1. Auxiliary augmentation-aware self-supervision
(1) Summary
AugSelf
- predicts the difference between the 2 augmentation params \(\omega_1\) & \(\omega_2\)
- encourages self-supervised representation learning methods ( ex. SimCLR, SimSiam ) to preserve augmentation-aware information that could be useful to downstream tasks
- negligible additional training cost
\(\rightarrow\) Add an auxiliary self-supervision loss, which learns to predict the difference between the augmentation parameters of 2 randomly augmented views
(2) Details
Notation
- \(t_\omega\) : augmentation function
- composed of different types of augmentations
- augmentation parameter : \(\omega=\left(\omega^{\text {aug }}\right)_{\operatorname{aug} \in \mathcal{A}}\)
- \(\mathcal{A}\) : the set of augmentations
- \(\omega^{\text {aug }}\) : augmentation-specific parameter
- \(\mathbf{v}_1=t_{\omega_1}(\mathbf{x})\) and \(\mathbf{v}_2=t_{\omega_2}(\mathbf{x})\) : 2 randomly augmented views
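Below is a minimal sketch ( not the authors' code ) of generating the two views \(\mathbf{v}_1, \mathbf{v}_2\) while recording the augmentation-specific parameters \(\omega^{\text {crop }}, \omega^{\text {flip }}, \omega^{\text {color }}\); the helper `random_view` and the exact crop/color parameterization are illustrative assumptions.

```python
import torch
import torchvision.transforms.functional as TF

def random_view(x):
    """Augment an image tensor x (C, H, W) and return the view together with
    its augmentation-specific parameters omega (crop / flip / color)."""
    _, h, w = x.shape
    # random resized crop: record normalized (top, left, height, width)
    ch, cw = (0.5 + 0.5 * torch.rand(2)).tolist()               # crop size ratios in [0.5, 1.0]
    th, tw = int(h * ch), int(w * cw)
    i = int(torch.randint(0, h - th + 1, (1,)))
    j = int(torch.randint(0, w - tw + 1, (1,)))
    v = TF.resized_crop(x, i, j, th, tw, size=[h, w])
    omega_crop = torch.tensor([i / h, j / w, th / h, tw / w])
    # random horizontal flip: record a binary parameter
    flip = bool(torch.rand(1) < 0.5)
    if flip:
        v = TF.hflip(v)
    omega_flip = torch.tensor([float(flip)])
    # color jittering: record 4 intensities normalized to [0, 1]
    omega_color = torch.rand(4)
    v = TF.adjust_brightness(v, 0.5 + float(omega_color[0]))    # factor in [0.5, 1.5]
    v = TF.adjust_contrast(v, 0.5 + float(omega_color[1]))
    v = TF.adjust_saturation(v, 0.5 + float(omega_color[2]))
    v = TF.adjust_hue(v, 0.2 * (float(omega_color[3]) - 0.5))   # hue shift in [-0.1, 0.1]
    return v, {"crop": omega_crop, "flip": omega_flip, "color": omega_color}

x = torch.rand(3, 224, 224)
v1, omega1 = random_view(x)  # v1 = t_{omega1}(x)
v2, omega2 = random_view(x)  # v2 = t_{omega2}(x)
```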
Loss function :
- \(\mathcal{L}_{\text {AugSelf }}\left(\mathbf{x}, \omega_1, \omega_2 ; \theta\right)=\sum_{\text {aug } \in \mathcal{A}_{\text {AugSelf }}} \mathcal{L}_{\text {aug }}\left(\phi_\theta^{\text {aug }}\left(f_\theta\left(\mathbf{v}_1\right), f_\theta\left(\mathbf{v}_2\right)\right), \omega_{\text {diff }}^{\text {aug }}\right)\).
- \(\mathcal{A}_{\text {AugSelf }} \subseteq \mathcal{A}\) : set of augmentations for augmentation-aware learning
- \(\omega_{\text {diff }}^{\text {aug }}\) : difference between two augmentation-specific parameters \(\omega_1^{\text {aug }}\) and \(\omega_2^{\text {aug }}\)
- \(\mathcal{L}_{\text {aug }}\) : augmentation-specific loss ( see the sketch below )
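A minimal PyTorch sketch of \(\mathcal{L}_{\text {AugSelf }}\), assuming the heads \(\phi_\theta^{\text {aug }}\) are small MLPs on the concatenated pair of features and that \(\mathcal{L}_{\text {aug }}\) is MSE for continuous targets and binary cross-entropy for the flip indicator; the class and function names are hypothetical, not the official implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AugSelfHead(nn.Module):
    """phi_theta^aug: small MLP on the concatenated pair of features."""
    def __init__(self, feat_dim, out_dim, hidden_dim=512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * feat_dim, hidden_dim), nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, z1, z2):
        return self.mlp(torch.cat([z1, z2], dim=1))

def augself_loss(f, heads, v1, v2, omega_diff):
    """L_AugSelf: sum of per-augmentation prediction losses.

    f: backbone f_theta, heads: dict aug -> AugSelfHead,
    omega_diff: dict aug -> target tensor of shape (B, d_aug).
    """
    z1, z2 = f(v1), f(v2)
    loss = 0.0
    for aug, head in heads.items():
        pred = head(z1, z2)
        if aug == "flip":   # binary target (same flip status or not) -> BCE
            loss = loss + F.binary_cross_entropy_with_logits(pred, omega_diff[aug])
        else:               # continuous target (crop / color differences) -> MSE
            loss = loss + F.mse_loss(pred, omega_diff[aug])
    return loss
```

For the augmentations used here, the head output dims would be 4 ( crop ), 1 ( flip ), and 4 ( color ).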
Easy to incorporate AugSelf into SOTA unsupervised learning methods
- with negligible additional training cost
ex) SimSiam + AugSelf
- loss function : \(\mathcal{L}_{\text {total }}\left(\mathbf{x}, \omega_1, \omega_2 ; \theta\right)=\mathcal{L}_{\text {SimSiam }}\left(\mathbf{x}, \omega_1, \omega_2 ; \theta\right)+\lambda \cdot \mathcal{L}_{\text {AugSelf }}\left(\mathbf{x}, \omega_1, \omega_2 ; \theta\right)\).
- Random Cropping : \(\omega_{\text {diff }}^{\text {crop }}=\omega_1^{\text {crop }}-\omega_2^{\text {crop }}\)
- Random Horizontal Flipping : \(\omega_{\text {diff }}^{\text {flip }}=\mathbb{1}\left[\omega_1^{\text {flip }}=\omega_2^{\text {flip }}\right]\)
- Color Jittering : \(\omega_{\text {diff }}^{\text {color }}=\omega_1^{\text {color }}-\omega_2^{\text {color }}\)
- normalize all intensities into \([0,1]\), so that \(\omega^{\text {color }} \in[0,1]^4\)
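Putting it together, a hedged sketch of the total objective \(\mathcal{L}_{\text {SimSiam }}+\lambda \cdot \mathcal{L}_{\text {AugSelf }}\) using the parameter differences above; it reuses `augself_loss` from the previous sketch, and `simsiam_loss` is assumed to be an existing SimSiam loss over the same backbone.

```python
import torch

def augment_param_diffs(omega1, omega2):
    """Targets omega_diff^aug for crop / flip / color, following the definitions above."""
    return {
        "crop":  omega1["crop"]  - omega2["crop"],            # difference of crop parameters
        "flip":  (omega1["flip"] == omega2["flip"]).float(),  # indicator: same flip status or not
        "color": omega1["color"] - omega2["color"],           # difference of jitter intensities in [0,1]^4
    }

def total_loss(f, heads, simsiam_loss, v1, v2, omega1, omega2, lam=1.0):
    """L_total = L_SimSiam + lambda * L_AugSelf."""
    omega_diff = augment_param_diffs(omega1, omega2)
    return simsiam_loss(v1, v2) + lam * augself_loss(f, heads, v1, v2, omega_diff)
```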