Meta-Weight-Net: Learning an Explicit Mapping For Sample Weighting

Contents

  0. Abstract
  1. Introduction
  2. The Proposed Meta-Weight-Net Learning Method
    2-1. Meta-learning objective
    2-2. MW-Net learning method
    2-3. Algorithm


0. Abstract

DNN : prone to overfitting to biased training data!

\(\rightarrow\) commonly used remedy : the re-weighting strategy


Re-weighting strategy

  • 1) mapping from “training loss” to “sample weight”
  • 2) iterate between “weight recalculating” & “classifier updating”


Recent Approaches

need to manually PRE-specify the …

  • 1) weighting function

  • 2) hyperparameters


Proposal

method capable of ADAPTIVELY learning an

EXPLICIT weighting function directly from DATA

( = use MLP with one hidden layer )


1. Introduction

Sample re-weighting approach

Two contradictory ideas for constructing the loss-weight mapping ( a toy sketch of both follows the figure below )

[ Method 1 ] monotonically INCREASING

  • the larger a sample's loss, the higher its weight \(\uparrow\)
  • ex) AdaBoost, hard negative mining, focal loss


[ Method 2 ] monotonically DECREASING

  • the smaller a sample's loss, the higher its weight \(\uparrow\)
  • ex) self-paced learning (SPL), iterative reweighting


[ Figure: the two weighting-function designs, monotonically increasing vs. monotonically decreasing ]
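To make the contrast concrete, here is a toy sketch of the two designs as explicit weight-vs-loss functions ( simplified forms: the focal-loss weight assumes cross-entropy loss, so \(p = e^{-L}\); SPL uses its standard hard 0/1 weights ):

```python
import numpy as np

def focal_style_weight(loss, gamma=2.0):
    # [ Method 1 ] monotonically increasing: harder samples (larger loss)
    # get larger weights. For cross-entropy, focal loss uses (1 - p)^gamma
    # with p = exp(-loss).
    return (1.0 - np.exp(-loss)) ** gamma

def spl_weight(loss, lam=1.0):
    # [ Method 2 ] monotonically decreasing: self-paced learning keeps only
    # "easy" samples whose loss is below the threshold lambda (0/1 weights).
    return (loss < lam).astype(float)

losses = np.linspace(0.0, 3.0, 7)
print(focal_style_weight(losses))  # increases with the loss
print(spl_weight(losses))          # 1 for easy samples, 0 for hard ones
```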


Problems with existing methods

(1) need to manually set a specific form of weighting function, based on certain assumptions on the training data

\(\rightarrow\) infeasible in practice, where we have only little knowledge about the underlying data

(2) hyperparameters also have to be manually tuned


Proposal

“Adaptive” sample weighting strategy

  • automatically learn “explicit weighting function” from data
  • main idea : parameterize the weighting function as MLP
  • propose “META-WEIGHT-NET”


2. The Proposed Meta-Weight-Net Learning Method

2-1. Meta-learning objective

Notation

  • training set : \(\left\{x_{i}, y_{i}\right\}_{i=1}^{N}\)
  • label vector over \(c\) classes : \(y_{i} \in\{0,1\}^{c}\)
  • \(f(x, \mathbf{w})\) : classifier ( = DNN )

  • per-sample training loss : \(L_{i}^{\text{train}}(\mathbf{w})=\ell\left(y_{i}, f\left(x_{i}, \mathbf{w}\right)\right)\) ; the usual (unweighted) objective is \(\mathbf{w}^{*}=\underset{\mathbf{w}}{\arg \min } \frac{1}{N} \sum_{i=1}^{N} L_{i}^{\text{train}}(\mathbf{w})\)


WEIGHTED loss :

  • \(\mathbf{w}^{*}(\Theta)=\underset{\mathbf{w}}{\arg \min } \mathcal{L}^{\text {train }}(\mathbf{w} ; \Theta) \triangleq \frac{1}{N} \sum_{i=1}^{N} \mathcal{V}\left(L_{i}^{\text {train }}(\mathbf{w}) ; \Theta\right) L_{i}^{\text {train }}(\mathbf{w})\).
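For illustration, the weighted objective for one mini-batch could look like this sketch ( `weight_net` stands for \(\mathcal{V}(\cdot\, ; \Theta)\), described next; all names are placeholders ):

```python
import torch

def weighted_loss(per_sample_loss: torch.Tensor, weight_net) -> torch.Tensor:
    # per_sample_loss: (N,) tensor of L_i^train(w); weight_net: V(. ; Theta).
    # The loss is fed to MW-Net as a (N, 1) input; detach() treats the input
    # loss value as a constant, so gradients w.r.t. w flow only through L_i.
    weights = weight_net(per_sample_loss.detach().unsqueeze(1)).squeeze(1)
    return (weights * per_sample_loss).mean()  # (1/N) sum_i V(L_i; Theta) * L_i
```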


Meta-Weight-Net (MW-Net)

\(\mathcal{V}\left(L_{i}^{\text{train}}(\mathbf{w}) ; \Theta\right)\) :

  • MLP with one hidden layer ( 100 nodes )
  • ReLU in the hidden layer, Sigmoid at the output ( so each weight lies in \((0,1)\) )

[ Figure: MW-Net architecture ]
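A minimal PyTorch sketch of this architecture ( the class name `MetaWeightNet` is ours; the sizes follow the description above ):

```python
import torch.nn as nn

class MetaWeightNet(nn.Module):
    """One-hidden-layer MLP: per-sample loss (scalar) -> weight in (0, 1)."""
    def __init__(self, hidden: int = 100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden),  # input: a single loss value
            nn.ReLU(),             # hidden activation
            nn.Linear(hidden, 1),
            nn.Sigmoid(),          # output weight lies in (0, 1)
        )

    def forward(self, loss):       # loss: (batch, 1)
        return self.net(loss)      # weights: (batch, 1)
```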


Meta learning process

The parameters of MW-Net are learned using \(M\) meta-data samples \(\left\{x_{i}^{(meta)}, y_{i}^{(meta)}\right\}_{i=1}^{M}\) ( a small amount of clean, unbiased data, \(M \ll N\) )

  • \(\Theta^{*}=\underset{\Theta}{\arg \min }\ \mathcal{L}^{\text{meta}}\left(\mathbf{w}^{*}(\Theta)\right) \triangleq \frac{1}{M} \sum_{i=1}^{M} L_{i}^{\text{meta}}\left(\mathbf{w}^{*}(\Theta)\right)\).


2-2. MW-Net learning method

The optimal \(\Theta^{*}\) and \(\mathbf{w}^{*}\) are computed via two nested loops of optimization; in practice, each is updated by a single gradient step per iteration, alternating the three steps below ( an online approximation; a PyTorch-style sketch of one full iteration follows step (3) ).

[ Figure: flowchart of the MW-Net learning process ]


(1) Formulate the one-step classifier update as a function of \(\Theta\) ( Eq. 3 ; \(\alpha\) : step size, \(n\) : training mini-batch size )

  • \(\hat{\mathbf{w}}^{(t)}(\Theta)=\mathbf{w}^{(t)}-\alpha \frac{1}{n} \sum_{i=1}^{n} \mathcal{V}\left(L_{i}^{\text{train}}\left(\mathbf{w}^{(t)}\right) ; \Theta\right) \nabla_{\mathbf{w}} L_{i}^{\text{train}}(\mathbf{w})\big|_{\mathbf{w}^{(t)}}\).

(2) Update the parameters of MW-Net ( i.e., \(\Theta\) ) on the meta data ( Eq. 4 ; \(\beta\) : step size, \(m\) : meta mini-batch size )

  • \(\Theta^{(t+1)}=\Theta^{(t)}-\beta \frac{1}{m} \sum_{i=1}^{m} \nabla_{\Theta} L_{i}^{\text{meta}}\left(\hat{\mathbf{w}}^{(t)}(\Theta)\right)\big|_{\Theta^{(t)}}\).
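This meta-gradient can be unpacked by a chain rule through Eq. 3 ( a one-line expansion, consistent with the paper's derivation ):

  • \(\nabla_{\Theta} L_{i}^{\text{meta}}\left(\hat{\mathbf{w}}^{(t)}(\Theta)\right)=\left(\frac{\partial \hat{\mathbf{w}}^{(t)}(\Theta)}{\partial \Theta}\right)^{\top} \nabla_{\hat{\mathbf{w}}} L_{i}^{\text{meta}}(\hat{\mathbf{w}})\big|_{\hat{\mathbf{w}}^{(t)}}\), where \(\frac{\partial \hat{\mathbf{w}}^{(t)}(\Theta)}{\partial \Theta}=-\alpha \frac{1}{n} \sum_{j=1}^{n} \nabla_{\mathbf{w}} L_{j}^{\text{train}}(\mathbf{w})\big|_{\mathbf{w}^{(t)}} \nabla_{\Theta} \mathcal{V}\left(L_{j}^{\text{train}}(\mathbf{w}^{(t)}) ; \Theta\right)^{\top}\)

Intuitively, a training sample whose loss gradient aligns with the meta-loss gradient gets its weight pushed up, and vice versa.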

(3) Update the parameters of the classifier ( i.e., \(\mathbf{w}\) ) with the updated \(\Theta^{(t+1)}\) ( Eq. 5 )

  • \(\mathbf{w}^{(t+1)}=\mathbf{w}^{(t)}-\alpha \frac{1}{n} \sum_{i=1}^{n} \mathcal{V}\left(L_{i}^{\text{train}}\left(\mathbf{w}^{(t)}\right) ; \Theta^{(t+1)}\right) \nabla_{\mathbf{w}} L_{i}^{\text{train}}(\mathbf{w})\big|_{\mathbf{w}^{(t)}}\).
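A minimal PyTorch-style sketch of one iteration of Eqs. 3-5 ( assuming PyTorch \(\geq\) 2.0 for `torch.func.functional_call`; `classifier`, `weight_net`, `opt_w`, `opt_theta` are placeholder names, with `weight_net` as in the `MetaWeightNet` sketch above; this illustrates the online approximation, not the authors' released code ):

```python
import torch
import torch.nn.functional as F
from torch.func import functional_call

def mwnet_step(classifier, weight_net, opt_w, opt_theta,
               x, y, x_meta, y_meta, alpha):
    params = dict(classifier.named_parameters())

    # --- Eq. 3: virtual SGD step, kept differentiable w.r.t. Theta ---
    logits = functional_call(classifier, params, (x,))
    losses = F.cross_entropy(logits, y, reduction="none")          # L_i^train
    weights = weight_net(losses.detach().unsqueeze(1)).squeeze(1)  # V(L_i; Theta)
    weighted = (weights * losses).mean()
    grads = torch.autograd.grad(weighted, tuple(params.values()),
                                create_graph=True)   # keep the graph for Eq. 4
    w_hat = {k: p - alpha * g for (k, p), g in zip(params.items(), grads)}

    # --- Eq. 4: update Theta via the meta loss evaluated at w_hat(Theta) ---
    meta_logits = functional_call(classifier, w_hat, (x_meta,))
    meta_loss = F.cross_entropy(meta_logits, y_meta)
    opt_theta.zero_grad()
    meta_loss.backward()   # second-order grads flow back into weight_net
    opt_theta.step()

    # --- Eq. 5: actual classifier step, using the updated Theta^(t+1) ---
    losses = F.cross_entropy(classifier(x), y, reduction="none")
    with torch.no_grad():
        weights = weight_net(losses.unsqueeze(1)).squeeze(1)
    loss = (weights * losses).mean()
    opt_w.zero_grad()      # also clears stale grads left by Eq. 4's backward
    loss.backward()
    opt_w.step()
```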


2-3. Algorithm

[ Figure: the Meta-Weight-Net learning algorithm (pseudo-code) ]
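Since the figure is not reproduced here, a hedged reconstruction of the overall loop implied by Eqs. 3-5 ( wiring together the sketches above; `build_classifier`, `sample_minibatch`, `train_set`, `meta_set`, and all hyperparameter values are hypothetical ):

```python
import torch

alpha, beta = 0.1, 1e-3    # step sizes for w and Theta (illustrative values)
n, m, T = 100, 100, 8000   # batch sizes and max iterations (illustrative)

classifier = build_classifier()     # hypothetical factory, e.g. a ResNet
weight_net = MetaWeightNet()        # from the sketch in Sec. 2-1
opt_w = torch.optim.SGD(classifier.parameters(), lr=alpha, momentum=0.9)
opt_theta = torch.optim.Adam(weight_net.parameters(), lr=beta)

for t in range(T):
    x, y = sample_minibatch(train_set, n)            # biased training data
    x_meta, y_meta = sample_minibatch(meta_set, m)   # small clean meta data
    mwnet_step(classifier, weight_net, opt_w, opt_theta,
               x, y, x_meta, y_meta, alpha)
```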