Self-Labeling via Simultaneous Clustering and Representation Learning


Contents

  1. Abstract
  2. Introduction
  3. Method
    1. Self-labeling


0. Abstract

combining (1) clustering + (2) representation learning

doing it naively…leads to degenerate solutions


solution : propose a method that maximizes the information between labels & input data indices


1. Introduction

self-supervision : mostly done by designing new pretext tasks

But the plain task of classification is sufficient for pre-training

( of course…. provided that labels are given )

focus on obtaining the labels automatically ( with self-labeling algorithm )


Degeneration problem ?

solve by adding the constraint that the labels must induce an equipartition of the data ( = maximizes the information between data indices & labels )
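
A short derivation of why an equipartition maximizes this information. It is a sketch under two assumptions not spelled out in these notes: the data index $i$ is drawn uniformly from $\{1, \dots, N\}$, and each index gets exactly one label, so $H(y \mid i) = 0$. $N_y$ (the number of points assigned to label $y$) is notation introduced here.

$$
I(y; i) = H(y) - H(y \mid i) = H(y) = -\sum_{y=1}^{K} \frac{N_y}{N} \log \frac{N_y}{N} \le \log K
$$

Equality holds exactly when $N_y = N/K$ for every $y$, i.e. for an equipartition. The degenerate case (all points in one label) gives $H(y) = 0$, so the labels carry no information about the index.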


2. Method

(1) self-labeling method

(2) interpret the method as optimizing labels & targets of CE loss


(1) Self-labeling

Notation :

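  • $x = \Phi(I)$ : DNN
    • maps images ($I$) to feature vectors ($x \in \mathbb{R}^D$)
  • $I_1, \dots, I_N$ : image data
  • $y_1, \dots, y_N \in \{1, \dots, K\}$ : image labels
  • $h : \mathbb{R}^D \to \mathbb{R}^K$ : classification head
  • $p(y = \cdot \mid x_i) = \operatorname{softmax}(h \circ \Phi(x_i))$ : class probabilities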


Train model & head parameters, with average CE loss

  • $E(p \mid y_1, \dots, y_N) = -\frac{1}{N} \sum_{i=1}^{N} \log p(y_i \mid x_i)$

requires labelled dataset

( if not, requires a self-labeling mechanism )
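
The average CE loss above can be sketched in a few lines of NumPy. This is only an illustration: the logits are made-up numbers standing in for the output of the head on top of the network, and labels are indexed from 0 to K-1 rather than 1 to K.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max for numerical stability.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def average_ce(logits, labels):
    """E(p | y_1..y_N) = -(1/N) * sum_i log p(y_i | x_i)."""
    p = softmax(logits)                      # shape (N, K)
    n = logits.shape[0]
    return -np.mean(np.log(p[np.arange(n), labels]))

# Toy stand-in for the head outputs: N=4 samples, K=3 classes (assumed values).
logits = np.array([[2.0, 0.0, 0.0],
                   [0.0, 2.0, 0.0],
                   [0.0, 0.0, 2.0],
                   [2.0, 0.0, 0.0]])
labels = np.array([0, 1, 2, 0])              # given labels, 0-indexed
loss = average_ce(logits, labels)

# With all-equal logits the prediction is uniform and the loss is log K.
uniform_loss = average_ce(np.zeros((4, 3)), labels)
```

The function needs `labels` as input, which is exactly why a labelled dataset (or a self-labeling mechanism) is required.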


[ Self-labeling mechanism ]

  • achieved by jointly optimizing the CE loss above w.r.t.

    • (1) model $h \circ \Phi$
    • (2) labels $y_1, \dots, y_N$
  • but if fully unsupervised … leads to a degenerate solution

    ( = trivially minimized by assigning all data points to a single (arbitrary) label )
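
A minimal sketch of the degenerate solution, assuming a linear head and random features as stand-ins for the real model (`W`, `b`, and `features` are hypothetical names for illustration). When labels are free variables, setting every label to the same class and letting the head ignore its input drives the loss to zero without learning anything.

```python
import numpy as np

def average_ce(logits, labels):
    # Log-softmax computed stably, then averaged negative log-likelihood.
    z = logits - logits.max(axis=1, keepdims=True)
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_p[np.arange(len(labels)), labels].mean()

rng = np.random.default_rng(0)
N, D, K = 100, 8, 5
features = rng.normal(size=(N, D))   # stand-in for the feature vectors

# Degenerate choice: every point gets label 0 ...
labels = np.zeros(N, dtype=int)
# ... and the head ignores the features, putting all its mass on class 0.
W = np.zeros((D, K))
b = np.zeros(K)
b[0] = 50.0
logits = features @ W + b

degenerate_loss = average_ce(logits, labels)   # numerically ~0
```

The loss is (numerically) zero for any input, which is why an extra constraint on the labels is needed.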


Solution?

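  • first, encode the labels as posterior distribution $q(y \mid x_i)$

    • (Before) $E(p \mid y_1, \dots, y_N) = -\frac{1}{N} \sum_{i=1}^{N} \log p(y_i \mid x_i)$
    • (After) $E(p, q) = -\frac{1}{N} \sum_{i=1}^{N} \sum_{y=1}^{K} q(y \mid x_i) \log p(y \mid x_i)$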

    ( optimizing q = reassigning labels )

  • to avoid degeneracy…

    add the constraint that the label assignments must partition the data in equally-sized subsets

  • objective function :

    • $\min_{p,q} E(p, q)$ subject to $\forall y : q(y \mid x_i) \in \{0, 1\}$ and $\sum_{i=1}^{N} q(y \mid x_i) = \frac{N}{K}$
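
The objective and its constraint can be checked on a toy example. This is a hedged sketch, not the paper's optimization procedure: the predictions `p` are made-up numbers, the helper names are hypothetical, and the "one label per sample" check is an added assumption that follows from $q$ being a distribution over labels.

```python
import numpy as np

def ce_pq(p, q):
    """E(p, q) = -(1/N) * sum_i sum_y q(y|x_i) * log p(y|x_i)."""
    return -(q * np.log(p)).sum(axis=1).mean()

def is_equipartition(q):
    """Check q(y|x_i) in {0,1} and that every label receives N/K samples."""
    n, k = q.shape
    binary = np.isin(q, (0, 1)).all()
    one_label_each = (q.sum(axis=1) == 1).all()   # assumed: q is a distribution
    balanced = (q.sum(axis=0) == n / k).all()
    return bool(binary and one_label_each and balanced)

N, K = 6, 3
labels = np.arange(N) % K                 # 0,1,2,0,1,2 (0-indexed labels)

# Made-up predictions: 0.8 on the assigned class, 0.1 elsewhere.
p = np.full((N, K), 0.1)
p[np.arange(N), labels] = 0.8

q_balanced = np.eye(K)[labels]                    # one-hot, N/K per label
q_collapsed = np.eye(K)[np.zeros(N, dtype=int)]   # everything in label 0

loss_pq = ce_pq(p, q_balanced)
loss_hard = -np.log(p[np.arange(N), labels]).mean()   # the "Before" form
```

With one-hot `q`, `loss_pq` and `loss_hard` coincide, so the "After" form reduces to the "Before" form. `is_equipartition(q_balanced)` is true while `is_equipartition(q_collapsed)` is false, so the constraint rules out the single-label collapse.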
