[Paper Review] 14. Analyzing and Improving the Image Quality of StyleGAN

Contents

  1. Abstract
  2. Introduction
  3. Removing Normalization Artifacts
    1. Generator architecture revised
    2. Instance Normalization Revisited


0. Abstract

StyleGAN : SOTA

  • data-driven unconditional generative image modeling


This paper proposes changes in both model architecture & training methods

  • redesign the generator normalization
  • revisit progressive growing
  • regularize generator


Result

  • improved image quality

  • \(G\) becomes easier to invert

    ( makes it possible to reliably attribute a generated image to a particular network )


1. Introduction

StyleGAN

Mapping network \(f\) :

  • transforms \(\mathbf{z} \in \mathcal{Z}\) into intermediate latent code \(\mathbf{w} \in \mathcal{W}\)

  • then, affine transform produces styles that control the layers of synthesis network \(g\),

    via AdaIN(Adaptive Instance Normalization)

  • stochastic variation


This paper

  • solely focus on \(\mathcal{W}\),

    as it is the relevant latent space from synthesis network’s point of view

  • (1) redesign the normalization used in \(G\) ( which removes the artifacts )

  • (2) analyze artifacts, related to progressive growing

    • propose an alternative design!

      ( = training starts by focusing on LOW-resolution…then progressively shifts to HIGH-~ )


2. Removing Normalization Artifacts

observe that StyleGAN exhibit characteristic blob-shaped artifacts


[ Problem of AdaIN operation ]

  • AdaIN : normalize mean/std of feature map separately,

    \(\rightarrow\) destroy any info, found in the magnitudes of the features relative to each other

    ( proof(?) : when normalization is removed… droplet disappears! )


2-1. Generator architecture revised

Revise several details of StyleGAN

figure2

Figure 2(a)

  • original Style GAN

Figure 2(b)

  • expand 2(a) in full detail ( by showing weights & biases )
  • [1] breaking AdaIN operation in 2 parts
    • 1) Normalization
    • 2) Modulation ( = linear transformation )
  • [2] Style Block
    • 1) [1]-2 Modulation
    • 2) Convolution
    • 3) Normalization
  • original StyleGAN : applies bias & noise within the style block

Figure 2(c)

  • proposed StyleGAN : applies bias & noise OUTSIDE the style block
  • when normalizing both … only “std” is needed ( not “mean” )


2-2. Instance Normalization Revisited

Figure 2(d)

2 key points

  • 1) remove the normalization
  • 2) scale the convolution weights + normalize ( by std )


[ scale the convolution weights ]

\(w_{i j k}^{\prime}=s_{i} \cdot w_{i j k}\).

  • \(w\) and \(w^{\prime}\) : original and modulated weights
  • \(s_{i}\) : scale, corresponding to the \(i\)th input feature map
  • \(j\) and \(k\) : enumerate the output feature maps and spatial footprint of the convolution


[ normalize by std ]

\(w_{i j k}^{\prime \prime}=w_{i j k}^{\prime} / \sqrt{\sum_{i, k} w_{i j k}^{\prime}{ }^{2}+\epsilon}\).

  • std : \(\sigma_{j}=\sqrt{\sum_{i, k} w_{i j k}^{\prime}{ }^{2}}\)

Tags:

Categories:

Updated: