[Paper Review] 14. Analyzing and Improving the Image Quality of StyleGAN
Contents
- Abstract
- Introduction
- Removing Normalization Artifacts
- Generator architecture revisited
- Instance Normalization Revisited
0. Abstract
StyleGAN : SOTA
- data-driven unconditional generative image modeling
This paper proposes changes in both model architecture & training methods
- redesign the generator normalization
- revisit progressive growing
- regularize generator
Result
- improved image quality
- \(G\) becomes easier to invert
( makes it possible to reliably attribute a generated image to a particular network )
1. Introduction
StyleGAN
Mapping network \(f\) :
- transforms \(\mathbf{z} \in \mathcal{Z}\) into the intermediate latent code \(\mathbf{w} \in \mathcal{W}\)
- then, learned affine transforms produce styles that control the layers of the synthesis network \(g\),
via AdaIN (Adaptive Instance Normalization) ( see the sketch below )
- noise inputs provide stochastic variation
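A minimal PyTorch sketch of this style path (mapping network, affine styles, AdaIN), not the official implementation; the class names, layer sizes, and 8-layer MLP depth are illustrative assumptions:

```python
# A minimal sketch of the StyleGAN style path, NOT the official implementation.
# Class names, layer sizes, and the 8-layer MLP depth are illustrative assumptions.
import torch
import torch.nn as nn

class MappingNetwork(nn.Module):
    """f : z in Z  ->  intermediate latent code w in W (an MLP)."""
    def __init__(self, z_dim=512, w_dim=512, n_layers=8):
        super().__init__()
        layers, dim = [], z_dim
        for _ in range(n_layers):
            layers += [nn.Linear(dim, w_dim), nn.LeakyReLU(0.2)]
            dim = w_dim
        self.net = nn.Sequential(*layers)

    def forward(self, z):
        return self.net(z)                      # w

class AdaIN(nn.Module):
    """Learned affine of w -> style (scale, bias), applied after per-feature-map normalization."""
    def __init__(self, w_dim, n_channels):
        super().__init__()
        self.affine = nn.Linear(w_dim, 2 * n_channels)

    def forward(self, x, w):                    # x : (N, C, H, W) feature maps of g
        scale, bias = self.affine(w).chunk(2, dim=1)
        mu = x.mean(dim=(2, 3), keepdim=True)
        sigma = x.std(dim=(2, 3), keepdim=True) + 1e-8
        x = (x - mu) / sigma                    # normalize each feature map separately
        return scale[:, :, None, None] * x + bias[:, :, None, None]
```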
This paper
- solely focuses on \(\mathcal{W}\),
as it is the relevant latent space from the synthesis network’s point of view
- (1) redesigns the normalization used in \(G\) ( which removes the artifacts )
- (2) analyzes artifacts related to progressive growing
- proposes an alternative design!
( = training starts by focusing on LOW-resolution images… then progressively shifts to HIGH-resolution details )
2. Removing Normalization Artifacts
observe that StyleGAN images exhibit characteristic blob-shaped artifacts ( resembling water droplets )
[ Problem of AdaIN operation ]
- AdaIN : normalizes the mean/std of each feature map separately
\(\rightarrow\) destroys any information found in the magnitudes of the features relative to each other
( evidence : when the normalization step is removed… the droplet artifacts disappear! )
( a tiny numeric example is given below )
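A tiny numeric illustration (an assumed example, not from the paper): two feature maps with very different magnitudes become indistinguishable in scale after per-feature-map normalization.

```python
# Tiny numeric illustration (assumed example, not from the paper): after per-feature-map
# normalization, two channels with very different magnitudes become indistinguishable in scale.
import torch

x = torch.randn(1, 2, 4, 4)
x[:, 1] *= 100.0                                  # channel 1 carries a ~100x stronger signal

mu = x.mean(dim=(2, 3), keepdim=True)
sigma = x.std(dim=(2, 3), keepdim=True)
x_norm = (x - mu) / sigma                         # the normalization step inside AdaIN

print(x.std(dim=(2, 3)))        # channel stds differ by roughly 100x
print(x_norm.std(dim=(2, 3)))   # both ~1.0 : the relative magnitude information is gone
```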
2-1. Generator architecture revisited
Revises several details of the StyleGAN generator
Figure 2(a)
- original StyleGAN
Figure 2(b)
- expand 2(a) in full detail ( by showing weights & biases )
- [1] break the AdaIN operation into 2 parts
- 1) Normalization
- 2) Modulation ( = linear transformation )
- [2] Style Block
- 1) [1]-2 Modulation
- 2) Convolution
- 3) Normalization
- original StyleGAN : applies bias & noise within the style block
Figure 2(c)
- proposed StyleGAN : applies bias & noise OUTSIDE the style block
- after this change, it is sufficient for normalization & modulation to operate on the “std” alone ( the “mean” is not needed ) ( see the sketch below )
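A sketch of this revised layer of Figure 2(c), with assumed class names and shapes: the style block performs modulation, convolution, and std-only normalization, while the bias & noise are added outside it.

```python
# Sketch of the revised layer of Figure 2(c); class names and shapes are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StyleBlock(nn.Module):
    """Mod -> Conv -> Norm, where modulation/normalization act on the std only."""
    def __init__(self, w_dim, in_ch, out_ch):
        super().__init__()
        self.affine = nn.Linear(w_dim, in_ch)                 # style -> per-input-channel scale s
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False)

    def forward(self, x, w):
        s = self.affine(w)[:, :, None, None]
        x = x * s                                             # Mod : scale the incoming feature maps
        x = self.conv(x)
        sigma = x.std(dim=(2, 3), keepdim=True) + 1e-8
        return x / sigma                                      # Norm : divide by std only (no mean)

class SynthesisLayer(nn.Module):
    """Bias & noise are applied OUTSIDE the style block."""
    def __init__(self, w_dim, in_ch, out_ch):
        super().__init__()
        self.style_block = StyleBlock(w_dim, in_ch, out_ch)
        self.bias = nn.Parameter(torch.zeros(out_ch))
        self.noise_strength = nn.Parameter(torch.zeros(()))

    def forward(self, x, w):
        x = self.style_block(x, w)
        noise = torch.randn(x.shape[0], 1, x.shape[2], x.shape[3], device=x.device)
        x = x + self.bias[None, :, None, None] + self.noise_strength * noise
        return F.leaky_relu(x, 0.2)
```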
2-2. Instance Normalization Revisited
Figure 2(d)
2 key points
- 1) remove the explicit normalization of the feature maps
- 2) instead, scale the convolution weights ( modulation ) + normalize them by their expected std ( demodulation )
( both steps are sketched in code at the end of this section )
[ scale the convolution weights ]
\(w_{i j k}^{\prime}=s_{i} \cdot w_{i j k}\).
- \(w\) and \(w^{\prime}\) : original and modulated weights
- \(s_{i}\) : scale, corresponding to the \(i\)th input feature map
- \(j\) and \(k\) : enumerate the output feature maps and spatial footprint of the convolution
[ normalize by std ]
\(w_{ijk}^{\prime \prime}=w_{ijk}^{\prime} / \sqrt{\sum_{i, k} {w_{ijk}^{\prime}}^{2}+\epsilon}\)
- std : \(\sigma_{j}=\sqrt{\sum_{i, k} {w_{ijk}^{\prime}}^{2}}\)
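A sketch of both steps as a single “modulated convolution” (Figure 2(d)); the grouped-convolution trick, function name, and shapes are assumptions for illustration, not necessarily the official code.

```python
# Sketch of weight modulation + demodulation (Figure 2(d)); the grouped-convolution
# trick and the function name are assumptions, not necessarily the official code.
import torch
import torch.nn.functional as F

def modulated_conv2d(x, weight, s, eps=1e-8, demodulate=True):
    """x: (N, C_in, H, W), weight: (C_out, C_in, kh, kw), s: (N, C_in) styles."""
    N, C_in, H, W = x.shape
    C_out, _, kh, kw = weight.shape

    # Mod   : w'_{ijk} = s_i * w_{ijk}  (scale by the style of the i-th input feature map)
    w = weight[None] * s[:, None, :, None, None]          # (N, C_out, C_in, kh, kw)

    if demodulate:
        # Demod : w''_{ijk} = w'_{ijk} / sqrt( sum_{i,k} w'_{ijk}^2 + eps )
        sigma = torch.sqrt(w.pow(2).sum(dim=(2, 3, 4), keepdim=True) + eps)
        w = w / sigma

    # Each sample gets its own weights -> implement as one grouped convolution.
    x = x.reshape(1, N * C_in, H, W)
    w = w.reshape(N * C_out, C_in, kh, kw)
    out = F.conv2d(x, w, padding=kh // 2, groups=N)
    return out.reshape(N, C_out, H, W)
```

Since \(\sigma_{j}\) is the expected std of output feature map \(j\) ( assuming unit-variance inputs ), dividing the weights by it keeps the outputs at unit std in expectation, which is what allows the explicit, data-dependent normalization to be dropped.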