[Paper Review] 21. Editing in Style

0. Abstract
1. Related Works
(1) GAN-based Image Editing
2. Local Semantics in Generative Models
(1) Feature Factorization
(2) Local Editing

0. Abstract

the ability to control & condition the output of GANs is still limited

\(\rightarrow\) introduce a simple & effective method for making local, semantically-aware edits to a target output image

goal is NOT to propose new GAN,

BUT to offer local editing method for its output

( by changing the style of specific objects )

1. Related Works

(1) GAN-based Image Editing

semantic image editing

1) latent code-based : for GLOBAL attribute editing
2) activation-based : for LOCAL editing

Latent code-based

learn a manifold in latent space
perform semantic edits, by traversing paths along this manifold
example )
- use AE to disentangle image into semantic subspaces & reconstruct the image
- global changes in color/light/pose/…

Activation-based

directly manipulate specific SPATIAL positions on the activation tensor, at a certain CNN layer
example )
- GAN Dissection controls the presence/absence of objects at given position
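To make the activation-based idea concrete, here is a minimal NumPy sketch of editing specific spatial positions of an activation tensor. The function name `ablate_units`, the choice of zeroing the values, and the rectangular region are all assumptions for illustration, not the procedure of GAN Dissection itself.

```python
import numpy as np

def ablate_units(activation, channels, region):
    """Zero the chosen channels inside a spatial box of a (C, H, W) tensor.

    `region` is a hypothetical (h0, h1, w0, w1) box; zeroing is an assumed
    edit, picked only to show a spatially local manipulation.
    """
    edited = activation.copy()          # leave the input untouched
    h0, h1, w0, w1 = region
    edited[channels, h0:h1, w0:w1] = 0.0
    return edited

A = np.ones((4, 8, 8))
A_edit = ablate_units(A, channels=[1, 3], region=(2, 5, 2, 5))
```

Only the listed channels change, and only inside the box, which is what makes this kind of edit local.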

This paper takes a latent code-based approach to local editing, which

neither relies on external supervision
nor involves complex spatial blending operations

2. Local Semantics in Generative Models

(1) Feature Factorization

DFF (Deep Feature Factorization)

explains a CNN’s learned representation, via saliency maps
with this, it has been shown that…

CNNs learn features that act as (1) semantic object & (2) object-part detectors

Inspired by DFF, conduct a similar analysis

apply spherical k-means to \(C\)-dim activation vectors

( activation tensor : \(\mathbf{A} \in \mathbb{R}^{N \times C \times H \times W}\) )
clustering generates a tensor of cluster membership

( membership : \(\mathbf{U} \in\{0,1\}^{N \times K \times H \times W}\) )
- \(K\) : user-defined
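The clustering step above can be sketched as follows. This is a minimal spherical k-means in NumPy, assuming random initialization and a fixed iteration count (both are my choices, not details from the paper); `spherical_kmeans_membership` is a hypothetical name.

```python
import numpy as np

def spherical_kmeans_membership(A, K, n_iter=20, seed=0):
    """Cluster the C-dim activation vectors of A (N, C, H, W) by cosine
    similarity and return a one-hot membership tensor U (N, K, H, W)."""
    N, C, H, W = A.shape
    # One C-dim vector per spatial position: shape (N*H*W, C)
    X = A.transpose(0, 2, 3, 1).reshape(-1, C)
    X = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-8)  # unit length
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=K, replace=False)]   # assumed init
    for _ in range(n_iter):
        labels = np.argmax(X @ centroids.T, axis=1)            # nearest by cosine
        for k in range(K):
            members = X[labels == k]
            if len(members) > 0:
                c = members.sum(axis=0)
                centroids[k] = c / (np.linalg.norm(c) + 1e-8)  # re-project to sphere
    labels = np.argmax(X @ centroids.T, axis=1)
    U = np.eye(K, dtype=np.uint8)[labels]                      # (N*H*W, K) one-hot
    return U.reshape(N, H, W, K).transpose(0, 3, 1, 2)

A = np.random.default_rng(1).normal(size=(2, 16, 4, 4))
U = spherical_kmeans_membership(A, K=3)   # U.shape == (2, 3, 4, 4)
```

Each spatial position belongs to exactly one cluster, so `U` sums to 1 along the `K` axis.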

Result

at certain layers of generator,

clusters correspond well to semantic objects & parts
each pixel in the heatmap is color-coded to indicate cluster index

\(M_{k, c}\) : Contribution of channel \(c\) to semantic cluster \(k\)

via cluster memberships, \(\mathbf{U} \in\{0,1\}^{N \times K \times H \times W}\)
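\(\boldsymbol{M}_{k, c}=\frac{1}{N H W} \sum_{n, h, w} \mathbf{A}_{n, c, h, w}^{2} \odot \mathbf{U}_{n, k, h, w}\).
- assuming the feature maps of \(\mathbf{A}_{l}\) have zero mean & unit variance
  
  \(\rightarrow\) each contribution lies in \([0,1]\) ( the \(K\) clusters partition every position, so \(\sum_{k} \boldsymbol{M}_{k, c}\) equals the mean of \(\mathbf{A}_{c}^{2}\), which is 1 )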
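The contribution matrix above is a masked mean of squared activations, which can be written as one `einsum`. A minimal sketch, assuming `A` and `U` are NumPy arrays with the shapes given in the notes; `channel_contributions` is a hypothetical name.

```python
import numpy as np

def channel_contributions(A, U):
    """M[k, c] = 1/(N*H*W) * sum_{n,h,w} A[n,c,h,w]^2 * U[n,k,h,w]."""
    N, C, H, W = A.shape
    return np.einsum("nchw,nkhw->kc", A ** 2, U.astype(A.dtype)) / (N * H * W)

# Toy check: every position belongs to cluster 0, all activations are 1
A = np.ones((1, 2, 2, 2))
U = np.zeros((1, 2, 2, 2))
U[:, 0] = 1
M = channel_contributions(A, U)   # row 0 is all ones, row 1 is all zeros
```

Summing `M` over `k` recovers the mean squared activation per channel, which is why the values stay in [0, 1] for normalized feature maps.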

(2) Local Editing

a) Style GAN review

latent vector \(z\) ~ prior
\(z\) is transformed to intermediate latent vector \(\boldsymbol{w} \in \mathbb{W}\)

\(\rightarrow\) shows better disentanglement properties
\(\mathbf{A} \in \mathbb{R}^{(C \times H \times W)}\) : input to a convolutional layer
\(w\) : alters feature maps, via a per-layer style
motivated by style transfer
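A rough sketch of how a per-layer style can alter feature maps, in the spirit of AdaIN from StyleGAN: each channel is normalized, then rescaled by a per-channel style \(\sigma\). This assumes \(\sigma\) has already been computed from \(w\) (the learned mapping from \(w\) to \(\sigma\) is omitted), and the bias term is left out for brevity; `apply_style` is a hypothetical name.

```python
import numpy as np

def apply_style(A, sigma, eps=1e-8):
    """Normalize each channel of A (C, H, W), then scale it by sigma (C,)."""
    mean = A.mean(axis=(1, 2), keepdims=True)
    std = A.std(axis=(1, 2), keepdims=True)
    return sigma[:, None, None] * (A - mean) / (std + eps)

A = np.random.default_rng(0).normal(size=(3, 5, 5))
sigma = np.array([1.0, 2.0, 0.5])     # assumed to come from w
styled = apply_style(A, sigma)
```

After this step the per-channel scale of the feature maps is set entirely by \(\sigma\), which is why editing \(\sigma\) is enough to change appearance.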

b) Conditioned Interpolation

Notation

target image : \(S\)
reference image : \(R\)

would like to transfer the appearance of a specified local object/part from \(R\) to \(S\)

[ Global transfer ]

\(\sigma^{G}=\sigma^{S}+\lambda\left(\sigma^{R}-\sigma^{S}\right)\).

where \(0 \leq \lambda \leq 1\)

[ Selective local editing ]

control style interpolation with a matrix transformation
\(\sigma^{G}=\sigma^{S}+Q\left(\sigma^{R}-\sigma^{S}\right)\).
- \(Q\) : diagonal matrix whose diagonal is \(q\) ( where \(q \in[0,1]^{C}\) )
  
  ( \(q\) : query vector, one interpolation weight per channel )
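Both interpolation rules above reduce to one elementwise expression, since multiplying by a diagonal \(Q\) is the same as scaling each channel by its \(q_c\). A minimal sketch with made-up style vectors; `interpolate_styles` is a hypothetical name.

```python
import numpy as np

def interpolate_styles(sigma_s, sigma_r, q):
    """sigma_G = sigma_S + Q (sigma_R - sigma_S), with Q = diag(q).

    A scalar q gives the global transfer (q = lambda); a length-C vector
    gives the selective, per-channel transfer.
    """
    q = np.broadcast_to(np.asarray(q, dtype=float), sigma_s.shape)
    return sigma_s + q * (sigma_r - sigma_s)

sigma_s = np.array([1.0, 2.0, 3.0])   # toy target style
sigma_r = np.array([3.0, 2.0, 1.0])   # toy reference style
global_mix = interpolate_styles(sigma_s, sigma_r, 0.5)
local_mix = interpolate_styles(sigma_s, sigma_r, [1.0, 0.0, 0.0])
```

With `q = [1, 0, 0]` only the first channel takes the reference style; the other channels keep the target style.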

c) Choosing the query

best query \(q\) = one that favors channels that..

affect the ROI (region of interest)
while ignoring channels that have an effect outside the ROI

[ Simple Approach ]

use \(M_{k^{\prime}, c}\)
clipping \(\boldsymbol{q}_{c}=\min \left(1, \lambda \boldsymbol{M}_{k^{\prime}, c}\right)\)

where \(\boldsymbol{q}_{c}\) is the \(c\)-th channel element of \(\boldsymbol{q}\),
updates all channels at same time
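The clipping rule above is a one-liner. A small sketch with toy contribution values, where `M_k` stands for the row \(M_{k^{\prime}}\) of the cluster being edited; `clipped_query` is a hypothetical name.

```python
import numpy as np

def clipped_query(M_k, lam):
    """q_c = min(1, lam * M_k[c]) for every channel at once."""
    return np.minimum(1.0, lam * M_k)

M_k = np.array([0.1, 0.5, 0.9])   # toy contributions for cluster k'
q = clipped_query(M_k, lam=2.0)
```

Raising `lam` pushes every channel's weight up together, including channels that contribute little to the cluster.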

[ Proposed Approach ]

sequential approach
first set the most relevant channel to the maximum slope of 1,

before raising the slope of the second-most relevant, …
solve this by sorting channels based on \(M_{k^{\prime}}\)

& greedily assigning \(q_c=1\) to most relevant channels

\[\begin{gathered} \underset{\boldsymbol{q}_{c}}{\arg \min } \boldsymbol{q}_{c}\left[\boldsymbol{M}_{k^{\prime}, c}-\rho\left(1-\boldsymbol{M}_{k^{\prime}, c}\right)\right] \\ \quad \quad \quad \text { s.t. } \sum_{c=1}^{C} \boldsymbol{q}_{c}\left(1-\boldsymbol{M}_{k^{\prime}, c}\right) \leq \epsilon , \quad 0 \leq \boldsymbol{q}_{c} \leq 1 \end{gathered}\]
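The sort-and-assign description above can be sketched as a greedy loop. This is only one reading of the notes, not the paper's solver: it assigns \(q_c=1\) in order of decreasing \(M_{k^{\prime}, c}\) and stops once the budget \(\epsilon\) on \(\sum_{c} q_{c}(1-M_{k^{\prime}, c})\) would be exceeded. It ignores \(\rho\) and never assigns fractional values, both of which are simplifying assumptions; `greedy_query` is a hypothetical name.

```python
import numpy as np

def greedy_query(M_k, eps):
    """Assign q_c = 1 to the most relevant channels while the total
    'outside-ROI' cost sum_c q_c * (1 - M_k[c]) stays within eps."""
    q = np.zeros_like(M_k, dtype=float)
    budget = eps
    for c in np.argsort(-M_k):        # most relevant channel first
        cost = 1.0 - M_k[c]
        if cost > budget:             # later channels cost at least as much
            break
        q[c] = 1.0
        budget -= cost
    return q

M_k = np.array([0.9, 0.2, 0.7, 0.5])  # toy contributions for cluster k'
q = greedy_query(M_k, eps=0.5)        # selects the channels with M = 0.9 and 0.7
```

Unlike the clipping rule, channels are switched on one at a time, so weakly related channels stay at zero until the budget allows them.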

Seunghan Lee
