[DL Reading Group] Self-Attention Generative Adversarial Networks

June 04, 18

#Deep Learning #Self-Attention #Generative Adversarial Networks #SAGANs #Spectral Normalization

Slide overview

2018/06/01
Deep Learning JP:
http://deeplearning.jp/seminar-2/

Deep Learning JP

Text of each page

DEEP LEARNING JP [DL Papers]
Self-Attention Generative Adversarial Networks
Shizuma Kubo, Matsuo Lab
http://deeplearning.jp/

Bibliographic information
• Authors: Han Zhang (Rutgers Univ), Ian Goodfellow (Google Brain), Dimitris Metaxas (Rutgers Univ), Augustus Odena (Google Brain)
• Submitted to arXiv on 21 May 2018 (https://arxiv.org/abs/1805.08318)

Goodfellow tweeted about this paper.
It achieves SOTA in class-conditional image generation with GANs.

Key points
• Shows that applying spectral normalization to the GAN generator stabilizes training.
• Shows that using the two-timescale update rule (TTUR) speeds up GAN training.
• Proposes Self-Attention Generative Adversarial Networks (SAGANs), which incorporate a self-attention mechanism into GANs.
• Achieves SOTA on the ImageNet dataset on both metrics, IS and FID.

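Difficulty of conditional GANs
• CNN-based GANs have been successful, but training on many classes, as in ImageNet image generation (e.g. conditional GANs), remains difficult.
 – Categories that are recognized by texture (sea, sky, landscapes, etc.) are generated cleanly.
 – Categories that are recognized by geometry are hard to generate.
(The numbers in the figure are FID; lower FID is better.)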

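Difficulty of conditional GANs
• Convolution has a local receptive field, so global (long-range) structure is taken into account only by stacking many layers.
 – A small model cannot represent it, while a deep model has parameters that are hard to learn, which makes training unstable.
 – Enlarging the kernel widens the receptive field, but computational efficiency drops.
• Self-Attention offers a better balance between capturing global structure and computational efficiency.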
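The trade-off above can be made concrete with a little arithmetic. The sketch below assumes stride-1, undilated convolutions with no pooling or up/downsampling, and a hypothetical 128-pixel feature map; real GAN architectures change resolution between layers, so the numbers are only illustrative.

```python
import math

def receptive_field(kernel_size: int, num_layers: int) -> int:
    """One-sided receptive field (in pixels) of stacked stride-1, undilated convolutions."""
    return 1 + num_layers * (kernel_size - 1)

def layers_to_cover(image_size: int, kernel_size: int) -> int:
    """Smallest number of stacked layers whose receptive field spans image_size pixels."""
    return math.ceil((image_size - 1) / (kernel_size - 1))

# Hypothetical 128-pixel-wide map: depth needed vs. weights per kernel.
for k in (3, 5, 7):
    n = layers_to_cover(128, k)
    print(f"kernel {k}x{k}: {n} layers, {k * k} weights per kernel")
```

Small kernels need a very deep stack before one output position can see the whole map, while large kernels reduce the depth at the cost of more weights per layer.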

Comparison models (conditional GAN)
• The existing methods used for comparison are ACGANs (Odena et al, 2017) and SNGAN-projection (Miyato et al, 2018).
[Figure: ACGANs and SNGAN-projection, from the figure in cGANs with Projection Discriminator (Miyato et al, 2018)]
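The difference between the two baselines lies in how the class label reaches the discriminator. The following is a minimal NumPy sketch based on my reading of the two cited papers, not code from the slides; the feature vector `phi_x`, the weight names, and all sizes are hypothetical stand-ins.

```python
import numpy as np

def acgan_heads(phi_x, w_adv, W_cls):
    """ACGAN-style: a real/fake logit plus auxiliary class logits."""
    return w_adv @ phi_x, W_cls @ phi_x

def projection_logit(phi_x, y, V, psi):
    """Projection-style: inner product of the class embedding V[y] with the
    features, plus an unconditional linear term psi . phi(x)."""
    return V[y] @ phi_x + psi @ phi_x

rng = np.random.default_rng(0)
num_classes, feat_dim = 10, 16                 # hypothetical sizes
phi_x = rng.normal(size=feat_dim)              # stand-in for discriminator features
w_adv = rng.normal(size=feat_dim)
W_cls = rng.normal(size=(num_classes, feat_dim))
V = rng.normal(size=(num_classes, feat_dim))   # class embedding matrix
psi = rng.normal(size=feat_dim)

adv_logit, class_logits = acgan_heads(phi_x, w_adv, W_cls)
cond_logit = projection_logit(phi_x, 3, V, psi)
print(class_logits.shape, float(cond_logit))
```

In the ACGAN-style head the label is only used as a classification target, whereas in the projection-style head it directly changes the real/fake score.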

SNGAN-projection
[Figure: Resblock of the discriminator (spectral normalization on the Conv layers) and Resblock of the generator]

Source-Target-Attention and Self-Attention
Source: http://deeplearning.hatenablog.com/entry/transformer

10.

Self-Attention

11.

Self-Attention

12.

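Self-Attention
Gamma is initialized to 0. At first the network relies mostly on the easier local cues, and gamma is gradually increased so that it learns to take global structure into account.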
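The role of gamma is easiest to see in code. The sketch below is a NumPy version of a self-attention block as I understand it from the paper (1x1 convolutions written as matrices over a flattened feature map, output `gamma * o + x`); the weight names, the channel reduction to `C // 8`, and the sizes are assumptions for illustration, not taken from the slides.

```python
import numpy as np

def softmax(s, axis):
    s = s - s.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(s)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, W_f, W_g, W_h, gamma):
    """x: (C, N) feature map flattened over N = H*W positions."""
    f = W_f @ x                   # (C', N)
    g = W_g @ x                   # (C', N)
    h = W_h @ x                   # (C, N)
    s = f.T @ g                   # (N, N), s[i, j] = f(x_i) . g(x_j)
    beta = softmax(s, axis=0)     # column j: attention of position j over all positions i
    o = h @ beta                  # (C, N), o[:, j] = sum_i beta[i, j] * h[:, i]
    return gamma * o + x, beta

rng = np.random.default_rng(0)
C, H, W = 16, 4, 4                # hypothetical feature-map size
x = rng.normal(size=(C, H * W))
W_f = rng.normal(size=(C // 8, C))
W_g = rng.normal(size=(C // 8, C))
W_h = rng.normal(size=(C, C))

y0, beta = self_attention(x, W_f, W_g, W_h, gamma=0.0)
y1, _ = self_attention(x, W_f, W_g, W_h, gamma=1.0)
print(np.allclose(y0, x))         # with gamma = 0 the block passes x through unchanged
```

Because gamma is a learnable scalar that starts at 0, the network initially behaves like the plain convolutional model and only mixes in non-local information as training raises gamma.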

13.

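Techniques to stabilize (Spectral normalization/TTUR)
• Spectral normalization, which SNGAN-projection (Miyato et al, 2018) applied to the discriminator, is also applied to the generator.
 – In SNGAN-projection (Miyato et al, 2018), it controls the Lipschitz constant of the discriminator and improves stability.
 – The paper shows that the generator also benefits from spectral normalization.
 – It becomes possible to reduce the number of discriminator updates per generator update, which reduces computation.
 – The paper also shows that training becomes more stable.
• The two-timescale update rule (TTUR) (Heusel et al, 2017) is applied.
 – With the generator's learning rate set lower than the discriminator's, training can be shown to converge to a local Nash equilibrium.
 – Learning rates: generator 0.0001 / discriminator 0.0004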
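Spectral normalization divides a weight matrix by its largest singular value, which is usually estimated by power iteration. Below is a minimal NumPy sketch under that assumption; the function name, the iteration count, and the example matrix are hypothetical, and a convolution kernel would first have to be reshaped into a 2-D matrix.

```python
import numpy as np

def spectral_normalize(W, num_iters=1, u=None, eps=1e-12):
    """Return W / sigma, where sigma estimates the largest singular value of W
    by power iteration, together with the updated left vector u."""
    if u is None:
        u = np.random.default_rng(0).normal(size=W.shape[0])
    for _ in range(num_iters):
        v = W.T @ u
        v = v / (np.linalg.norm(v) + eps)
        u = W @ v
        u = u / (np.linalg.norm(u) + eps)
    sigma = u @ W @ v
    return W / sigma, u

# Hypothetical weight matrix with known singular values 3, 1, 0.5.
W = np.diag([3.0, 1.0, 0.5])
W_sn, u = spectral_normalize(W, num_iters=50)
print(np.linalg.svd(W_sn, compute_uv=False)[0])   # largest singular value after normalization
```

In training code the vector `u` is typically kept between steps, so that a single iteration per update is enough to track the slowly changing weights.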

14.

Evaluation of the techniques to stabilize
(a) Spectral normalization on the discriminator only
(b) Spectral normalization on the discriminator and the generator
(c) Spectral normalization and TTUR
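Mechanically, TTUR in setting (c) means only that the two players are updated with separate learning rates. The toy below illustrates this on a hypothetical two-parameter game; the objective, the learning rates, and the step count are made up for illustration, and the example does not reproduce the convergence result of Heusel et al.

```python
def ttur_toy(lr_g=0.1, lr_d=0.4, steps=200):
    """Simultaneous gradient steps on the toy game f(g, d) = g*d - 0.5*d**2.

    g (the "generator") minimises f and d (the "discriminator") maximises it;
    the only equilibrium is g = d = 0.
    """
    g, d = 1.0, 1.0
    for _ in range(steps):
        grad_g = d            # df/dg
        grad_d = g - d        # df/dd
        # Two timescales: the generator takes the smaller step.
        g, d = g - lr_g * grad_g, d + lr_d * grad_d
    return g, d

g, d = ttur_toy()
print(abs(g) < 1e-6, abs(d) < 1e-6)
```

In a deep learning framework the same idea amounts to creating two optimizers, one per network, each with its own learning rate.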

15.

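Self-attention evaluation
• Applying attention to high-level feature maps is more effective than applying it to low-level feature maps.
• Comparison with residual blocks that have the same number of parameters also shows that attention is doing useful work.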

16.

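Effect of attention
• Visualization of the attention weights: the attention maps of the last generator layer that uses attention.
• Each colored dot in the left image corresponds to the attention map for that location.
• Attention is placed on regions whose color and texture are similar to those of the query location.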

17.

Comparison with existing methods
• Comparison of Inception Score/FID between SAGAN and existing methods
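For readers unfamiliar with the first metric, the Inception Score is commonly defined as the exponential of the mean KL divergence between the per-image class distribution p(y|x) and the marginal p(y); higher is better, whereas lower is better for FID. The sketch below assumes the class probabilities are already available as an array; in practice they come from a pretrained Inception network, and the inputs here are hypothetical.

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """probs: (num_images, num_classes) array of class probabilities p(y|x)."""
    probs = np.asarray(probs, dtype=float)
    p_y = probs.mean(axis=0, keepdims=True)                       # marginal p(y)
    kl = (probs * (np.log(probs + eps) - np.log(p_y + eps))).sum(axis=1)
    return float(np.exp(kl.mean()))

# Uninformative predictions versus confident, diverse predictions.
uniform = np.full((4, 5), 0.2)
confident = np.eye(5)
print(inception_score(uniform), inception_score(confident))
```

The score is lowest when every image yields the same class distribution and rises when each image is classified confidently while the classes are evenly covered.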

18.

Summary
• Shows that applying spectral normalization to the GAN generator stabilizes training.
• Shows that using the two-timescale update rule (TTUR) speeds up GAN training.
• Proposes Self-Attention Generative Adversarial Networks (SAGANs), which incorporate a self-attention mechanism into GANs.
• Achieves SOTA on the ImageNet dataset on both metrics, IS and FID.

19.

References
• Self-Attention Generative Adversarial Networks (the paper presented here) https://arxiv.org/pdf/1805.08318.pdf
• cGANs with Projection Discriminator https://arxiv.org/pdf/1802.05637.pdf
• Spectral Normalization for Generative Adversarial Networks https://arxiv.org/pdf/1802.05957.pdf
• GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium https://arxiv.org/pdf/1706.08500.pdf
• [DL Reading Group] Spectral Norm Regularization for Improving the Generalizability of Deep Learning/Spectral Normalization for GANs https://www.slideshare.net/DeepLearningJP2016/dl-spectral-normregularization-for-improving-the-generalizability-of-deep-learningspectralnormalization-for-gans