2021/01/08
Deep Learning JP: http://deeplearning.jp/seminar-2/
DL paper-reading group (輪読会) material
1 DEEP LEARNING JP [DL Papers] "Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis" Presenter: Takumi Ohkuma (Nakayama Lab, M2) http://deeplearning.jp/
2 Self-introduction: Takumi Ohkuma (Nakayama Lab, M2)
• Research interests: few-shot learning, deep learning for computer vision, GANs, etc.
3 Paper information
• Title: "Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis"
• Status: under review at ICLR 2021
• Authors: Anonymous
• URL: https://openreview.net/forum?id=1Fqg133qRaI
4 Overview
• Proposes a light-weight GAN that can be trained on a single GPU within hours
• Synthesizes 1024×1024 images from as few as 100 training samples
• Generator: adds a skip-layer channel-wise excitation (SLE) module
• Discriminator: trained with a self-supervised auxiliary task (Self-supervised Discriminator)
5 Sample results (figure)
• FFHQ [1]: size 1024×1024, about 10 hours of training on one RTX 2080-Ti
• Nature photographs (unsplash.com): size 1024×1024, about 20 hours on one RTX 2080-Ti
6 Background
• State-of-the-art GANs (e.g., BigGAN [2], StyleGAN2 [3]) need large datasets and heavy compute
• In many domains only a handful of images are available, and training such models is out of reach
• Existing remedies for limited data include fine-tuning a pre-trained GAN (FreezeD [4]) and data augmentation (StyleGAN2-ADA [5])
7 Why is GAN training hard with few images?
1. The discriminator quickly memorizes the small training set
2. An overfit discriminator passes uninformative gradients to the generator
3. Training becomes unstable and prone to mode collapse
All three problems trace back to the Discriminator D.
8 Proposed method: Light-weight GAN
• Two components, one per network:
• Generator (G): the skip-layer channel-wise excitation (SLE) module
• Discriminator (D): a Self-supervised Discriminator trained with an auxiliary reconstruction task
9 Skip-layer channel-wise excitation (SLE) module
• Revisits the skip connection of ResBlock [6] inside the generator G
• Instead of adding features within one block, an SLE module lets a low-resolution feature map gate a high-resolution one channel-wise after up-sampling
• This gives G shortcut gradient flow across many layers at almost no extra cost
10 SLE module vs. ResBlock [6]
Two differences from the ResBlock skip connection:
1. Channel-wise multiplication instead of element-wise addition
• The low-resolution input is pooled down to 4×4 before its convolutions, so the extra compute is far smaller than a ResBlock's
• Element-wise addition forces the two feature maps to share a spatial size; channel-wise gating does not
2. The connection skips across layers (skip-layer), not within one block
• The shortcut links feature maps at very different resolutions (e.g., 8×8 and 128×128)
• As in squeeze-and-excitation, one layer modulates another's channels globally
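The module computes y = F(x_low, {W}) · x_high: channel weights derived from the low-resolution map multiply the high-resolution map. Below is a minimal PyTorch sketch following the structure described above; the class name, exact activation, and channel counts are illustrative assumptions, not the authors' code.

    import torch
    import torch.nn as nn

    class SLEModule(nn.Module):
        """Skip-layer channel-wise excitation: gate a high-resolution
        feature map with weights computed from a low-resolution one."""
        def __init__(self, ch_low: int, ch_high: int):
            super().__init__()
            self.gate = nn.Sequential(
                nn.AdaptiveAvgPool2d(4),              # squeeze the low-res map to 4x4
                nn.Conv2d(ch_low, ch_high, 4, 1, 0),  # 4x4 -> 1x1, match channel count
                nn.LeakyReLU(0.1),
                nn.Conv2d(ch_high, ch_high, 1, 1, 0),
                nn.Sigmoid(),                         # per-channel weights in (0, 1)
            )

        def forward(self, x_low, x_high):
            # (B, C_high, 1, 1) weights broadcast over the spatial dimensions
            return x_high * self.gate(x_low)

For example, SLEModule(512, 64) would gate a (B, 64, 128, 128) feature map with weights derived from a (B, 512, 8, 8) one.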
11 Generator G
• Built from simple up-sampling blocks (Upsample → Conv → BatchNorm → GLU [12])
• A single conv per resolution keeps the model light
• SLE modules connect the low-resolution feature maps (e.g., 8×8, 16×16, 32×32) to the high-resolution ones (128×128, 256×256, 512×512) channel-wise
• Channel counts are kept very small at the 512×512 and 1024×1024 resolutions (down to 3 channels at the output), which keeps memory and compute low
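A sketch of one such generator stage under these assumptions (nearest-neighbour upsampling, and a conv that doubles the channels so the following GLU halves them back; the helper name is hypothetical):

    def up_block(ch_in: int, ch_out: int) -> nn.Sequential:
        """One generator stage: 2x upsample -> Conv -> BatchNorm -> GLU."""
        return nn.Sequential(
            nn.Upsample(scale_factor=2, mode="nearest"),
            nn.Conv2d(ch_in, ch_out * 2, 3, 1, 1, bias=False),
            nn.BatchNorm2d(ch_out * 2),
            nn.GLU(dim=1),  # halves the channels back to ch_out
        )

    # Wiring an SLE module across resolutions (illustrative channel counts):
    # f128 = up_block(128, 64)(f64)          # (B, 64, 128, 128)
    # f128 = SLEModule(512, 64)(f8, f128)    # gated by the (B, 512, 8, 8) map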
12 Why self-supervision for D?
• Few-shot GAN training suffers from mode collapse
• The cause is the discriminator overfitting the small training set
• Heavy regularization (e.g., WGAN-style penalties) can help but costs too much compute for a single-GPU budget
• Self-supervised tasks are known to stabilize GAN training [7], so the authors make the Discriminator self-supervised
13 Self-supervised Discriminator
• Treat the Discriminator as an encoder and attach small decoders to its intermediate features
• The decoders must reconstruct the input image from those features; this reconstruction task is the self-supervision
• It forces D to learn features covering the whole image instead of memorizing a few discriminative patches
14 Discriminator architecture
• The real image (1024×1024) is progressively down-sampled; real/fake logits are output as a small (5×5) map on the deepest features
• Two intermediate feature maps feed the self-supervision: f1 at 16×16 and f2 at 8×8
• One decoder takes f2 and reconstructs the whole image, compared against the real image down-sampled to 1/8 of its side length (128×128)
• A second decoder takes a random crop of f1 and reconstructs the corresponding 128×128 crop of the real image
• Each decoder is just four blocks of Upsampling + Conv + BatchNorm + GLU, so the overhead is small
• The reconstruction loss is computed on real images only
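A minimal sketch of the reconstruction branch under the description above, reusing up_block from the generator sketch. L1 distance stands in for the perceptual loss the paper uses, and the crop location is fixed here although it is sampled at random in practice:

    import torch.nn.functional as F

    class SimpleDecoder(nn.Module):
        """Tiny decoder: an 8x8 feature map up to a 128x128 RGB image."""
        def __init__(self, ch_in: int):
            super().__init__()
            self.net = nn.Sequential(
                up_block(ch_in, 128),  # 8 -> 16
                up_block(128, 64),     # 16 -> 32
                up_block(64, 32),      # 32 -> 64
                up_block(32, 16),      # 64 -> 128
                nn.Conv2d(16, 3, 3, 1, 1),
                nn.Tanh(),
            )

        def forward(self, f):
            return self.net(f)

    def d_reconstruction_loss(dec_whole, dec_part, f16, f8, real):
        """Auxiliary self-supervision for D, computed on real images only."""
        # whole-image branch: decode f2 (8x8), match the 1/8-scale real image
        loss = F.l1_loss(dec_whole(f8), F.interpolate(real, size=128))
        # part branch: decode a crop of f1 (16x16), match the same region of
        # the real image (top-left quadrant shown; random in practice)
        f_crop = f16[:, :, :8, :8]
        real_crop = real[:, :, : real.size(2) // 2, : real.size(3) // 2]
        loss = loss + F.l1_loss(dec_part(f_crop), F.interpolate(real_crop, size=128))
        return loss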
15 Loss functions
• The adversarial loss is the hinge loss, chosen for its low cost and fast training
• G is trained with the adversarial loss only
• D is trained with the adversarial loss plus the reconstruction loss of its decoders (applied to real images only)
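Written out in LaTeX (following the hinge-loss formulation in the paper; \mathcal{T} denotes the cropping/down-sampling that matches the decoder output and \mathcal{G}_{\mathrm{dec}} the decoder):

    \mathcal{L}_{\mathrm{recons}} = \mathbb{E}_{f \sim D_{\mathrm{enc}}(x),\; x \sim I_{\mathrm{real}}} \big[ \lVert \mathcal{G}_{\mathrm{dec}}(f) - \mathcal{T}(x) \rVert \big]
    \mathcal{L}_{D} = -\mathbb{E}_{x \sim I_{\mathrm{real}}} \big[ \min(0,\, -1 + D(x)) \big] - \mathbb{E}_{\hat{x} \sim G(z)} \big[ \min(0,\, -1 - D(\hat{x})) \big] + \mathcal{L}_{\mathrm{recons}}
    \mathcal{L}_{G} = -\mathbb{E}_{z \sim \mathcal{N}} \big[ D(G(z)) \big]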
16 Experimental setup
• Few-shot datasets (roughly 100 to 1000 images each); comparisons against a strong baseline and StyleGAN2 [3]
• The baseline combines:
  • DCGAN [8]
  • Spectral normalization [9]
  • Exponential moving average on the generator [10]
  • Differentiable augmentation [11]
  • GLU (gated linear unit) [12]
• The proposed model adds the SLE module and the self-supervised D on top of this baseline
• Comparisons are run at 256×256 (vs. the baseline) and at 1024×1024 (vs. StyleGAN2, including a reduced configuration denoted "@0.5")
• Metrics: FID, training time, and VRAM (GPU memory) usage
17 Results at 256×256 (FID score)
• The proposed model reaches its best FID in about 5 GPU-hours, where StyleGAN2 needs about 20 GPU-hours
• It is also competitive with StyleGAN2 fine-tuned from an FFHQ checkpoint (e.g., on the Obama dataset)
• Ablation study: removing the SLE module (skip) or the self-supervised decoder degrades FID, and without the decoder training tends to mode-collapse
18 Results on two 1024×1024 datasets
• StyleGAN2 needs far more VRAM at this resolution; its runs required an RTX TITAN
• The proposed model achieves better FID than StyleGAN2 at a fraction of the training cost
• The decoder-based self-supervision remains important at this resolution as well
19 Qualitative samples (figure): real data vs. StyleGAN2 vs. the proposed model
20 Qualitative samples, continued (figure): real data vs. StyleGAN2 vs. the proposed model
21 Training-cost comparison on FFHQ
• StyleGAN2 is reported with about 10 GPU-days of training (5 days on 2 GPUs), against roughly 1 GPU-day for the proposed model
• Even on the larger FFHQ data, the proposed model and the baseline reach competitive quality at a fraction of StyleGAN2's cost
22 Back-tracking
• To check that G does not merely memorize the handful of training images, the authors back-track: given a real image, the latent vector is optimized so that G reproduces it
• Held-out images can be reconstructed well, suggesting G has learned a smooth latent space that generalizes beyond the training set
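A minimal sketch of such latent back-tracking, assuming G accepts a (1, 256) latent and target is already scaled to G's output range; L1 distance again stands in for a perceptual loss:

    import torch
    import torch.nn.functional as F

    def back_track(G, target, steps=1000, lr=0.1):
        """Optimize a latent vector so that G(z) matches a given image."""
        z = torch.randn(1, 256, requires_grad=True)  # 256-dim latent
        opt = torch.optim.Adam([z], lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            loss = F.l1_loss(G(z), target)  # a perceptual loss works better
            loss.backward()
            opt.step()
        return z.detach()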
23 Style-mixing
• Thanks to the SLE module, the model supports style-mixing in the manner of StyleGAN2: the channel excitations computed from one image's low-resolution features act as a style code that can be applied to another image's high-resolution features
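With the SLEModule sketched earlier, mixing amounts to gating one image's high-resolution ("content") features with the excitation weights computed from another image's low-resolution ("style") features; reusing the gate this way is an assumption about how the result was produced:

    def style_mix(sle, f_low_style, f_high_content):
        """Apply one image's channel excitation to another's features."""
        return f_high_content * sle.gate(f_low_style)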
24 Conclusion
• A light-weight GAN for few-shot, high-resolution image synthesis
• Generator: the SLE module; Discriminator: self-supervision via reconstruction
• Trains a 1024×1024 model from around 100 images within about 1 GPU-day
25 Presenter's impressions
• Producing 1024×1024 images on this compute budget is impressive
• The remaining gap with a fully trained StyleGAN2 is understandable given the resources: StyleGAN2 is trained on 8 V100 GPUs, on the order of 80 GPU-days, versus about 1 GPU-day for the Lightweight GAN
26 References
1. Tero Karras, Samuli Laine, and Timo Aila. A style-based generator architecture for generative adversarial networks. In Proc. of CVPR, 2019.
2. Andrew Brock, Jeff Donahue, and Karen Simonyan. Large scale GAN training for high fidelity natural image synthesis. arXiv preprint arXiv:1809.11096, 2018.
3. Tero Karras, Samuli Laine, Miika Aittala, Janne Hellsten, Jaakko Lehtinen, and Timo Aila. Analyzing and improving the image quality of StyleGAN. In Proc. of CVPR, 2020.
4. Sangwoo Mo, Minsu Cho, and Jinwoo Shin. Freeze the discriminator: A simple baseline for fine-tuning GANs. arXiv preprint arXiv:2002.10964, 2020.
5. Tero Karras, Miika Aittala, Janne Hellsten, Samuli Laine, Jaakko Lehtinen, and Timo Aila. Training generative adversarial networks with limited data. arXiv preprint arXiv:2006.06676, 2020.
6. Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proc. of CVPR, 2016.
7. Ngoc-Trung Tran, Viet-Hung Tran, Bao-Ngoc Nguyen, Linxiao Yang, and Ngai-Man Cheung. Self-supervised GAN: Analysis and improvement with multi-class minimax game. In Advances in NeurIPS, 2019.
8. Alec Radford, Luke Metz, and Soumith Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434, 2015.
27 References (continued)
9. Takeru Miyato, Toshiki Kataoka, Masanori Koyama, and Yuichi Yoshida. Spectral normalization for generative adversarial networks. arXiv preprint arXiv:1802.05957, 2018.
10. Yasin Yazıcı, Chuan-Sheng Foo, Stefan Winkler, Kim-Hui Yap, Georgios Piliouras, and Vijay Chandrasekhar. The unusual effectiveness of averaging in GAN training. arXiv preprint arXiv:1806.04498, 2018.
11. Shengyu Zhao, Zhijian Liu, Ji Lin, Jun-Yan Zhu, and Song Han. Differentiable augmentation for data-efficient GAN training. arXiv preprint arXiv:2006.10738, 2020.
12. Yann N. Dauphin, Angela Fan, Michael Auli, and David Grangier. Language modeling with gated convolutional networks. In Proc. of ICML, 2017.