[Reading Group] Learning Continuous Image Representation with Local Implicit Image Function (CVPR2021)

November 19, 2021

#deep learning #Deep Learning #Super-resolution #Implicit Neural Representation #Image Enhancement #Neural Networks

Slide overview

2021/11/19
Deep Learning JP:
http://deeplearning.jp/seminar-2/

Text of each slide

DEEP LEARNING JP [DL Papers]
Learning Continuous Image Representation with Local Implicit Image Function (CVPR2021)
Presenter: Kazutoshi Akita (Toyota Technological Institute, Intelligent Information Media Lab)
http://deeplearning.jp/

Background
• Continuous-function representation of 3D shapes
$a x^2 + b y^2 + c z^2 = d$
Source: http://sssiii.seesaa.net/article/407308186.html

Background
• Continuous-function representation of 3D shapes
$a_1 x^2 + b_1 y^2 + c_1 z^2 = d_1$
$a_2 x^2 + b_2 y^2 + c_2 z^2 = d_2$
$a_3 x^2 + b_3 y^2 + c_3 z^2 = d_3$
...
Source: Interpolating and Approximating Implicit Surfaces from Polygon Soup - U.C. Berkeley Computer Graphics Research

http://graphics.berkeley.edu/papers/Shen-IAI-2004-08/
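
As a minimal sketch of the idea on the two background slides, a shape can be stored as a handful of quadric coefficients and then queried at any continuous coordinate. The coefficient values, and the choice to treat the shape as the union of the ellipsoids, are illustrative assumptions and are not taken from the cited sources.

```python
# Each ellipsoid is the implicit surface a*x^2 + b*y^2 + c*z^2 = d.
# The coefficients below are made-up values for illustration only.
ELLIPSOIDS = [
    (1.0, 1.0, 1.0, 1.0),   # unit sphere
    (4.0, 1.0, 1.0, 4.0),   # ellipsoid with semi-axes 1, 2, 2
]


def inside(point, ellipsoids=ELLIPSOIDS):
    """Return 1 if the point lies in or on any ellipsoid, else 0 (assumed union)."""
    x, y, z = point
    for a, b, c, d in ellipsoids:
        if a * x * x + b * y * y + c * z * z <= d:
            return 1
    return 0


print(inside((0.0, 0.0, 0.5)))   # inside the sphere
print(inside((0.0, 1.5, 0.0)))   # outside the sphere, inside the second ellipsoid
print(inside((3.0, 0.0, 0.0)))   # outside both
```

Because the shape is a function of the coordinate, it can be sampled at any resolution without storing a voxel grid.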

Background
• Implicit neural representation: a mapping from coordinates to a signal
$a_1 x^2 + b_1 y^2 + c_1 z^2 = d_1$
$a_2 x^2 + b_2 y^2 + c_2 z^2 = d_2$
$a_3 x^2 + b_3 y^2 + c_3 z^2 = d_3$
...
$f_\theta(\mathbf{x}) \in \{0, 1\}$
Acquired implicitly by a neural network (MLP) ⇒ Implicit Neural Representation
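
The shape of such a function $f_\theta$ can be sketched as a small MLP that takes a 3D coordinate and returns an occupancy value. This is only an illustration: the layer sizes, the ReLU and sigmoid activations, and the 0.5 threshold are assumptions, and the weights are random rather than trained.

```python
import math
import random


def init_mlp(layer_sizes, seed=0):
    """Random (untrained) weights; layer sizes such as [3, 16, 16, 1] are an assumption."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = [[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params


def f_theta(params, x):
    """Map a 3D coordinate to an occupancy probability in (0, 1)."""
    h = list(x)
    for i, (weights, biases) in enumerate(params):
        h = [sum(w * v for w, v in zip(row, h)) + b for row, b in zip(weights, biases)]
        if i < len(params) - 1:
            h = [max(0.0, v) for v in h]      # ReLU on hidden layers
    return 1.0 / (1.0 + math.exp(-h[0]))      # sigmoid on the single output


params = init_mlp([3, 16, 16, 1])
p = f_theta(params, (0.1, -0.2, 0.3))
print(p, int(p > 0.5))  # probability and the thresholded 0/1 label
```

After training on samples of a shape, the parameters θ would hold the shape itself, which is why the representation is called implicit.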

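Overview
• Conventional super-resolution can only upscale by integer factors, owing to the structure of CNNs
• Proposes super-resolution that can upscale by an arbitrary factor, using a continuous-function representation based on Implicit Neural Representation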
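
To see why a continuous representation removes the integer-factor restriction, note that the output image is just a set of query coordinates, and any output size can be turned into such a set. The sketch below assumes coordinates normalized to [-1, 1], pixel-centre sampling, and (y, x) ordering; the helper names are hypothetical.

```python
def pixel_centers(n, lo=-1.0, hi=1.0):
    """Centres of n equal-width pixels covering [lo, hi]."""
    step = (hi - lo) / n
    return [lo + step * (i + 0.5) for i in range(n)]


def query_grid(height, width, scale):
    """Continuous (y, x) query coordinates for an output upscaled by `scale`."""
    out_h, out_w = round(height * scale), round(width * scale)
    return [[(y, x) for x in pixel_centers(out_w)] for y in pixel_centers(out_h)]


grid = query_grid(4, 6, 2.5)      # non-integer factor
print(len(grid), len(grid[0]))    # output size: 10 rows, 15 columns
print(grid[0][0])                 # centre of the top-left output pixel
```

Each coordinate in the grid would then be passed to the decoder independently, so the scale only changes how many queries are made.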

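Proposed method
• Local implicit image function (LIIF)
$s = f_\theta(z, x)$
$s$: RGB value, $z$: latent code, $x$: 2D coordinate
(Figure labels: $x_q$, $M^{(i)}$, $f_\theta$, $z^*$)
Formulation of super-resolution:
$I^{(i)}(x_q) = f_\theta(z^*, x_q - v^*)$
$z^*$: nearest latent code from $x_q$, $v^*$: coordinate of $z^*$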
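
The formulation on this slide can be sketched as a lookup followed by a decoder call. The code assumes latent codes sit at pixel centres of a grid normalized to [-1, 1] with (y, x) ordering, and `toy_decoder` is a hypothetical stand-in for the trained MLP $f_\theta$.

```python
def pixel_centers(n, lo=-1.0, hi=1.0):
    """Centres of n equal-width pixels covering [lo, hi]."""
    step = (hi - lo) / n
    return [lo + step * (i + 0.5) for i in range(n)]


def nearest_index(centers, q):
    """Index of the centre closest to q."""
    return min(range(len(centers)), key=lambda i: abs(centers[i] - q))


def liif_query(feature_map, x_q, decoder):
    """Evaluate I(x_q) = f_theta(z*, x_q - v*) with the nearest latent code z*."""
    ys = pixel_centers(len(feature_map))
    xs = pixel_centers(len(feature_map[0]))
    i, j = nearest_index(ys, x_q[0]), nearest_index(xs, x_q[1])
    z_star = feature_map[i][j]                      # nearest latent code
    v_star = (ys[i], xs[j])                         # its coordinate
    rel = (x_q[0] - v_star[0], x_q[1] - v_star[1])  # x_q - v*
    return decoder(z_star, rel)


def toy_decoder(z, rel):
    """Hypothetical stand-in for the MLP f_theta: returns a fake RGB triple."""
    base = sum(z) / len(z)
    return (base, base + rel[0], base + rel[1])


# 2x2 feature map with 2-dimensional latent codes (made-up values).
M = [[[0.0, 0.2], [0.4, 0.6]],
     [[0.8, 1.0], [1.2, 1.4]]]
rgb = liif_query(M, (-0.4, 0.6), toy_decoder)
print(rgb)
```

Passing the relative coordinate rather than the absolute one lets the same decoder be shared across every position of the feature map.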

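Proposed method
• Technique 1: Feature unfolding
– The surrounding 8 pixels are also merged by concatenation to form the latent code
– Makes the latent code $z^*$ richer
(Figure labels: $x_q$, $f_\theta$, concat, $z^*$)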
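
Feature unfolding can be sketched as replacing each latent code with the concatenation of its 3x3 neighbourhood, which turns a D-dimensional code into a 9D-dimensional one. Zero padding at the border and the row-major neighbour order are assumptions of this sketch.

```python
def unfold_features(feature_map):
    """Concatenate each latent code with its 8 neighbours (3x3 window, zero-padded)."""
    h, w = len(feature_map), len(feature_map[0])
    d = len(feature_map[0][0])
    zeros = [0.0] * d
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            code = []
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = i + di, j + dj
                    in_bounds = 0 <= ni < h and 0 <= nj < w
                    code.extend(feature_map[ni][nj] if in_bounds else zeros)
            row.append(code)
        out.append(row)
    return out


# 2x2 feature map with 1-dimensional latent codes (made-up values).
M = [[[1.0], [2.0]],
     [[3.0], [4.0]]]
unfolded = unfold_features(M)
print(len(unfolded[0][0]))   # 9 values per position for D = 1
print(unfolded[0][0])        # [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 0.0, 3.0, 4.0]
```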

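Proposed method
• Technique 2: Local ensemble
– Using only the nearest latent code makes the latent code switch abruptly, which looks unnatural
– Ensemble over the 4 surrounding latent codes
(Figure labels: $x_q$, $f_\theta$, $z^*_{00}$, $z^*_{01}$, $z^*_{10}$, $z^*_{11}$, $S_{00}$, $S_{01}$, $S_{10}$)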
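
A minimal sketch of the local ensemble follows. It assumes that each of the four predictions is weighted by the area of the rectangle between $x_q$ and the diagonally opposite latent code (the areas labelled $S$ in the figure), normalized to sum to 1. The corner coordinates, latent codes, and `toy_decoder` are hypothetical.

```python
def ensemble_weights(x_q, v00, v11):
    """Weights for the latent codes at corners 00, 01, 10, 11 of the enclosing cell.

    Each weight is the area of the rectangle between x_q and the diagonally
    opposite corner, normalised so the four weights sum to 1.
    """
    (y, x), (y0, x0), (y1, x1) = x_q, v00, v11
    total = (y1 - y0) * (x1 - x0)
    return {
        "00": (y1 - y) * (x1 - x) / total,
        "01": (y1 - y) * (x - x0) / total,
        "10": (y - y0) * (x1 - x) / total,
        "11": (y - y0) * (x - x0) / total,
    }


def local_ensemble(x_q, corners, codes, decoder):
    """Blend the decoder outputs obtained from the four surrounding latent codes."""
    weights = ensemble_weights(x_q, corners["00"], corners["11"])
    result = [0.0, 0.0, 0.0]
    for t, w in weights.items():
        rel = (x_q[0] - corners[t][0], x_q[1] - corners[t][1])
        pred = decoder(codes[t], rel)
        result = [r + w * p for r, p in zip(result, pred)]
    return result


def toy_decoder(z, rel):
    """Hypothetical stand-in for f_theta: ignores rel and returns z as RGB."""
    return z


corners = {"00": (0.0, 0.0), "01": (0.0, 1.0), "10": (1.0, 0.0), "11": (1.0, 1.0)}
codes = {"00": [1.0, 0.0, 0.0], "01": [0.0, 1.0, 0.0],
         "10": [0.0, 0.0, 1.0], "11": [1.0, 1.0, 1.0]}
weights = ensemble_weights((0.25, 0.25), corners["00"], corners["11"])
blended = local_ensemble((0.25, 0.25), corners, codes, toy_decoder)
print(weights)
print(blended)
```

With this weighting the closest latent code dominates, and the output varies continuously as $x_q$ crosses from one cell into the next.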

Proposed method
• Technique 3: Cell decoding
$s = f_\theta(z, x)$ ⇒ $s = f_\theta(z, [x, c])$
$c = [c_h, c_w]$: height and width of the query pixel
For x4 super-resolution, $c = [1/4, 1/4]$
Qualitatively, this conditions on the upscaling factor?
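
Cell decoding only changes what is fed to the decoder, which the following sketch makes explicit. It follows the slide's convention that x4 super-resolution uses $c = [1/4, 1/4]$; the function names and the latent code values are hypothetical.

```python
def cell_for_scale(scale):
    """Cell size [c_h, c_w] of a query pixel for a given upscaling factor."""
    return [1.0 / scale, 1.0 / scale]


def decoder_input(z, rel_coord, scale):
    """Concatenate latent code, relative coordinate, and cell: [z, x, c]."""
    return list(z) + list(rel_coord) + cell_for_scale(scale)


z = [0.3, -0.1, 0.7]                   # made-up latent code
inp = decoder_input(z, (0.05, -0.02), 4)
print(inp)                             # [0.3, -0.1, 0.7, 0.05, -0.02, 0.25, 0.25]
print(cell_for_scale(30))              # cell for x30 super-resolution
```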

Proposed method
• Training

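Experimental results
• Quantitative evaluation
At the scales used in training (in-distribution), performance is on par with MetaSR
At scales not used in training (out-of-distribution), performance exceeds MetaSR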

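Experimental results
• Qualitative evaluation
Even at a scale not used in training (x30), the super-resolved result is more natural and sharper than with other methods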

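Experimental results
• Checking the effect of each technique
-c: without cell decoding
-u: without feature unfolding
-e: without local ensemble
-d: LIIF layers reduced from 5 to 3
Cell decoding lowers performance in some cases
The other techniques improve performance when used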

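Experimental results
• Qualitative evaluation of cell decoding
For x30 super-resolution, cell-1/30 is the appropriate setting
With appropriate cell decoding, sharp super-resolution is possible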

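Summary
• Proposes a super-resolution model that acquires a continuous representation of images using Implicit Neural Representation and can upscale by factors not limited to integers.
• Generates high-definition super-resolved images even at scales (e.g. x30) beyond the upscaling factors used in training (x1-x4)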