Slide Overview
Iiguni Laboratory, Department of Systems Innovation, Graduate School of Engineering Science, Osaka University. Second-year master's student, mainly interested in machine learning and deep learning.
Structure Learning via Deep Unfolding for Compressed Sensing Algorithms
Koshi Nagahisa, Iiguni Lab, Graduate School of Engineering Science, Osaka University
Outline • Introduction • Related work • Proposed method • Experiment • Conclusion 2/17
Introduction | Compressed Sensing [1] 3/17
Estimate a sparse signal $\boldsymbol{x}^* \in \mathbb{R}^N$ (where most elements are zero) from the data $\boldsymbol{y} \in \mathbb{R}^M$ obtained by the linear observation $\boldsymbol{y} = \boldsymbol{A}\boldsymbol{x}^* + \boldsymbol{e}$ ($M < N$), where $\boldsymbol{y}$ is the observation vector and $\boldsymbol{x}^*$ is the sparse vector we want to estimate.
Application: MRI (Magnetic Resonance Imaging) [2]
[Figure: an image recovered by compressed sensing from 1/4 of the data compared with an image recovered from the complete data.]
[1] D. L. Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289-1306, April 2006, doi: 10.1109/TIT.2006.871582.
[2] M. Lustig, D. L. Donoho, J. M. Santos and J. M. Pauly, "Compressed Sensing MRI," IEEE Signal Processing Magazine, vol. 25, no. 2, pp. 72-82, March 2008, doi: 10.1109/MSP.2007.914728.
Optimization problem for Compressed Sensing 4/17
Compressed sensing can be performed by solving the following optimization problem:

$\hat{\boldsymbol{x}} = \operatorname*{argmin}_{\boldsymbol{x} \in \mathbb{R}^N} \; \frac{1}{2}\|\boldsymbol{y} - \boldsymbol{A}\boldsymbol{x}\|_2^2 + \lambda \|\boldsymbol{x}\|_1$

where $\boldsymbol{y} \in \mathbb{R}^M$, $\boldsymbol{A} \in \mathbb{R}^{M \times N}$, $\boldsymbol{x} \in \mathbb{R}^N$, and $M < N$.
• Data fidelity term $\frac{1}{2}\|\boldsymbol{y} - \boldsymbol{A}\boldsymbol{x}\|_2^2$: the aim is to minimize the difference between $\boldsymbol{y}$ and $\boldsymbol{A}\boldsymbol{x}$ so that $\boldsymbol{y} \approx \boldsymbol{A}\boldsymbol{x}$.
• Regularization term $\lambda\|\boldsymbol{x}\|_1$: we estimate an $\boldsymbol{x}$ that is sparse by minimizing its L1 norm, $\|\boldsymbol{x}\|_1 = \sum_{i=1}^{N} |x_i|$.
• The regularization parameter $\lambda$ adjusts the influence of the data fidelity term and the regularization term.
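To make the objective concrete, here is a minimal NumPy sketch that evaluates it for a candidate $\boldsymbol{x}$; the function name `lasso_objective` and its argument names are my own, not from the slides.

```python
import numpy as np

def lasso_objective(y, A, x, lam):
    """Evaluate (1/2)||y - Ax||_2^2 + lam * ||x||_1 for a candidate x."""
    data_fidelity = 0.5 * np.sum((y - A @ x) ** 2)
    regularization = lam * np.sum(np.abs(x))
    return data_fidelity + regularization
```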
ISTA (Iterative Shrinkage Thresholding Algorithm) 5/17
ISTA gives the solution to the optimization problem through iterative computations:

$\boldsymbol{r}^{(t)} = \boldsymbol{x}^{(t)} - \gamma \boldsymbol{A}^{\top}(\boldsymbol{A}\boldsymbol{x}^{(t)} - \boldsymbol{y})$  ← gradient descent step, derived from $\frac{1}{2}\|\boldsymbol{y} - \boldsymbol{A}\boldsymbol{x}\|_2^2$
$\boldsymbol{x}^{(t+1)} = S_{\lambda\gamma}(\boldsymbol{r}^{(t)})$  ← shrinkage step, derived from $\lambda\|\boldsymbol{x}\|_1$

where $\gamma$ is the step size and $S_{\lambda\gamma}(\cdot)$ is the soft-thresholding function.
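Below is a minimal NumPy sketch of the two ISTA steps above with a fixed step size, as on this slide; the helper names (`soft_threshold`, `ista`) and the default iteration count are my own choices.

```python
import numpy as np

def soft_threshold(r, tau):
    """Soft-thresholding S_tau(r) = sign(r) * max(|r| - tau, 0), applied element-wise."""
    return np.sign(r) * np.maximum(np.abs(r) - tau, 0.0)

def ista(y, A, lam, gamma, n_iter=100):
    """Plain ISTA with a fixed step size gamma."""
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        r = x - gamma * A.T @ (A @ x - y)   # gradient descent step
        x = soft_threshold(r, lam * gamma)  # shrinkage step
    return x
```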
ISTA (Iterative Shrinkage Thresholding Algorithm) 6/17
The convergence properties of ISTA are greatly affected by the step size $\gamma$.
[Figure: convergence curves of ISTA for $\lambda = 10$ with different step sizes; some choices converge faster, while others diverge.]
Related Work | Deep Unfolding [3] 7/17
Applying deep learning to signal reconstruction algorithms can yield significantly better convergence properties than conventional methods.
[Figure: the signal-flow graph of an iterative algorithm (process A → process B → process C) viewed as a neural network.]
[3] K. Gregor and Y. LeCun, "Learning fast approximations of sparse coding," in Proc. 27th International Conference on Machine Learning, Jun. 2010, pp. 399–406.
Deep Unfolding 8/17
By unfolding the iterative algorithm in the time direction, we can use deep learning techniques to properly learn the parameters $\gamma^{(0)}, \gamma^{(1)}, \ldots$
[Figure: ISTA unfolded into a feed-forward network, LISTA (Learned ISTA), with input $\boldsymbol{y}$, output $\hat{\boldsymbol{x}}$, and a loss function.]
We can learn the step size for each iteration through backpropagation.
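As an illustration of this unfolding, the following PyTorch sketch unrolls ISTA over T layers and makes only the per-layer step sizes learnable, which is how LISTA is used in these slides; the class name `LISTA`, the initial value, and the single-vector interface are my assumptions, not the exact network of [3].

```python
import torch
import torch.nn as nn

class LISTA(nn.Module):
    """ISTA unfolded over T iterations with a learnable step size per layer.
    Only the step sizes gamma^(0), ..., gamma^(T-1) are trained here."""
    def __init__(self, A, lam, T=100, gamma_init=1e-3):
        super().__init__()
        self.register_buffer("A", torch.as_tensor(A, dtype=torch.float32))
        self.lam = lam
        self.gamma = nn.Parameter(torch.full((T,), gamma_init))

    def forward(self, y):
        x = torch.zeros(self.A.shape[1], device=y.device)
        for t in range(self.gamma.shape[0]):
            g = self.gamma[t]
            r = x - g * self.A.t() @ (self.A @ x - y)                          # gradient step
            x = torch.sign(r) * torch.clamp(r.abs() - self.lam * g, min=0.0)   # shrinkage step
        return x
```

Training the step sizes by backpropagation then amounts to minimizing a loss on the network output with a standard optimizer.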
Deep Unfolding 9/17
The step sizes in LISTA take appropriate values for each iteration, resulting in faster convergence than ISTA with a fixed step size.
[Figure: convergence comparison of LISTA and ISTA for $\lambda = 10$.]
Deep Unfolding 10/17
Problem: conventional deep unfolding has a fixed network structure with no flexibility, and it is unclear whether this fixed structure has the best convergence properties.
[Figure: conventional deep unfolding with a fixed structure versus an ideal deep unfolding in which the structure is learned together with the step sizes; a structure different from the original one may perform better.]
NAS (Neural Architecture Search) [4] 11/17
NAS learns the structure of the neural network itself along with its parameters.
[Figure: the conventional workflow (design the network structure → set hyperparameters → train the parameters) versus the NAS workflow, in which the structure design is part of the learning.]
Naive NAS constructs many models and selects the optimal one, achieving high performance but at an enormous computational cost.
[4] B. Zoph and Q. V. Le, "Neural architecture search with reinforcement learning," arXiv preprint arXiv:1611.01578, 2016.
DARTS (Differentiable Architecture Search) [5] 12/17
Making the search space differentiable enables efficient execution of NAS. Each node computes a weighted mixture of candidate operations (e.g., conv 3×3, pooling, ReLU):

$\bar{o}(x) = \sum_{k=1}^{3} w_k \, o_k(x), \qquad w_k = \frac{\exp(\alpha_k)}{\sum_{k'=1}^{3}\exp(\alpha_{k'})} \quad (w_1 + w_2 + w_3 = 1)$

where $\alpha_k$ is the structure parameter for operation $o_k$.
[5] H. Liu, K. Simonyan, and Y. Yang, "DARTS: Differentiable Architecture Search," International Conference on Learning Representations, 2018.
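A minimal PyTorch sketch of this softmax-weighted mixture of operations follows; the class name `MixedOp` and its zero-initialized structure parameters are illustrative assumptions, not the full DARTS cell of [5].

```python
import torch
import torch.nn as nn

class MixedOp(nn.Module):
    """DARTS-style mixed operation: the output is a softmax-weighted sum of
    candidate operations, with weights derived from structure parameters alpha."""
    def __init__(self, ops):
        super().__init__()
        self.ops = nn.ModuleList(ops)
        self.alpha = nn.Parameter(torch.zeros(len(ops)))  # structure parameters alpha_k

    def forward(self, x):
        w = torch.softmax(self.alpha, dim=0)  # w_k = exp(alpha_k) / sum_k' exp(alpha_k')
        return sum(w_k * op(x) for w_k, op in zip(w, self.ops))
```

Because the weights are a softmax of the alpha parameters, the structure choice becomes differentiable and can be trained by gradient descent alongside the ordinary network parameters.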
Proposed method 13/17
We apply DARTS to deep unfolding in order to learn structures more suitable for compressed sensing.
Proposed algorithm: AS-ISTA (Architecture Searched ISTA)

$\boldsymbol{r}^{(t)} = w_{r,1}^{(t)} f(\boldsymbol{x}^{(t)}) + w_{r,2}^{(t)} g(\boldsymbol{x}^{(t)})$
$\boldsymbol{x}^{(t+1)} = w_{x,1}^{(t)} f(\boldsymbol{r}^{(t)}) + w_{x,2}^{(t)} g(\boldsymbol{r}^{(t)})$

$w_{r,k}^{(t)} = \frac{\exp(\beta_{r,k}^{(t)})}{\sum_{k'=1}^{2}\exp(\beta_{r,k'}^{(t)})}, \qquad w_{x,k}^{(t)} = \frac{\exp(\beta_{x,k}^{(t)})}{\sum_{k'=1}^{2}\exp(\beta_{x,k'}^{(t)})} \quad (k = 1, 2)$

Gradient descent step: $f(\boldsymbol{x}) = \boldsymbol{x} - \gamma^{(t)} \boldsymbol{A}^{\top}(\boldsymbol{A}\boldsymbol{x} - \boldsymbol{y})$
Shrinkage step: $g(\boldsymbol{x}) = S_{\lambda\gamma^{(t)}}(\boldsymbol{x})$
Structure parameters: $\beta_{r,1}^{(t)}, \beta_{r,2}^{(t)}, \beta_{x,1}^{(t)}, \beta_{x,2}^{(t)}$; step size: $\gamma^{(t)}$
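The update above can be sketched as a single PyTorch layer as follows; the class name `ASISTALayer` and the initial values are my own, and this is an illustrative reading of the slide's equations rather than a definitive implementation.

```python
import torch
import torch.nn as nn

class ASISTALayer(nn.Module):
    """One AS-ISTA iteration: the gradient step f and the shrinkage step g are
    combined in parallel with softmax weights derived from structure parameters beta."""
    def __init__(self, A, lam, gamma_init=1e-3):
        super().__init__()
        self.register_buffer("A", torch.as_tensor(A, dtype=torch.float32))
        self.lam = lam
        self.gamma = nn.Parameter(torch.tensor(gamma_init))  # step size gamma^(t)
        self.beta_r = nn.Parameter(torch.zeros(2))            # beta_{r,1}, beta_{r,2}
        self.beta_x = nn.Parameter(torch.zeros(2))            # beta_{x,1}, beta_{x,2}

    def f(self, x, y):
        """Gradient descent step."""
        return x - self.gamma * self.A.t() @ (self.A @ x - y)

    def g(self, x):
        """Shrinkage (soft-thresholding) step."""
        tau = self.lam * self.gamma
        return torch.sign(x) * torch.clamp(x.abs() - tau, min=0.0)

    def forward(self, x, y):
        w_r = torch.softmax(self.beta_r, dim=0)
        w_x = torch.softmax(self.beta_x, dim=0)
        r = w_r[0] * self.f(x, y) + w_r[1] * self.g(x)
        x_next = w_x[0] * self.f(r, y) + w_x[1] * self.g(r)
        return x_next
```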
Proposed method 14/17
Structure of AS-ISTA: in each layer the gradient descent step $f$ and the shrinkage step $g$ are connected in parallel, with input $\boldsymbol{y}$ and a loss function at the output. We learn the structure parameters and the step sizes alternately.
[Figure: unrolled AS-ISTA network with structure parameters and step sizes marked.]
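A hedged sketch of this alternating training loop is shown below; the function name `train_alternating`, the choice of Adam, and the assumption that `model` is a full unrolled network mapping $\boldsymbol{y}$ directly to an estimate $\hat{\boldsymbol{x}}$ are all my own.

```python
import torch

def train_alternating(model, data_loader, loss_fn, n_epochs=10, lr=1e-3):
    """Alternate between updating the step sizes (gamma) and the structure
    parameters (beta) of an unrolled AS-ISTA-style model."""
    betas  = [p for n, p in model.named_parameters() if "beta" in n]
    gammas = [p for n, p in model.named_parameters() if "gamma" in n]
    opt_beta  = torch.optim.Adam(betas,  lr=lr)
    opt_gamma = torch.optim.Adam(gammas, lr=lr)
    for _ in range(n_epochs):
        for y in data_loader:
            for opt in (opt_gamma, opt_beta):  # alternate: step sizes, then structure
                opt.zero_grad()
                x_hat = model(y)
                loss_fn(y, x_hat).backward()
                opt.step()
```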
Experiment | Simulation Settings 15/17
Generation of training data
• $\boldsymbol{y} = \boldsymbol{A}\boldsymbol{x}^* + \boldsymbol{e}$ with $\boldsymbol{A} \in \mathbb{R}^{75 \times 150}$, entries of $\boldsymbol{A} \sim N(0,1)$
• Noise $\boldsymbol{e} \sim N(0, \sigma^2)$ (signal-to-noise ratio 10 dB)
• Percentage of non-zero components: 8%; non-zero elements $\sim N(0,1)$
Settings of deep learning
• Maximum number of iterations: 100
• Loss function: $\frac{1}{2}\|\boldsymbol{y} - \boldsymbol{A}\hat{\boldsymbol{x}}\|_2^2 + \lambda\|\hat{\boldsymbol{x}}\|_1$ (the objective function of ISTA)
• Initial values of the structure parameters: $\beta_{r,1}^{(t)} = \beta_{r,2}^{(t)} = \beta_{x,1}^{(t)} = \beta_{x,2}^{(t)} = 0$, so that $w_{r,1}^{(t)} = w_{r,2}^{(t)} = w_{x,1}^{(t)} = w_{x,2}^{(t)} = 0.5$
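The data-generation settings above might be reproduced with a sketch like the following; the helper name `make_training_sample` and the exact convention for deriving the noise variance from the 10 dB SNR are my assumptions.

```python
import numpy as np

def make_training_sample(M=75, N=150, sparsity=0.08, snr_db=10.0, rng=None):
    """Generate (y, A, x_star) following the slide's settings: A ~ N(0,1),
    8% non-zero entries ~ N(0,1), additive Gaussian noise at 10 dB SNR."""
    rng = np.random.default_rng() if rng is None else rng
    A = rng.standard_normal((M, N))
    x_star = np.zeros(N)
    support = rng.choice(N, size=int(sparsity * N), replace=False)
    x_star[support] = rng.standard_normal(len(support))
    signal = A @ x_star
    noise_var = np.mean(signal ** 2) / (10.0 ** (snr_db / 10.0))  # assumed SNR convention
    e = rng.standard_normal(M) * np.sqrt(noise_var)
    return signal + e, A, x_star
```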
Result 16/17
We compare the estimation accuracy (MSE: mean squared error) at each iteration of ISTA, LISTA (only the step size $\gamma$ learned by conventional deep unfolding), and AS-ISTA (proposed).
[Figure: MSE versus iteration for $\lambda = 10$, $\lambda = 1.0$, and $\lambda = 0.1$.]
AS-ISTA converges most quickly to the ISTA estimated solution. It also converges within 100 iterations for small regularization parameters $\lambda$. This shows that the proposed deep unfolding can learn parameters with better convergence properties than conventional deep unfolding.
Conclusion 17/17
Summary
• Deep learning techniques can be applied to ISTA, an algorithm for compressed sensing, to learn parameters with good convergence properties.
• DARTS achieves structure learning by connecting operations in parallel and learning their weights.
• Compared to conventional deep unfolding, the proposed method incorporating structure learning (DARTS) converges faster and shows good convergence properties for various $\lambda$.
Future work
• I will apply the proposed method to other compressed sensing algorithms to see whether proper structure learning can be obtained there as well.
• I will examine whether the results obtained on this sparse signal reconstruction problem can be applied to actual image processing problems.
Appendix | Comparison of learned step sizes 18/17
The step size of AS-ISTA exhibits a zigzag pattern, gradually decreasing as the iteration converges.
[Figure: learned step sizes per iteration for $\lambda = 10$, $\lambda = 1$, and $\lambda = 0.1$.]
Appendix | Learned structure of AS-ISTA 19/17
These graphs show the learned weights of the gradient descent step and the shrinkage step at each iteration.
[Figure: learned weights for $\lambda = 10$, $\lambda = 1$, and $\lambda = 0.1$.]