Light Roasted Use of Caffe in Yahoo! JAPAN


March 24, 2016

#Caffe #Deep Learning #Image Recognition #Object Detection #Saliency Map

Slide overview

These are the slides from the talk on using Caffe for image processing, given by Yamashita of Yahoo! JAPAN at "Deep Learning Tokyo 2016", held at Yahoo! JAPAN on March 20, 2016.

Yahoo! Developer Network

@ydnjp



Text of each page

Light Roasted Use of Caffe in Yahoo! JAPAN
deep learning use case in Yahoo! JAPAN #1
Naoaki Yamashita
Yahoo! JAPAN, Osaka Branch
2016-03-20
http://www.yahoo.co.jp/


Introduction
We use Caffe widely to improve Yahoo! JAPAN services that involve images. Straightforward uses via fine-tuning (thanks to the BVLC pre-trained models!):
• Image classification tasks
• Image feature extraction for other ML methods (ranking, etc.)
These are easy to use and to apply to production systems.
Copyright (C) 2016 Yahoo Japan Corporation. All Rights Reserved. Unauthorized quotation or reproduction prohibited.

Introduction
However, the application of other rich or complicated DNN models is still limited for now, due to a lack of:
• Well-labeled, large in-domain data
• Resources, including skills: training deep models without pre-training is still difficult for me.. :'-(
• Quality
• etc.
Simplicity and quality are the key factors for applying a model to a production system.

Introduction
Under these limitations, for production systems we try to use Caffe in non-straightforward ways, with customized models and layers. For example:
• Case 1. Object Area Ratio Estimation (Face): idea + implementation
• Case 2. Saliency Map: idea + implementation

Case 1. Area Ratio Estimation (Face)
Area Ratio Estimation (Face): estimate the fraction of an image's area occupied by a specific kind of object. Example: Facial Area = 23%.
The output is used as one of the features for improving search engine results. We already have face detectors, but we try Caffe here to keep things simple and fast.

Case 1. Area Ratio Estimation (Face)
Model: (equation missing in the extracted text), defined in terms of the image, the area ratio, and the estimated ratio
Loss: (equation missing in the extracted text)
Optimization: (equation missing in the extracted text)
TODO:
• Make training / test data (pairs of image and area ratio)
• Implement a loss layer for Caffe
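The slides do not say how the (image, area ratio) pairs were produced. As a minimal sketch, assuming face bounding boxes are available from an existing detector or from manual annotation, the label could be computed as the fraction of pixels covered by the union of the boxes. The function name `area_ratio` and the box format are hypothetical.

```python
def area_ratio(boxes, width, height):
    """Fraction of a width x height image covered by the union of boxes.

    Each box is (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
    The box format is an assumption made for this sketch.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image size must be positive")
    covered = [[False] * width for _ in range(height)]
    for x0, y0, x1, y1 in boxes:
        # Clip each box to the image bounds.
        x0, y0 = max(0, x0), max(0, y0)
        x1, y1 = min(width, x1), min(height, y1)
        for y in range(y0, y1):
            row = covered[y]
            for x in range(x0, x1):
                row[x] = True  # overlapping boxes are counted once
    return sum(sum(row) for row in covered) / (width * height)
```

Rasterizing the union keeps overlapping faces from being counted twice, at the cost of a per-pixel loop that is only reasonable for offline label generation.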

Case 1. Area Ratio Estimation (Face)
#Training data = 21000, #Test data = 5064
Model: just use the BVLC GoogLeNet model.
GoogLeNet → 1-dim output in [0, 1] → implemented loss layer → Train! → Not bad
[Figures: Add Loss (train / test); test result (prediction - ground truth)]
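The loss used in the talk did not survive in the slide text, so the following is only an illustration of what a loss for a one-dimensional output in [0, 1] can look like. It assumes a sigmoid on the network's single output and a mean squared error against the target ratio; both choices, and the name `ratio_loss`, are assumptions rather than the author's layer. The forward value and the gradient correspond to what a custom Caffe loss layer would compute in its forward and backward passes.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def ratio_loss(logits, targets):
    """Return (loss, gradients) for mean of 0.5 * (sigmoid(z) - y)^2.

    logits: raw scalar outputs of the network, one per image.
    targets: ground-truth area ratios in [0, 1].
    The gradients are taken with respect to the logits.
    """
    n = len(logits)
    loss = 0.0
    grads = []
    for z, y in zip(logits, targets):
        p = sigmoid(z)
        diff = p - y
        loss += 0.5 * diff * diff
        # Chain rule: d(loss)/dz = (p - y) * p * (1 - p), averaged over the batch.
        grads.append(diff * p * (1.0 - p) / n)
    return loss / n, grads
```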

Case 1. Area Ratio Estimation (Face)
We are now evaluating how it works in a search engine.
[Figure: Before / After panels, each showing results 1 to 6 on a Large-to-Small scale]

Case 2. Saliency Map
A saliency map is a topographically arranged map that represents the visual saliency of a corresponding visual scene (Scholarpedia).
→ We use this map to crop an image to several aspect ratios.


Case 2. Saliency Map
Example: the recommendation timeline in Yahoo! JAPAN's news app, which uses 2:1, 1:1, and 4:3 images.
We need to produce properly cropped images for several devices.
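How a saliency map turns into a crop is not spelled out in the slides. One simple possibility, sketched below under that caveat, is to take the largest window with the requested aspect ratio and slide it to the position with the highest total saliency. The function name `best_crop` and the use of a summed-area table are choices made for this illustration.

```python
def best_crop(saliency, aspect_w, aspect_h):
    """Return (x0, y0, w, h) of the max-saliency window with ratio aspect_w:aspect_h.

    saliency: 2D list of non-negative numbers (rows of equal length).
    The window is the largest one with the requested ratio that fits the map.
    """
    height, width = len(saliency), len(saliency[0])
    # Integer arithmetic avoids rounding surprises when fitting the window.
    if width * aspect_h <= height * aspect_w:
        w, h = width, max(1, width * aspect_h // aspect_w)
    else:
        w, h = max(1, height * aspect_w // aspect_h), height

    # Summed-area table: integral[y][x] = sum of saliency[:y][:x].
    integral = [[0.0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            integral[y + 1][x + 1] = (saliency[y][x] + integral[y][x + 1]
                                      + integral[y + 1][x] - integral[y][x])

    best, best_sum = (0, 0, w, h), float("-inf")
    for y0 in range(height - h + 1):
        for x0 in range(width - w + 1):
            total = (integral[y0 + h][x0 + w] - integral[y0][x0 + w]
                     - integral[y0 + h][x0] + integral[y0][x0])
            if total > best_sum:  # ties keep the first (top-left) window
                best, best_sum = (x0, y0, w, h), total
    return best
```

Calling it once per target ratio, for example `best_crop(m, 2, 1)`, `best_crop(m, 1, 1)`, and `best_crop(m, 4, 3)`, gives one crop rectangle per layout from the same map.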


Case 2. Saliency Map
Traditional image processing methods are simple and light, but they are not object-aware.
→ They are shape / color / texture aware.
Example: "Boolean Map Saliency (BMS)", Zhang, Sclaroff (ICCV 2013). Good performance on the MIT300 / CAT2000 benchmarks.


Case 2. Saliency Map
To make this object-aware → use a CNN!
As you know, convolutional filter outputs seem to carry saliency.
[Figure: inception_4a/output, inception_5b/output]


Case 2. Saliency Map
Q1: Which models / training data are suitable for exploiting the filter images?
• Broad image categories: portrait, landscape, animal, sports, chart/graph, drawing, etc.
• Not single-object images
Models trained on ImageNet would not be adequate...


Case 2. Saliency Map
Q1: Which models / training data are suitable for exploiting the filter images?
→ "Salient Object Subitizing (SOS)", Zhang, Ma, Sameki, Sclaroff, Betke, Lin, Shen, Price (CVPR 2015)
Estimate the number of salient objects in an image as one of (0, 1, 2, 3, 4+), e.g. (0, 0, 0, 0, 1) for 4+. This is a 5-class classifier.
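The label encoding described above can be sketched as follows. The helper name `sos_label` is hypothetical; the only fact taken from the slide is the five classes (0, 1, 2, 3, 4+) and the one-hot form (0, 0, 0, 0, 1) for the last class.

```python
def sos_label(num_salient, max_count=4):
    """One-hot label over the classes 0, 1, ..., max_count-1, max_count+.

    Counts at or above max_count all fall into the last class.
    """
    if num_salient < 0:
        raise ValueError("object count must be non-negative")
    label = [0] * (max_count + 1)
    label[min(num_salient, max_count)] = 1
    return label
```

For training with a softmax classifier in Caffe, the integer class index `min(num_salient, 4)` would normally be stored instead of the one-hot vector.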


Case 2. Saliency Map
SOS: just count the "salient" objects! (What counts as salient may depend on the image context, such as a text description.)
The data set is simple: no region information and no specific object labels.
We expect the filters of a trained model to reflect the coarse positions of salient objects.


Case 2. Saliency Map
Feasibility check: use the SOS dataset. http://cs-people.bu.edu/jmzhang/sos.html
(Commercial use is not allowed, but we should be able to build similar data ourselves because it's simple!!)
Train GoogLeNet (here we used NVIDIA DIGITS, with no data augmentation).
[Figure: training loss / accuracy] #Train = 5520


Case 2. Saliency Map
Q2: How do we create a proper saliency map from the filter images?
• Naive summation of the filters is noisy..
• There seem to be good filters and bad filters. We want to select among them easily, if possible..
[Figure: Good / Not Good]


Case 2. Saliency Map
→ How do we enhance the signal and reduce the noise?
[Figure: Conv. Nets | Conv. Nets + FC Nets + Softmax + probability]
The information in these layers is not used now. We want to use it!


Case 2. Saliency Map
1. Estimate the confidence of the prediction via its entropy, and use it to calculate differential values.
2. Assume that the response of the result to each filter is given by the filter values and the differentials as: (equation missing in the extracted text), the response of each filter.
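Step 1 can be illustrated with a small sketch. It computes the entropy of a softmax prediction and the gradient of that entropy with respect to the logits, which is the quantity that could seed a backward pass. Whether the talk differentiated exactly this quantity is not recoverable from the slide text, so treat it as an assumption. The gradient formula used is dH/dz_k = -p_k (log p_k + H).

```python
import math


def softmax(logits):
    """Softmax with the usual max-subtraction for numerical stability."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; low entropy means a confident prediction."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_grad(probs):
    """Gradient of the entropy with respect to the softmax logits."""
    h = entropy(probs)
    return [-p * (math.log(p) + h) if p > 0.0 else 0.0 for p in probs]
```

For the 5-class SOS classifier the entropy ranges from 0 (fully confident) to log 5 (uniform prediction), and the gradient vanishes at the uniform prediction.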


Case 2. Saliency Map
Use the values as the weights for the filter images: (equation missing in the extracted text): saliency map. (The symbols in the slide's "here we use ... to make ..." note are also missing in the extracted text.)
Numerically, the differential value is given by backprop (Simonyan, Vedaldi, Zisserman (ICLR 2014)).
[Figure: Backward Convs. / FC.]
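Because the slide's formulas are lost, the following is only one plausible reading of "filter values and differentials as weights", not the author's definition. It assumes each filter's weight is the sum over positions of activation times back-propagated gradient, and that the saliency map is the weighted sum of the filter images with negative values clipped to zero. Both function names and the clipping step are assumptions.

```python
def filter_weights(activations, gradients):
    """One scalar per filter: sum over positions of activation * gradient.

    activations, gradients: lists of 2D lists with identical shapes,
    one 2D map per filter of the chosen convolutional layer.
    """
    return [
        sum(a * g
            for row_a, row_g in zip(act, grad)
            for a, g in zip(row_a, row_g))
        for act, grad in zip(activations, gradients)
    ]


def weighted_saliency(activations, weights):
    """Weighted sum of the filter images, clipped at zero (an assumption)."""
    h, w = len(activations[0]), len(activations[0][0])
    out = [[0.0] * w for _ in range(h)]
    for act, wk in zip(activations, weights):
        for y in range(h):
            for x in range(w):
                out[y][x] += wk * act[y][x]
    return [[max(0.0, v) for v in row] for row in out]
```

In pycaffe the activations and gradients would typically be read from a blob's `data` and `diff` arrays after a forward and a backward pass; the sketch uses plain lists so that it stays independent of any framework.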


Case 2. Saliency Map
Create a saliency map.
[Figure: pipeline diagram with the labels Input, CNN, Forward (Convs), Entropy (confidence level), Backward, Resize / Smoothing, Output]
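The final "Resize / Smoothing" stage can be sketched as below. The slides do not name the interpolation or the smoothing filter, so nearest-neighbor resizing and a box filter are stand-ins chosen for simplicity, and the normalization to [0, 1] is an added convenience.

```python
def resize_nearest(m, out_h, out_w):
    """Nearest-neighbor resize of a 2D list to out_h x out_w."""
    in_h, in_w = len(m), len(m[0])
    return [[m[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
            for y in range(out_h)]


def box_smooth(m, radius=1):
    """Box filter; windows are truncated at the image border."""
    h, w = len(m), len(m[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [m[j][i]
                    for j in range(max(0, y - radius), min(h, y + radius + 1))
                    for i in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = sum(vals) / len(vals)
    return out


def normalize(m):
    """Scale values linearly to [0, 1]; a constant map becomes all zeros."""
    lo = min(min(row) for row in m)
    hi = max(max(row) for row in m)
    if hi == lo:
        return [[0.0] * len(row) for row in m]
    return [[(v - lo) / (hi - lo) for v in row] for row in m]
```

A coarse map taken from a late convolutional layer would be passed through `resize_nearest` to the input image size, then `box_smooth`, then `normalize`, before being used for cropping.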


Case 2. Saliency Map
→ But it is not enough. [Figure: bad cases]
Calibration of the salient position, more training data, better smoothing, etc. are needed to improve this. Of course, it is necessary to build a private SOS dataset for production systems.
We are now working on a Face SOS model to turn it into a coarse face detector!


Summary
We try to use Caffe in non-straightforward ways for production systems. The two cases,
• Object Area Estimation
• Saliency Map
will be applied to production systems before long!
