March 30, 2016
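Slide overview
These slides were presented by Okura of Yahoo! JAPAN at "Deep Learning Tokyo 2016", held at Yahoo! JAPAN on March 20, 2016. They describe a case study of using deep learning for news article recommendation on the Yahoo! top page.
We moved to Speaker Deck in October 2023. See here for the latest information: https://speakerdeck.com/lycorptech_jp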
YAHOO! JAPAN
Deep Learning for News Recommendation
Deep learning use case in Yahoo! JAPAN #2
Shumpei Okura, Yahoo Japan Corporation
http://www.yahoo.co.jp/
Deep Learning Tokyo 2016
Front page of Yahoo! JAPAN for Smartphone
Topics Module: The top 6 common articles, selected by human experts, are displayed.
Personalized Module: Various articles, selected by the system for each user, are displayed. Today, I will talk about the system behind this module.
Copyright (C) 2016 Yahoo Japan Corporation. All Rights Reserved. Unauthorized quotation or reproduction prohibited.
Overview of the System
[Diagram: Browsing histories, User features, Page view estimator, Posted articles, Search Engine]
Today, we introduce 4 key parts of this system.
1. Create Article Representation
De-noising Auto-Encoder
We basically use a denoising auto-encoder. The input $x$ is a bag-of-words vector for an article.
Corrupt: $x \to \tilde{x}$
Encode: $h = \sigma(W\tilde{x} + b)$
Decode: $y = \sigma(W'h + b')$
Training: $\theta = \arg\min \sum_{x} L(y, x)$
The hidden vector $h$ is a good feature of the article. However, these vectors do NOT necessarily construct a good inner product space.
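
The corrupt, encode, and decode steps can be illustrated with a small forward pass. This is only a sketch under assumptions that the slides do not state: masking noise as the corruption, binary cross-entropy as $L(y, x)$, random untrained weights, and a toy 6-word vocabulary. All function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def corrupt(x, drop_prob, rng):
    """Masking noise (an assumed choice): zero each component with probability drop_prob."""
    return [0.0 if rng.random() < drop_prob else v for v in x]


def affine_sigmoid(weights, bias, v):
    """Compute sigma(W v + b) for a row-major weight matrix."""
    return [sigmoid(sum(w * a for w, a in zip(row, v)) + b)
            for row, b in zip(weights, bias)]


def cross_entropy(y, x, eps=1e-12):
    """Binary cross-entropy, one possible choice for L(y, x)."""
    return -sum(xi * math.log(yi + eps) + (1 - xi) * math.log(1 - yi + eps)
                for xi, yi in zip(x, y))


def dae_forward(x, W, b, W_dec, b_dec, drop_prob, rng):
    x_tilde = corrupt(x, drop_prob, rng)   # Corrupt
    h = affine_sigmoid(W, b, x_tilde)      # Encode: h = sigma(W x~ + b)
    y = affine_sigmoid(W_dec, b_dec, h)    # Decode: y = sigma(W' h + b')
    return h, y, cross_entropy(y, x)       # the loss compares y with the clean x


rng = random.Random(0)
vocab_size, hidden_size = 6, 3
W = [[rng.uniform(-0.5, 0.5) for _ in range(vocab_size)] for _ in range(hidden_size)]
b = [0.0] * hidden_size
W_dec = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size)] for _ in range(vocab_size)]
b_dec = [0.0] * vocab_size

x = [1.0, 0.0, 1.0, 0.0, 0.0, 1.0]  # toy binary bag-of-words vector
h, y, loss = dae_forward(x, W, b, W_dec, b_dec, drop_prob=0.3, rng=rng)
print(len(h), len(y), loss > 0)
```

Training would adjust the weights to minimize the summed loss over all articles; after training, `h` computed from an article would serve as its vector.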
Why focus on the inner product?
The system must respond within 25 ms of a request. Therefore, we can NOT use complex classifiers.
[Diagram: Posted articles, Search Engine]
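
To make the latency argument concrete, scoring by inner product reduces matching to a top-K search over precomputed article vectors. The sketch below is a brute-force version for illustration only. The slides do not describe how the search engine is implemented, and the article ids and vectors here are made up.

```python
import heapq


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def top_k_by_inner_product(user_vec, article_vecs, k):
    """Return the ids of the k articles with the largest inner product."""
    scored = ((dot(user_vec, vec), article_id)
              for article_id, vec in article_vecs.items())
    return [article_id for _, article_id in heapq.nlargest(k, scored)]


# Hypothetical article vectors, computed ahead of time.
article_vecs = {
    "soccer_a": [0.9, 0.1, 0.0],
    "politics": [0.0, 0.2, 0.9],
    "soccer_b": [0.8, 0.3, 0.1],
}
user_vec = [1.0, 0.2, 0.0]
top2 = top_k_by_inner_product(user_vec, article_vecs, k=2)
print(top2)  # ['soccer_a', 'soccer_b']
```

Because the only work at request time is a set of dot products, the expensive models can run offline when articles are posted or histories are updated.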
Training with Triplet of Articles
Each training example is a triplet: a base article ($x_0$), an article in a similar category ($x_1$), and an article in a non-similar category ($x_2$). We want $h_0^\top h_1 > h_0^\top h_2$.
$\theta = \arg\min \sum_{(x_0, x_1, x_2)} \left[ \sum_{n=0}^{2} L(y_n, x_n) - \alpha \log \sigma(h_0^\top h_1 - h_0^\top h_2) \right]$
Re-construction loss + inner product penalty → It can construct a better inner product space.
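
The objective for a single triplet can be evaluated directly from the hidden vectors. In the sketch below, the reconstruction losses are placeholder numbers and the vectors are toy values, so it only shows how the penalty term behaves. The helper names are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def log_sigmoid(z):
    """Numerically stable log(sigma(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def triplet_objective(recon_losses, h0, h1, h2, alpha):
    """Sum of L(y_n, x_n) minus alpha * log sigma(h0.h1 - h0.h2) for one triplet."""
    penalty = -alpha * log_sigmoid(dot(h0, h1) - dot(h0, h2))
    return sum(recon_losses) + penalty


h0 = [0.9, 0.1]          # base article
h1 = [0.8, 0.2]          # article from a similar category
h2 = [0.1, 0.9]          # article from a non-similar category
recon = [0.5, 0.4, 0.6]  # placeholder values for L(y_n, x_n), n = 0, 1, 2

good = triplet_objective(recon, h0, h1, h2, alpha=1.0)
bad = triplet_objective(recon, h0, h2, h1, alpha=1.0)  # similar and non-similar swapped
print(good < bad)  # True
```

The penalty shrinks as $h_0^\top h_1$ grows relative to $h_0^\top h_2$, which is what pushes similar articles toward a larger inner product.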
2. Create User Representation
Encoding by Long Short-Term Memory
We encode the browsing history into a user vector with a recurrent neural network.
[Diagram: the articles (NEWS) in the browsing history, from past to current, are fed one by one into a chain of LSTM units, which outputs the User Vector. The user vector is matched against News A, News B, and News C, and a click on one of them is observed.]
The network is trained on click feedback by backpropagation through time (BPTT).
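
The encoding step can be sketched as a standard LSTM cell run over the article vectors of a browsing history. This is an illustration with random, untrained weights. The gate layout, the sizes, and the choice to give the user vector the same dimension as the article vectors are assumptions, not details from the slides, and training by BPTT is not shown.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(M, v):
    return [sum(m * a for m, a in zip(row, v)) for row in M]


def lstm_step(params, x, h, c):
    """One LSTM step; every gate sees the concatenation [x; h]."""
    xh = x + h
    pre = {}
    for name in ("i", "f", "o", "g"):
        W, b = params[name]
        pre[name] = [p + q for p, q in zip(matvec(W, xh), b)]
    i = [sigmoid(v) for v in pre["i"]]      # input gate
    f = [sigmoid(v) for v in pre["f"]]      # forget gate
    o = [sigmoid(v) for v in pre["o"]]      # output gate
    g = [math.tanh(v) for v in pre["g"]]    # candidate cell state
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new


def encode_user(params, history, hidden_size):
    """Run the LSTM over article vectors, oldest first; the final h is the user vector."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for article_vec in history:
        h, c = lstm_step(params, article_vec, h, c)
    return h


def init_params(input_size, hidden_size, rng):
    def gate():
        W = [[rng.uniform(-0.1, 0.1) for _ in range(input_size + hidden_size)]
             for _ in range(hidden_size)]
        return W, [0.0] * hidden_size
    return {name: gate() for name in ("i", "f", "o", "g")}


rng = random.Random(0)
article_dim = hidden_size = 4  # assumed equal, so user and article vectors can be compared
params = init_params(article_dim, hidden_size, rng)
history = [[rng.random() for _ in range(article_dim)] for _ in range(3)]
user_vec = encode_user(params, history, hidden_size)
print(len(user_vec))
```

A history of any length yields a vector of the same size, which is what allows a single inner product against each article vector at request time.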
3. Search & De-duplication
• We have already constructed the article vectors and the user vector.
• So the only remaining task is to display the K articles nearest to the user vector, isn't it?
NO!
Naïve Implementation (Bad Example)
User: "I like soccer. The most recent game of my favorite team is Game A."
Recommended for you: Game A results (XX Times), Game A results (YY Sports), Game A results (ZZ Journal), Game A results (WW Soccer)
These are articles with almost the same content from different providers.
Photo: Aflo
De-duplication with Article Vectors
Recommended for you: Game A results (XX Times), Game A results (YY Sports), Game A results (ZZ Journal), Game A results (WW Soccer)
We compare the article vectors. If an article is too similar to the previous one → remove it from the recommended list.
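
One way to realize this rule is a greedy pass over the ranked list. The sketch below makes several assumptions that go beyond the slide: similarity is measured by cosine, the threshold is 0.95, and each candidate is compared against every article already kept rather than only the one directly before it. The ids and vectors are invented.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def deduplicate(ranked_ids, article_vecs, threshold):
    """Walk the ranked list; drop an article that is too similar to one already kept."""
    kept = []
    for article_id in ranked_ids:
        vec = article_vecs[article_id]
        if all(cosine(vec, article_vecs[k]) < threshold for k in kept):
            kept.append(article_id)
    return kept


# Hypothetical article vectors for a ranked candidate list.
article_vecs = {
    "game_a_xx_times": [0.90, 0.10, 0.00],
    "game_a_yy_sports": [0.89, 0.11, 0.01],  # near-duplicate of the first
    "player_interview": [0.50, 0.80, 0.10],
    "game_b_results": [0.30, 0.10, 0.90],
}
ranked = ["game_a_xx_times", "game_a_yy_sports", "player_interview", "game_b_results"]
result = deduplicate(ranked, article_vecs, threshold=0.95)
print(result)  # ['game_a_xx_times', 'player_interview', 'game_b_results']
```

Since the highest-ranked copy is visited first, it is the one that survives, and the later near-duplicates are skipped.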
After De-duplication
User: "I like soccer. The most recent game of my favorite team is Game A."
Recommended for you: Game A results (XX Times), Player interview (YY Sports), Game B results (ZZ Journal), Game C results (WW Soccer)
The list now shows a variety of results.
• This approach received the Young Scientist Award at NLP '16.
• We will present it in the poster session of WWW '16 in April (to appear).
4. Page View Estimation for Re-ranking
What articles should be provided to users?
Want to read: articles that are of interest to me. We can pick these up with user vectors.
Have to read: articles that are read by everyone. These articles are also important.
How do we find such articles?
• Aggregate from the history log? → Too late. Fresh articles are very important, but they have no history.
• Estimate from static features of articles? → Possible, but less accurate.
→ Use a combination of aggregation and estimation.
Estimation using Recurrent Neural Network
[Diagram, built up over several slides: static features produce the Initial state, which is followed by a chain of RNN units that take dynamic features. A fully connected (FC) layer on top of the chain outputs the estimation for the 1st hour, then the 2nd hour, then the 3rd hour. From the 3rd hour on, aggregation feedback is fed into the chain. The last slide compares a chain with no feedback and a chain with 2 feedbacks.]
KEY POINT! An RNN has the advantage that it can take a variable number of aggregation feedbacks when making a prediction.
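
The key point can be shown with a small recurrent estimator that accepts zero or more hourly feedback vectors. This is only a sketch: the plain tanh recurrence, the feature sizes, the feature values, and the random weights are all assumptions, since the slides do not specify the cell type or the features.

```python
import math
import random


def matvec(M, v):
    return [sum(m * a for m, a in zip(row, v)) for row in M]


def tanh_layer(W, b, v):
    return [math.tanh(z + bias) for z, bias in zip(matvec(W, v), b)]


def estimate_page_views(params, static_features, hourly_feedback):
    """Estimate the next hour's (scaled) page views from 0..n hours of feedback."""
    # The initial state uses static features only, so a fresh article
    # with no history still gets an estimate.
    state = tanh_layer(params["W_init"], params["b_init"], static_features)
    # Each hour of aggregated feedback advances the recurrent state.
    for dynamic_features in hourly_feedback:
        state = tanh_layer(params["W_rnn"], params["b_rnn"], dynamic_features + state)
    # Fully connected (FC) read-out of the current state.
    return sum(w * s for w, s in zip(params["w_fc"], state)) + params["b_fc"]


def init_params(static_dim, dynamic_dim, hidden, rng):
    def mat(rows, cols):
        return [[rng.uniform(-0.3, 0.3) for _ in range(cols)] for _ in range(rows)]
    return {
        "W_init": mat(hidden, static_dim), "b_init": [0.0] * hidden,
        "W_rnn": mat(hidden, dynamic_dim + hidden), "b_rnn": [0.0] * hidden,
        "w_fc": [rng.uniform(-0.3, 0.3) for _ in range(hidden)], "b_fc": 0.0,
    }


rng = random.Random(0)
params = init_params(static_dim=3, dynamic_dim=2, hidden=4, rng=rng)
static = [0.2, 0.7, 0.1]  # hypothetical static features of one article
no_feedback = estimate_page_views(params, static, [])
two_feedbacks = estimate_page_views(params, static, [[0.5, 0.1], [0.8, 0.2]])
print(no_feedback, two_feedbacks)
```

The same parameters serve both calls, so one model covers an article that was just posted and an article that already has several hours of aggregated history.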
Current Status
Already applied:
• De-duplication with article vectors
• Re-ranking using impact estimator
Lift doubled!!
Under testing:
• Matching based on user vectors
Remaining Problems
• Separation of interest and epidemic
• Continuous model updating
• Extraction of the story between articles