Slide overview
2018/2/23
Deep Learning JP: http://deeplearning.jp/seminar-2/
DL paper reading group (DL輪読会) material
DEEP LEARNING JP [DL Papers]
Generating Wikipedia by Summarizing Long Sequences (ICLR 2018)
Toru Fujino, scalab, UTokyo
http://deeplearning.jp/
Paper info: “Generating Wikipedia by Summarizing Long Sequences”, Peter J. Liu et al. (Google Brain), ICLR 2018; link: goo.gl/wSuuS9
Background: neural abstractive summarization has mostly targeted short inputs and outputs, e.g. sentence-level headline generation 1) and slightly longer article-to-summary settings 2); this paper scales both input and output lengths far beyond those
1) Rush et al. “A Neural Attention Model for Sentence Summarization”, EMNLP 2015
2) Nallapati et al. “Abstractive Text Summarization using Sequence-to-Sequence RNNs and Beyond”, CoNLL 2016
Task: generate the lead section of a Wikipedia article (e.g. https://en.wikipedia.org/wiki/Deep_learning) from the article's cited references plus web search results, treated as multi-document summarization (goo.gl/wSuuS9)
Approach: two stages
• Extractive stage: coarsely select salient paragraphs from the source documents, e.g. by tf-idf ranking against the article title (sketched below)
• Abstractive stage: a neural sequence model generates the lead section from the extracted text
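As a rough illustration of the extractive step, here is a minimal sketch of tf-idf paragraph ranking with scikit-learn; the function name `extract_paragraphs`, the toy paragraphs, and the `budget` parameter are invented for the example and are not from the paper or the slides.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def extract_paragraphs(title, paragraphs, budget=5):
    # Rank every source paragraph by tf-idf cosine similarity to the
    # article title and keep the `budget` best ones as model input.
    vec = TfidfVectorizer()
    tfidf = vec.fit_transform([title] + paragraphs)
    sims = cosine_similarity(tfidf[0], tfidf[1:]).ravel()
    best = sims.argsort()[::-1][:budget]
    return [paragraphs[i] for i in best]

# toy usage
print(extract_paragraphs(
    "Deep learning",
    ["Deep learning uses neural networks with many layers.",
     "The weather in Tokyo was mild this week.",
     "Neural networks learn hierarchical feature representations."],
    budget=2))
```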
Abstractive model: T-DMCA (Transformer Decoder with Memory-Compressed Attention), a decoder-only Transformer 3) that scales to long sequences via local attention, memory-compressed attention, and a mixture-of-experts layer 4)
3) A. Vaswani et al. “Attention is All You Need”, NIPS 2017
4) N. Shazeer et al. “Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer”, ICLR 2017
Transformer 3): a sequence-to-sequence architecture built entirely on (self-)attention, with no recurrence or convolution; the basis for the model used here (self-attention sketched below)
3) A. Vaswani et al. “Attention is All You Need”, NIPS 2017
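To make the attention computation concrete, here is a minimal NumPy sketch of single-head scaled dot-product self-attention as defined in 3); all shapes and variable names are illustrative choices, not from the slides.

```python
import numpy as np

def softmax(s):
    e = np.exp(s - s.max(axis=-1, keepdims=True))  # stabilised softmax
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Single-head scaled dot-product self-attention from 3).
    # X: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_head).
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # (seq_len, seq_len)
    return softmax(scores) @ V               # weighted sum of values

# toy usage
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 16))                 # 6 tokens, d_model = 16
Wq, Wk, Wv = [rng.normal(size=(16, 8)) for _ in range(3)]
out = self_attention(X, Wq, Wk, Wv)          # shape (6, 8)
```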
! " = [! "% , ! "' , … , ! ")" ]
! + = [! +% , ! +' , … , ! +)+ ]
Baseline: seq2seq, an RNN encoder-decoder with attention 5); at each decoding step the decoder attends over the encoder hidden states to form a context vector (see the sketch below)
5) M.-T. Luong et al. “Effective Approaches to Attention-based Neural Machine Translation”, EMNLP 2015
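A minimal sketch of this attention step, assuming Luong's simple dot score 5) and NumPy arrays for the decoder state and encoder states; names and shapes are illustrative.

```python
import numpy as np

def luong_dot_attention(h_t, enc_states):
    # Global attention with the "dot" score from 5).
    # h_t: (d,) current decoder state; enc_states: (n_e, d) encoder states.
    scores = enc_states @ h_t          # one score per source position
    w = np.exp(scores - scores.max())
    w = w / w.sum()                    # softmax over source positions
    context = w @ enc_states           # context vector c_t
    return context, w
```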
Decoder-only formulation: encoder and decoder are collapsed into a single Transformer decoder; source and target are concatenated into one sequence and the model is trained as a language model on it
Local attention: the sequence is split into fixed-size blocks, and self-attention (queries against keys K and values V) is computed independently within each block, keeping cost manageable for very long inputs (sketched below)
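A toy NumPy sketch of the blockwise idea, assuming single-head attention and an illustrative block size; this is a simplification, not the paper's exact implementation.

```python
import numpy as np

def _attend(X, Wq, Wk, Wv):
    # Plain single-head scaled dot-product attention within one block.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    s = Q @ K.T / np.sqrt(K.shape[-1])
    w = np.exp(s - s.max(axis=-1, keepdims=True))
    return (w / w.sum(axis=-1, keepdims=True)) @ V

def local_attention(X, Wq, Wk, Wv, block=4):
    # Attend only within each block, so cost is linear in sequence
    # length for a fixed block size.
    return np.concatenate([_attend(X[s:s + block], Wq, Wk, Wv)
                           for s in range(0, X.shape[0], block)])
```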
Mixture-of-Experts (MoE) layer 4): a feed-forward layer is replaced by many expert networks plus a gating network that sparsely routes each token to a few experts, adding capacity without a matching increase in compute (sketched below)
4) N. Shazeer et al. “Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer”, ICLR 2017
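A minimal sketch of the sparse gating idea from 4), assuming toy NumPy experts; real MoE layers also add noise to the gate and a load-balancing loss, which are omitted here.

```python
import numpy as np

def moe_layer(x, gate_W, experts, k=2):
    # Sparsely-gated MoE: route the token to only the top-k experts.
    # x: (d,) token vector; gate_W: (d, n_experts);
    # experts: list of callables, each mapping (d,) -> (d,).
    logits = x @ gate_W
    top = np.argsort(logits)[-k:]               # indices of the k best experts
    g = np.exp(logits[top] - logits[top].max())
    g = g / g.sum()                             # renormalised gate weights
    return sum(w * experts[i](x) for w, i in zip(g, top))

# toy usage
rng = np.random.default_rng(0)
experts = [lambda x, W=rng.normal(size=(16, 16)): np.tanh(x @ W)
           for _ in range(4)]
y = moe_layer(rng.normal(size=16), rng.normal(size=(16, 4)), experts)
```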
Memory-compressed attention: keys K and values V are shortened with a strided convolution before attention, letting information flow across the whole sequence at reduced cost; T-DMCA alternates local and memory-compressed attention layers (sketched below)
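A toy sketch of the compression idea; note the paper uses a learned strided convolution, while this sketch substitutes simple average pooling over groups of `stride` positions as a stand-in.

```python
import numpy as np

def compress(M, stride=3):
    # Stand-in for the strided convolution in T-DMCA: average-pool groups
    # of `stride` positions so keys/values shrink by that factor.
    n = (M.shape[0] // stride) * stride
    return M[:n].reshape(-1, stride, M.shape[1]).mean(axis=1)

def memory_compressed_attention(Q, K, V, stride=3):
    Kc, Vc = compress(K, stride), compress(V, stride)
    s = Q @ Kc.T / np.sqrt(Kc.shape[-1])        # (n_q, n_k // stride)
    w = np.exp(s - s.max(axis=-1, keepdims=True))
    return (w / w.sum(axis=-1, keepdims=True)) @ Vc
```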
Experiments: extractive methods and abstractive models are compared by perplexity and ROUGE-L; the T-DMCA variants outperform the seq2seq-attention and standard Transformer baselines
Summary
• Generating (the lead section of) a Wikipedia article is framed as multi-document summarization of its references and web search results
• A two-stage pipeline, extractive selection followed by an abstractive decoder-only Transformer (T-DMCA), handles very long input sequences
• Generated articles are reported to be fluent and coherent