Slide overview
2018/2/23
Deep Learning JP: http://deeplearning.jp/seminar-2/
DL paper reading group (DL輪読会) material
DEEP LEARNING JP [DL Papers]
Generating Wikipedia by Summarizing Long Sequences (ICLR 2018)
Toru Fujino, scalab, UTokyo
http://deeplearning.jp/
Paper info: “Generating Wikipedia by Summarizing Long Sequences”, Peter J. Liu et al. (Google Brain), ICLR 2018; link: goo.gl/wSuuS9
Background: neural abstractive summarization has mostly targeted short inputs and outputs, e.g. sentence-level headline generation 1) and slightly longer article-to-summary settings 2); this paper scales both input and output lengths far beyond those
1) Rush et al. “A Neural Attention Model for Sentence Summarization”, EMNLP 2015
2) Nallapati et al. “Abstractive Text Summarization using Sequence-to-Sequence RNNs and Beyond”, CoNLL 2016
Task: generate the lead section of a Wikipedia article (e.g. https://en.wikipedia.org/wiki/Deep_learning) from the article's cited references plus web search results, treated as multi-document summarization (goo.gl/wSuuS9)
Approach: two stages
• Extractive stage: coarsely select salient paragraphs from the source documents, e.g. by tf-idf ranking against the article title (sketched below)
• Abstractive stage: a neural sequence model generates the lead section from the extracted text
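As a rough illustration of the extractive step, here is a minimal sketch of tf-idf paragraph ranking with scikit-learn; the function name `extract_paragraphs`, the toy paragraphs, and the `budget` parameter are invented for the example and are not from the paper or the slides.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def extract_paragraphs(title, paragraphs, budget=5):
    # Rank every source paragraph by tf-idf cosine similarity to the
    # article title and keep the `budget` best ones as model input.
    vec = TfidfVectorizer()
    tfidf = vec.fit_transform([title] + paragraphs)
    sims = cosine_similarity(tfidf[0], tfidf[1:]).ravel()
    best = sims.argsort()[::-1][:budget]
    return [paragraphs[i] for i in best]

# toy usage
print(extract_paragraphs(
    "Deep learning",
    ["Deep learning uses neural networks with many layers.",
     "The weather in Tokyo was mild this week.",
     "Neural networks learn hierarchical feature representations."],
    budget=2))
```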
Abstractive model: T-DMCA (Transformer Decoder with Memory-Compressed Attention), a decoder-only Transformer 3) that scales to long sequences via local attention, memory-compressed attention, and a mixture-of-experts layer 4)
3) A. Vaswani et al. “Attention is All You Need”, NIPS 2017
4) N. Shazeer et al. “Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer”, ICLR 2017
Transformer 3): a sequence-to-sequence architecture built entirely on (self-)attention, with no recurrence or convolution; the basis for the model used here (self-attention sketched below)
3) A. Vaswani et al. “Attention is All You Need”, NIPS 2017
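To make the attention computation concrete, here is a minimal NumPy sketch of single-head scaled dot-product self-attention as defined in 3); all shapes and variable names are illustrative choices, not from the slides.

```python
import numpy as np

def softmax(s):
    e = np.exp(s - s.max(axis=-1, keepdims=True))  # stabilised softmax
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Single-head scaled dot-product self-attention from 3).
    # X: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_head).
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # (seq_len, seq_len)
    return softmax(scores) @ V               # weighted sum of values

# toy usage
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 16))                 # 6 tokens, d_model = 16
Wq, Wk, Wv = [rng.normal(size=(16, 8)) for _ in range(3)]
out = self_attention(X, Wq, Wk, Wv)          # shape (6, 8)
```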
! " = [! "% , ! "' , … , ! ")" ]
! + = [! +% , ! +' , … , ! +)+ ]
Baseline: seq2seq, an RNN encoder-decoder with attention 5); at each decoding step the decoder attends over the encoder hidden states to form a context vector (see the sketch below)
5) M.-T. Luong et al. “Effective Approaches to Attention-based Neural Machine Translation”, EMNLP 2015
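A minimal sketch of this attention step, assuming Luong's simple dot score 5) and NumPy arrays for the decoder state and encoder states; names and shapes are illustrative.

```python
import numpy as np

def luong_dot_attention(h_t, enc_states):
    # Global attention with the "dot" score from 5).
    # h_t: (d,) current decoder state; enc_states: (n_e, d) encoder states.
    scores = enc_states @ h_t          # one score per source position
    w = np.exp(scores - scores.max())
    w = w / w.sum()                    # softmax over source positions
    context = w @ enc_states           # context vector c_t
    return context, w
```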
Decoder-only formulation: encoder and decoder are collapsed into a single Transformer decoder; source and target are concatenated into one sequence and the model is trained as a language model on it
Local attention: the sequence is split into fixed-size blocks, and self-attention (queries against keys K and values V) is computed independently within each block, keeping cost manageable for very long inputs (sketched below)
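A toy NumPy sketch of the blockwise idea, assuming single-head attention and an illustrative block size; this is a simplification, not the paper's exact implementation.

```python
import numpy as np

def _attend(X, Wq, Wk, Wv):
    # Plain single-head scaled dot-product attention within one block.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    s = Q @ K.T / np.sqrt(K.shape[-1])
    w = np.exp(s - s.max(axis=-1, keepdims=True))
    return (w / w.sum(axis=-1, keepdims=True)) @ V

def local_attention(X, Wq, Wk, Wv, block=4):
    # Attend only within each block, so cost is linear in sequence
    # length for a fixed block size.
    return np.concatenate([_attend(X[s:s + block], Wq, Wk, Wv)
                           for s in range(0, X.shape[0], block)])
```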
Mixture-of-Experts (MoE) layer 4): a feed-forward layer is replaced by many expert networks plus a gating network that sparsely routes each token to a few experts, adding capacity without a matching increase in compute (sketched below)
4) N. Shazeer et al. “Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer”, ICLR 2017
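A minimal sketch of the sparse gating idea from 4), assuming toy NumPy experts; real MoE layers also add noise to the gate and a load-balancing loss, which are omitted here.

```python
import numpy as np

def moe_layer(x, gate_W, experts, k=2):
    # Sparsely-gated MoE: route the token to only the top-k experts.
    # x: (d,) token vector; gate_W: (d, n_experts);
    # experts: list of callables, each mapping (d,) -> (d,).
    logits = x @ gate_W
    top = np.argsort(logits)[-k:]               # indices of the k best experts
    g = np.exp(logits[top] - logits[top].max())
    g = g / g.sum()                             # renormalised gate weights
    return sum(w * experts[i](x) for w, i in zip(g, top))

# toy usage
rng = np.random.default_rng(0)
experts = [lambda x, W=rng.normal(size=(16, 16)): np.tanh(x @ W)
           for _ in range(4)]
y = moe_layer(rng.normal(size=16), rng.normal(size=(16, 4)), experts)
```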
Memory-compressed attention: keys K and values V are shortened with a strided convolution before attention, letting information flow across the whole sequence at reduced cost; T-DMCA alternates local and memory-compressed attention layers (sketched below)
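A toy sketch of the compression idea; note the paper uses a learned strided convolution, while this sketch substitutes simple average pooling over groups of `stride` positions as a stand-in.

```python
import numpy as np

def compress(M, stride=3):
    # Stand-in for the strided convolution in T-DMCA: average-pool groups
    # of `stride` positions so keys/values shrink by that factor.
    n = (M.shape[0] // stride) * stride
    return M[:n].reshape(-1, stride, M.shape[1]).mean(axis=1)

def memory_compressed_attention(Q, K, V, stride=3):
    Kc, Vc = compress(K, stride), compress(V, stride)
    s = Q @ Kc.T / np.sqrt(Kc.shape[-1])        # (n_q, n_k // stride)
    w = np.exp(s - s.max(axis=-1, keepdims=True))
    return (w / w.sum(axis=-1, keepdims=True)) @ Vc
```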
Experiments: extractive methods and abstractive models are compared by perplexity and ROUGE-L; the T-DMCA variants outperform the seq2seq-attention and standard Transformer baselines
Summary
• Generating (the lead section of) a Wikipedia article is framed as multi-document summarization of its references and web search results
• A two-stage pipeline, extractive selection followed by an abstractive decoder-only Transformer (T-DMCA), handles very long input sequences
• Generated articles are reported to be fluent and coherent