A System for Retrieving Video Game Music

November 25, 24

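Kitahara Laboratory, Department of Information Science, College of Humanities and Sciences, Nihon University. Under the motto "Technology Makes Music More Fun", the lab researches and develops technologies that advance music and other forms of entertainment.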

Text of each page
1.

A System for Retrieving Video Game Music
Ryusei Hayashi, Tetsuro Kitahara (Nihon University)

2.

Intro > Background
● Video game music (VGM) expresses game features and scene features at the same time
● Examples from Okami (CAPCOM CO., LTD.) and Undertale (Toby Fox)
  ○ Prologue scene
    ■ Okami: Prologue / Masami Ueda
    ■ Undertale: Once Upon a Time / Toby Fox
  ○ Last boss scene
    ■ Okami: Rising Sun / Rei Kondoh
    ■ Undertale: Hopes And Dreams / Toby Fox

3.

Intro > Background
● Video game music (VGM) expresses game features and scene features at the same time
● Game feature
  ○ Japanese style
    ■ Prologue / Masami Ueda
    ■ Rising Sun / Rei Kondoh
  ○ Chiptune style
    ■ Once Upon a Time / Toby Fox
    ■ Hopes And Dreams / Toby Fox

4.

Intro > Background
● Video game music (VGM) expresses game features and scene features at the same time
● Scene feature
  ○ Low tension
    ■ Prologue / Masami Ueda
    ■ Once Upon a Time / Toby Fox
  ○ High tension
    ■ Rising Sun / Rei Kondoh
    ■ Hopes And Dreams / Toby Fox

5.

Intro > Background
● It is difficult to retrieve VGM that accounts for game features and scene features at the same time
  ○ "Let's develop a Yakuza game!"

6.

Intro > Background
● It is difficult to retrieve VGM that accounts for game features and scene features at the same time
  ○ "Let's develop a Yakuza game!"
  ○ "Found VGM for the Yakuza fight scene!"

7.

Intro > Background
● It is difficult to retrieve VGM that accounts for game features and scene features at the same time
  ○ "Let's develop a Yakuza game!"
  ○ "Found VGM for the Yakuza fight scene!"
  ○ "Could not find VGM for the Yakuza love scene..."

8.

EgGMAn - Engine of Game Music Analysis

9.

Intro > Purpose
● Purpose: Retrieve VGM that accounts for game features and scene features at the same time
● Premise: One VGM has already been decided for the game under development
● Conditions
  ○ Maintain the game feature of the decided VGM
  ○ Change the scene feature of the decided VGM
[Figure labels: Game Feature, Change 😊; Scene Feature, Maintain 😊]

10.

Intro > Problem
● Conditions
  ○ Maintain the game feature of the decided VGM to be included in the game
  ○ Change the scene feature of the decided VGM to be included in the game
  → The two conditions partially contradict each other
[Figure labels: Game Feature, Maintain 🙁; Scene Feature, Change 🙁]

11.

Intro > Solution
● Vectorize VGM with a VAE
● Assumptions
  ○ VGM vectors form a cluster for each scene
  ○ The difference between a VGM vector and the center of its scene is constant across scenes
[Figure: clusters for Scene 1, Scene 2, ..., Scene M; the cluster of scene $i$ contains the vectors $v_i^1, v_i^2, \dots, v_i^n$ around its center $c_i$]

12.

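Intro > Solution
● Vectorize VGM with a VAE
● Assumptions
  ○ VGM vectors form a cluster for each scene
  ○ The difference between a VGM vector and the center of its scene is constant across scenes
[Figure: clusters for Scene 1, Scene 2, ..., Scene M; the cluster of scene $i$ contains the vectors $v_i^1, v_i^2, \dots, v_i^n$ around its center $c_i$; the vectors labeled Game 1 are highlighted]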

13.

Intro > Solution
● Vectorize VGM with a VAE
● Assumptions
  ○ VGM vectors form a cluster for each scene
  ○ The difference between a VGM vector and the center of its scene is constant across scenes
[Figure: clusters for Scene 1, Scene 2, ..., Scene M; the cluster of scene $i$ contains the vectors $v_i^1, v_i^2, \dots, v_i^n$ around its center $c_i$]

14.

Method > Input/Output
● Input
  ○ Source Music
    ■ VGM already chosen for the game under development
  ○ Source Scene
    ■ Scene in which the Source Music is used
  ○ Target Scene
    ■ Another scene that needs VGM
● Output
  ○ Target Music
    ■ VGM to attach to the Target Scene
[Figure: Source Music (Yakuza fight VGM), Source Scene (Fight) and Target Scene (Love) go into EgGMAn, which outputs the Target Music (Yakuza love VGM)]

15.

Method > Vectorization
● Source Music
  ○ Convert the Source Music to a vector $z$ with the VAE
[Figure: Source Music → VAE → $z$]

16.

Method > Vectorization
● Source Music
  ○ Convert the Source Music to a vector $z$ with the VAE
● Source Scene
  ○ Create the set $P$ of VGM used in the Source Scene
  ○ Convert the set $P$ to the set of vectors $P^z$ with the VAE
  ○ Compute the center $p_c$ of the set $P^z$
[Figure: $z$ and the vectors $p_1^z, p_2^z, \dots, p_m^z$ around their center $p_c$]

17.

Method > Vectorization
● Source Music
  ○ Convert the Source Music to a vector $z$ with the VAE
● Source Scene
  ○ Create the set $P$ of VGM used in the Source Scene
  ○ Convert the set $P$ to the set of vectors $P^z$ with the VAE
  ○ Compute the center $p_c$ of the set $P^z$
● Target Scene
  ○ Create the set $Q$ of VGM used in the Target Scene
  ○ Convert the set $Q$ to the set of vectors $Q^z$ with the VAE
  ○ Compute the center $q_c$ of the set $Q^z$
[Figure: $z$ and $p_c$, plus the vectors $q_1^z, q_2^z, \dots, q_n^z$ around their center $q_c$]

18.

Method > Compute Vector
● Target Music
  ○ Predict the Target Music vector $z'$
  ○ Assumption: $v_i^k - c_i = v_j^k - c_j$
[Figure: $z$, $p_c$, $q_c$]

19.

Method > Compute Vector
● Target Music
  ○ Predict the Target Music vector $z'$
  ○ Assumption: $v_i^k - c_i = v_j^k - c_j$
  ○ Applied to the Source and Target Scenes: $z - p_c = z' - q_c$
[Figure: $z$, $z'$, $p_c$, $q_c$]

20.

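Method > Compute Vector
● Target Music
  ○ Predict the Target Music vector $z'$
  ○ Assumption: $v_i^k - c_i = v_j^k - c_j$
  ○ Applied to the Source and Target Scenes: $z - p_c = z' - q_c$
  ○ Solved for the Target Music vector: $z' = z + q_c - p_c$
[Figure: $z$, $z'$, $p_c$, $q_c$]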
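The translation z' = z + qc - pc can be sketched in a few lines. This is a minimal illustration, not the EgGMAn implementation: plain Python lists stand in for the VAE vectors, the 2-D toy values are made up, and the helper names `centroid` and `predict_target_vector` are hypothetical.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def predict_target_vector(z, source_scene_vectors, target_scene_vectors):
    """Shift z by the offset between the two scene centers: z' = z + q_c - p_c."""
    p_c = centroid(source_scene_vectors)  # center of the Source Scene set
    q_c = centroid(target_scene_vectors)  # center of the Target Scene set
    return [z_i + q_i - p_i for z_i, q_i, p_i in zip(z, q_c, p_c)]


# Toy 2-D vectors (assumed values; real vectors would come from the VAE encoder).
source_scene = [[0.0, 0.0], [2.0, 0.0]]  # center p_c = [1.0, 0.0]
target_scene = [[4.0, 4.0], [6.0, 4.0]]  # center q_c = [5.0, 4.0]
z = [1.5, 0.5]

z_prime = predict_target_vector(z, source_scene, target_scene)
print(z_prime)
```

The offset of z from the source center (0.5 in each dimension here) is carried over unchanged to the target center, which is exactly what the constant-difference assumption states.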

21.

Method > Retrieval
● Target Music
  ○ Predict the Target Music vector $z'$
  ○ Compute the distance from the vector $z'$ to each VGM vector
  ○ Sort the VGM by distance
[Figure: $z'$ and the vectors $q_1^z, q_2^z, \dots, q_n^z$, sorted by distance from $z'$]
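The retrieval step amounts to a nearest-neighbor ranking. The sketch below assumes Euclidean distance, which the slides do not specify, and uses a hypothetical `rank_by_distance` helper with made-up track IDs and vectors.

```python
import math


def rank_by_distance(z_prime, candidates):
    """Return candidate IDs ordered from nearest to farthest from z_prime.

    `candidates` maps a track ID to its vector. Euclidean distance is an
    assumption; the slides only say "distance".
    """
    return sorted(candidates, key=lambda track_id: math.dist(z_prime, candidates[track_id]))


# Toy candidates (assumed values standing in for VAE vectors).
candidates = {
    "track_a": [5.0, 5.0],
    "track_b": [0.0, 0.0],
    "track_c": [7.0, 4.5],
}

ranking = rank_by_distance([5.5, 4.5], candidates)
print(ranking)
```

The first entries of the ranking are the candidates to present as Target Music.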

22.

Method > Preprocessing
● Pass a spectrogram of the VGM to the VAE
● Method
  ○ Extract the 10 to 30 second segment of the VGM
  ○ Detect beats in the extracted segment
  ○ Select beats
    ■ The first beat
    ■ The farthest beat less than 10 seconds after it
  ○ Extract the segment bounded by the selected beats
  ○ Convert the extracted segment to a spectrogram
[Figure: timeline marked at 0, 10 and 30 seconds with a selected segment of up to 10 seconds; spectrogram of size Hz: 1024, Time: 256]
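The beat-selection step can be sketched as follows. The function assumes that beat times (in seconds) have already been produced by some beat tracker, which the slides do not name; `select_beat_segment` and the sample beat times are hypothetical.

```python
def select_beat_segment(beat_times, max_length=10.0):
    """Pick the first beat and the farthest beat less than max_length seconds after it.

    Returns (start, end) in seconds. If no later beat qualifies, start == end.
    """
    if not beat_times:
        raise ValueError("no beats detected")
    beats = sorted(beat_times)
    start = beats[0]
    # Beats are sorted, so the last one inside the window is the farthest.
    in_window = [t for t in beats if t - start < max_length]
    return start, in_window[-1]


# Assumed beat times inside the 10 to 30 second segment of a track.
beats = [10.5, 12.0, 16.0, 20.0, 21.0, 25.0]
start, end = select_beat_segment(beats)
print(start, end)
```

Cutting on beat boundaries keeps every training segment aligned to the music's pulse while bounding its length at 10 seconds.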

23.

Method > Dataset
● Tag Table
  ○ Created from Audiostock
  ○ Link and save each ID with its Tags
  ○ Link and save each ID with its MP3
● Scene Set
  ○ Collect about 170 Scenes frequently used in games, as words
  ○ Time, Weather, and 7 other types

24.

Method > Dataset > Similarity Table
● Method
  ○ Store the Scenes in row 1 and the Tags in column 1
  ○ Vectorize the Tags and Scenes with Word2vec
  ○ Compute the cosine similarity between each Tag and each Scene
● Similarity
  ○ Store as the (i, j)-th entry the similarity between the i-th Scene and the j-th Tag
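A minimal sketch of building such a similarity table is shown below. The 2-D vectors are made-up stand-ins for Word2vec embeddings, and the scene and tag words, as well as the function names, are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def build_similarity_table(scene_vectors, tag_vectors):
    """Return table[scene][tag] = cosine similarity of the two word vectors."""
    return {
        scene: {tag: cosine_similarity(s_vec, t_vec) for tag, t_vec in tag_vectors.items()}
        for scene, s_vec in scene_vectors.items()
    }


# Assumed toy embeddings; real ones would come from a Word2vec model.
scene_vectors = {"battle": [1.0, 0.0], "night": [0.0, 1.0]}
tag_vectors = {"fight": [2.0, 0.0], "calm": [0.0, 3.0]}

table = build_similarity_table(scene_vectors, tag_vectors)
print(table["battle"]["fight"], table["battle"]["calm"])
```

Because cosine similarity ignores vector length, "fight" matches "battle" perfectly here even though its vector is twice as long.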

25.

Method > Dataset > Scene Table
● Replace the Tags in the Tag Table with Scenes
● Method
  ○ Extract a Tag from the Tag Table
  ○ Look up the similarities of the extracted Tag in the Similarity Table
  ○ Extract the Scenes whose similarity is greater than a threshold
  ○ Store the extracted Scenes in the Scene Table
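The tag-to-scene replacement can be sketched as a threshold filter over the similarity table. The threshold of 0.5, the table contents, and the function names are assumptions for illustration; the slides do not give the actual threshold.

```python
def tags_to_scenes(tags, similarity_table, threshold):
    """Return the scenes whose similarity to any of the given tags exceeds the threshold."""
    return [
        scene
        for scene, row in similarity_table.items()
        if any(row.get(tag, 0.0) > threshold for tag in tags)
    ]


def build_scene_table(tag_table, similarity_table, threshold=0.5):
    """Map each track ID to scenes instead of tags (threshold value is an assumption)."""
    return {
        track_id: tags_to_scenes(tags, similarity_table, threshold)
        for track_id, tags in tag_table.items()
    }


# Assumed toy data: similarity_table[scene][tag] and tag_table[track_id].
similarity_table = {
    "battle": {"fight": 0.9, "calm": 0.1},
    "night": {"fight": 0.2, "calm": 0.7},
}
tag_table = {"id1": ["fight"], "id2": ["calm", "fight"]}

scene_table = build_scene_table(tag_table, similarity_table)
print(scene_table)
```

A track with several tags can land in several scenes, which is what lets the scene sets P and Q be built from tagged stock music.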

26.

Method > VAE (Variational Auto-Encoder)
● Structure
  ○ Encoder: Converts data to a vector
  ○ Decoder: Reconstructs data from the vector
● Loss Function
  ○ MSE: Error between the input data and the reconstructed data
  ○ KLD: Error between the vector's distribution and the normal distribution
[Figure: Input Data → Encoder → $z$ → Decoder → Reconstructed Data; MSE compares the input and reconstructed data; KLD compares $z$ with the normal distribution]

27.

Method > VAE > Structure > Encoder
● Converts data to a vector
● Implemented by convolution (→)
[Figure: input spectrogram of size Hz: 1024, Time: 256; tensor shapes shown, in extraction order: 1024*256*1, 1024*256*32, 1024*16*32, 1024*1*32, 1*256*560, 1*16*560, 1*1*560, FCN 560, Sampling, 32 (Vector)]

28.

Method > VAE > Structure > Decoder
● Reconstructs data from the vector
● Implemented by deconvolution (→)
[Figure: output spectrogram of size Hz: 1024, Time: 256; tensor shapes shown, in extraction order: 1024*1*32, 1024*16*32, 1024*256*32, 1*1*560, 1*16*560, 1*256*560, 1024*256*1, FCN 560, FCN, 32 (Vector)]

29.

Method > VAE > Loss Function > MSE (Mean Squared Error)
● Error between the input data and the reconstructed data
● Ensures that the vector reflects the data
[Figure: MSE computed between Input Data and Reconstructed Data]

30.

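Method > VAE > Loss Function > KLD (Kullback-Leibler Divergence)
● Error between the vector's distribution and the normal distribution
● Ensures the continuity of the vector space
[Figure: Encoder → $\mu$, $\sigma$ → Sampling → Decoder; KLD computed between ($\mu$, $\sigma$) and the normal distribution]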
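The two loss terms can be written out in a short, framework-free sketch. It assumes a diagonal Gaussian encoder output compared against a standard normal prior and an unweighted sum of the two terms (the `kld_weight` parameter is hypothetical); the slides only say "normal distribution" and do not state the weighting.

```python
import math
import random


def mse(x, x_hat):
    """Mean squared error between input and reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def kld_to_standard_normal(mu, sigma):
    """KL divergence from N(mu, diag(sigma^2)) to N(0, I), summed over dimensions."""
    return sum(0.5 * (s * s + m * m - 1.0) - math.log(s) for m, s in zip(mu, sigma))


def sample_latent(mu, sigma, rng=random):
    """Reparameterized sample: z = mu + sigma * eps with eps drawn from N(0, 1)."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


def vae_loss(x, x_hat, mu, sigma, kld_weight=1.0):
    """Reconstruction term plus weighted KLD term (the weighting is an assumption)."""
    return mse(x, x_hat) + kld_weight * kld_to_standard_normal(mu, sigma)


# Toy values: the encoder output already matches the prior, so only MSE contributes.
loss = vae_loss(x=[1.0, 2.0], x_hat=[1.0, 4.0], mu=[0.0, 0.0], sigma=[1.0, 1.0])
print(loss)
```

The KLD term is zero only when the encoder output equals the prior, so it pulls all encoded tracks toward a shared, continuous region of the vector space, while the MSE term keeps them distinguishable.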

31.

Experiment > Preliminary Experiment
● Purpose: Confirm whether the VAE can be trained on VGM
● Prepare
  ○ Randomly extract 5120 MP3s from the Tag Table
  ○ Split the extracted MP3s 3:1 into training and validation data
● Execute
  ○ Train the VAE with the training data
  ○ Visualize the loss function for the training and validation data
  ○ Reconstruct the training and validation data with the trained VAE
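The 3:1 split of 5120 tracks works out to 3840 training and 1280 validation items. A minimal sketch, assuming a seeded shuffle and placeholder track IDs (both hypothetical, as is the helper name):

```python
import random


def split_train_validation(items, train_parts=3, validation_parts=1, seed=0):
    """Shuffle a copy of items and split it by the given ratio."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = len(shuffled) * train_parts // (train_parts + validation_parts)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder IDs standing in for the 5120 MP3s drawn from the Tag Table.
track_ids = [f"track_{i:04d}" for i in range(5120)]
train_ids, validation_ids = split_train_validation(track_ids)
print(len(train_ids), len(validation_ids))
```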

32.

Experiment > Preliminary Experiment
● Loss Function
  ○ Training data: Continues to decrease
  ○ Validation data: Stops decreasing
  ○ The VAE overfit the training data
● Reconstruction
  ○ Another way to account for tempo is needed

33.

Experiment > Operational Experiment
● Purpose: Confirm the practicality of EgGMAn in game development
● Prepare
  ○ Develop the front end of EgGMAn
  ○ Distribute EgGMAn and the Survey
● Execute
  ○ Ask participants to use EgGMAn during development
  ○ Ask participants to fill out the Survey after development
● Survey
  ○ How predictable was the retrieval?
  ○ How suitable was the retrieval for the game?
  ○ How suitable was the retrieval for the scene?

34.

Experiment > Operational Experiment
● Global Game Jam 2024
  ○ Information
    ■ Place: Tokyo Univ. of Tech.
    ■ Term: 2 days
    ■ Team: 5 teams (30 people)
  ○ Result
    ■ Participants: 5 people
    ■ Good reviews: 3 people
[Figure: survey responses on a five-point scale (Not suitable at all, Not suitable, Even, Suitable, Very suitable) for the three questions: How predictable was the retrieval? How suitable was the retrieval for the game? How suitable was the retrieval for the scene?]

35.

Outro > Conclusion
● Intro
  ○ Purpose: Retrieve VGM that accounts for game features and scene features at the same time
  ○ Premise: One VGM to be included in the game has already been decided
● Method
  ○ Convert the Source Music to a vector $z$ with the VAE
  ○ Compute the centers $p_c$ and $q_c$ of the Source Scene and the Target Scene
  ○ Compute the Target Music vector with $z' = z + q_c - p_c$
● Experiment
  ○ Preliminary Experiment: Confirm whether the VAE can be trained on VGM
  ○ Operational Experiment: Confirm the practicality of EgGMAn in game development

36.

Outro > Future
● Objective Evaluation
  ○ Evaluate the validity of the assumptions
    ■ VGM vectors form a cluster for each scene
    ■ The difference between a VGM vector and the center of its scene is constant across scenes
  ○ Evaluate the retrieval accuracy
● Subjective Evaluation
  ○ Evaluate the retrieval accuracy

37.

Outro > Acknowledgment
● Thank you for your valuable advice and feedback
  ○ Prof. Shigeyuki Hirai / Kyoto Sangyo Univ.
  ○ Mr. Kenji Kojima / CAPCOM CO., LTD.
  ○ Mr. Tomoya Kishi / CAPCOM CO., LTD.
  ○ Mr. Takaaki Ichijo / HEAD-HIGH CO., LTD.
  ○ Dr. Akinori Ito / Tokyo Univ. of Tech.
  ○ Prof. Koji Mikami / Tokyo Univ. of Tech.

38.

A System for Retrieving Video Game Music
Ryusei Hayashi, Tetsuro Kitahara (Nihon University)