October 11, 24
Slide overview
This document presents a method for generating walking bass lines using hidden Markov models. It proposes three methods for defining hidden states and evaluates them objectively and subjectively. Method 3, which considers pitch class and metrical position, performed best by generating bass lines that were harmonically congruent, sequentially smooth, and preferred by an expert bassist over the ground truth. The approach is limited to short, simple chord progressions; more work is needed to handle longer, more complex progressions and additional musical factors.
Kitahara Laboratory, Department of Information Science, College of Humanities and Sciences, Nihon University. Under the motto "Technology Makes Music More Fun", we research and develop technologies that help advance entertainment, music in particular.
Generating Walking Bass Lines with HMM
Ayumi Shiga and Tetsuro Kitahara
Nihon University, Japan
Twitter: @tetsurokitahara
Motivation
● Creating bass lines is a mandatory skill for jazz bassists
- Why? Bass lines are not written in the scores of jazz music
● An example of a jazz score: Take The "A" Train
● An example of a bass line (intervals from the root note of the chord): P1 M3 P5 M6 P1 P5 P1 m2 P1 P1 M3 M3 P1 M7 m7 M7
Motivation
● But creating bass lines is not easy for novice players
- Why? They have to consider both simultaneity (harmonic congruency) and sequentiality (smooth succession)
● We aim at generating a bass line for a given chord progression
- Input: a chord progression (e.g., C C D D)
- Output: a bass line
Related work
There have been many works on harmonization, but only a few have dealt with generating bass lines:
● Dias et al., SMC 2013 - contour-based approach (the user specifies the direction of note transitions, etc.)
● Kunimatsu et al., PDPTA 2015 - using genetic programming
● Ramalho et al., JNMR 1999 - memorise & reuse existing bass-line fragments
● Piedra, B.Sc. thesis @ UPF, 2015 - developed a bass-line generator for EDM
No data-driven probabilistic approach for walking bass lines
Problem statement
Input: a chord progression (e.g., C C D D)
- 4 measures (4/4)
- 1 chord for each measure
- Chords: {C, C#, D, ..., B} × {maj, min}
- Key: C major
- No melody data
Output:
- A sequence of quarter notes
- No rests
Formulation with HMM
● Input: a chord progression (e.g., C C D D), transformed into a sequence of quarter-note-level chord labels (C C C C C C C C D D D D D D D D)
● Observation: the quarter-note-level chord labels, emitted from the hidden states
● Hidden state: the bass note at each quarter note, linked by state transitions
3 methods for defining hidden states
● Method 1: The simplest, octave-ignored method
● Method 2: The simple but non-octave-ignored method
● Method 3: The octave-ignored but metrical-position-considered method
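The formulation above can be illustrated with a small decoding sketch. The slides do not say how the most likely bass line is decoded, so Viterbi decoding is an assumption here, as are all names (`expand_chords`, `viterbi`, `CHORD_TONES`) and the toy probabilities, which are hand-made for illustration and not learnt from the data. Hidden states follow Method 1 (pitch classes 0 to 11).

```python
import math

BEATS_PER_BAR = 4                      # 4/4 time, one quarter note per beat
PITCH_CLASSES = list(range(12))        # Method 1 hidden states
CHORD_TONES = {"C": {0, 4, 7}, "D": {2, 6, 9}}  # toy vocabulary (major triads)


def expand_chords(chords, beats_per_bar=BEATS_PER_BAR):
    """Repeat each bar-level chord label once per quarter note."""
    return [chord for chord in chords for _ in range(beats_per_bar)]


def normalise_log(weights):
    """Turn a dict of positive weights into log-probabilities."""
    total = sum(weights.values())
    return {key: math.log(value / total) for key, value in weights.items()}


def circular_distance(a, b):
    """Semitone distance between two pitch classes, ignoring octaves."""
    d = abs(a - b) % 12
    return min(d, 12 - d)


def viterbi(observations, states, log_init, log_trans, log_emit):
    """Return the most probable hidden-state sequence for the observations."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_init[s] + log_emit[s][observations[0]] for s in states}]
    back = []
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Toy parameters (assumptions): uniform start, stepwise motion favoured,
# chord tones more likely to "emit" the chord they belong to.
log_init = normalise_log({s: 1.0 for s in PITCH_CLASSES})
log_trans = {
    p: normalise_log(
        {s: 4.0 if circular_distance(p, s) in (1, 2) else 1.0 for s in PITCH_CLASSES}
    )
    for p in PITCH_CLASSES
}
log_emit = {
    s: normalise_log(
        {chord: 3.0 if s in tones else 1.0 for chord, tones in CHORD_TONES.items()}
    )
    for s in PITCH_CLASSES
}

observations = expand_chords(["C", "C", "D", "D"])
bass_line = viterbi(observations, PITCH_CLASSES, log_init, log_trans, log_emit)
print(bass_line)  # 16 pitch classes, one per quarter note
```

With parameters estimated from real bass lines, the same decoding step would apply unchanged; only the three probability tables differ.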
3 methods for defining hidden states
● Method 1: The simplest, octave-ignored method
- Hidden states are 0 to 11, each corresponding to a pitch class
- Example: 9 4 5 6 7 8 9 11 0 7 7 0 1 11 10 1
● Method 2: The simple but non-octave-ignored method
- Hidden states are note numbers
- Example: 33 28 29 30 31 32 33 35 36 31 31 36 37 34 33 36
3 methods for defining hidden states
● Method 3: The octave-ignored but metrical-position-considered method
- Pitch class: 9 4 5 6 7 8 9 11 0 7 7 0 1 11 10 1
- Metrical position: 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3
- Hidden state = pitch class + 12 * metrical position: 9 16 29 42 7 20 33 47 0 19 31 36 1 23 34 37
● Why? Bass note selection depends on its metrical position
- To enable the distribution of emission probability to be learnt separately for each metrical position
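The three state definitions can be written down directly. The sketch below assumes that "note numbers" means MIDI-style note numbers (so note number mod 12 gives the pitch class, with C = 0); the function names are illustrative only. The example reproduces the Method 3 hidden-state sequence shown on the slide.

```python
N_PITCH_CLASSES = 12
BEATS_PER_BAR = 4  # metrical positions 0..3 in 4/4


def method1_state(note_number):
    """Method 1: pitch class only (octave ignored), 12 states."""
    return note_number % N_PITCH_CLASSES


def method2_state(note_number):
    """Method 2: the note number itself (octave kept)."""
    return note_number


def method3_state(note_number, beat_index):
    """Method 3: pitch class + 12 * metrical position, 48 states."""
    metrical_position = beat_index % BEATS_PER_BAR
    return note_number % N_PITCH_CLASSES + N_PITCH_CLASSES * metrical_position


def decode_method3(state):
    """Invert Method 3: return (pitch class, metrical position)."""
    return state % N_PITCH_CLASSES, state // N_PITCH_CLASSES


# Pitch classes of the slide's example, one per quarter note over four bars.
pitch_classes = [9, 4, 5, 6, 7, 8, 9, 11, 0, 7, 7, 0, 1, 11, 10, 1]
states = [method3_state(pc, beat) for beat, pc in enumerate(pitch_classes)]
print(states)
# [9, 16, 29, 42, 7, 20, 33, 47, 0, 19, 31, 36, 1, 23, 34, 37]
```

Because each metrical position gets its own block of 12 states, the state space grows from 12 to 48, which is what allows separate emission distributions per beat.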
Data
● Book: "Jazz Bass Running 104 vol.1"
● Web site: "Projazz Lab"
We collected 206 four-bar bass lines with chord labels (103 for training and the other 103 for testing).
Examples 1-3 (audio playback)
Objective evaluation
Criteria (for some, the higher, the better; for others, the closer to the ground truth, the better):
1: Rate of concordance with the ground truth
2: Rate of the root note of the chord at each bar
3: Rate of the root note of the chord at the first beat of each bar
4: Rate of the chord note at the first beat of each bar
5: Rate of dissonant notes (m2 from any chord notes)
6: Rate of flat motions (a pitch to the same pitch)
7: Rate of conjunct motions (m2 or M2)
8: Rate of disjunct motions (more than M2)
9: Number of pitch classes appearing in the bass line
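Several of these criteria reduce to simple counting. The sketch below covers criteria 1 to 4 and 6 to 9 under stated assumptions that the slides do not confirm: bass lines are lists of note numbers (one per quarter note), chords are `(root pitch class, quality)` pairs with plain triads, concordance is compared on pitch classes, and motions are measured in absolute semitones between successive notes. All names are hypothetical.

```python
CHORD_INTERVALS = {"maj": (0, 4, 7), "min": (0, 3, 7)}  # assumed triads


def chord_pitch_classes(root_pc, quality):
    """Pitch classes of a triad built on root_pc."""
    return {(root_pc + interval) % 12 for interval in CHORD_INTERVALS[quality]}


def rate(flags):
    """Fraction of true values in an iterable of booleans."""
    flags = list(flags)
    return sum(flags) / len(flags) if flags else 0.0


def objective_criteria(notes, chords, reference, beats_per_bar=4):
    """Compute counting-based criteria for one bass line.

    notes, reference: note numbers, one per quarter note.
    chords: one (root_pc, quality) pair per bar.
    """
    beat_chords = [chord for chord in chords for _ in range(beats_per_bar)]
    first_beats = range(0, len(notes), beats_per_bar)
    steps = [abs(b - a) for a, b in zip(notes, notes[1:])]
    return {
        "concordance": rate(n % 12 == r % 12 for n, r in zip(notes, reference)),
        "root": rate(n % 12 == c[0] for n, c in zip(notes, beat_chords)),
        "root_on_first_beat": rate(
            notes[i] % 12 == beat_chords[i][0] for i in first_beats
        ),
        "chord_note_on_first_beat": rate(
            notes[i] % 12 in chord_pitch_classes(*beat_chords[i]) for i in first_beats
        ),
        "flat_motion": rate(s == 0 for s in steps),           # same pitch
        "conjunct_motion": rate(s in (1, 2) for s in steps),  # m2 or M2
        "disjunct_motion": rate(s > 2 for s in steps),        # more than M2
        "num_pitch_classes": len({n % 12 for n in notes}),
    }


# Made-up bass line over C C D D, for illustration only.
chords = [(0, "maj"), (0, "maj"), (2, "maj"), (2, "maj")]
notes = [36, 40, 43, 45, 36, 38, 40, 41, 38, 42, 45, 47, 38, 38, 42, 45]
result = objective_criteria(notes, chords, reference=notes)
print(result)
```

Criterion 5 (dissonant notes) is left out because the slide does not say whether a note that is a minor second from one chord note but is itself a chord note should count.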
Objective evaluation: Method 3 is the best for 7 criteria. [Bar chart: criteria 1-9 for Method 1, Method 2, Method 3, and the ground truth; axes: rate (%) and count]
Objective evaluation: Method 3 avoids monotonous bass lines. [Same chart; highlighted: rate of root notes, rate of flat motions, number of appearing pitch classes]
Objective evaluation: Method 3 generates bass lines congruent with the chord progression. [Same chart; highlighted: rate of root notes at the 1st beat, rate of chord notes at the 1st beat]
Objective evaluation: Method 3 generates sequentially smooth bass lines. [Same chart; highlighted: rate of conjunct motions]
Subjective evaluation
We asked an expert bassist with 25 years of experience to evaluate the generated bass lines.
1: Number of musically inappropriate notes (the lower, the better)
2: Overall quality (1-5) (the higher, the better)
3: Overall smoothness (1-5) (the higher, the better)
4: Congruency with the chord progression (1-5) (the higher, the better)
To reduce the evaluator's burden, we used 50 bass lines selected at random.
Subjective evaluation: Method 3 is the best for all criteria (chart annotation: better than the ground truth). [Bar chart: number of inappropriate notes (count), overall quality, overall smoothness, and congruency with chords for Method 1, Method 2, Method 3, and the ground truth]
Conclusion
● Developed a method for generating walking bass lines
● Proposed 3 methods for defining hidden states
- Method 1: simplest, octave-ignored
- Method 2: simple but non-octave-ignored
- Method 3: octave-ignored but metrical-position-considered
● Obj. & subj. evaluations show Method 3 is the best
● Remaining issues:
- Longer chord progressions, more kinds of chords
- Adding ornaments, considering melodies
- Adaptation to the user's performing skills
- And many more