After pitch estimation, we have the pitch information of a humming signal and can treat each note as a state. Because the notes are consecutive, we can use the pitch sequence to construct a transition model of a piece of music. Before discussing the hidden Markov model (HMM), the Markov model should be introduced first. The Markov model for melody matching is a probabilistic transition graph consisting of a series of specific states characterized by pitch. Each state has a transition probability to the other states, so the model represents a process going through a sequence of discrete states. Three basic elements form a Markov model:

- a set of states, each characterized by a pitch (or pitch difference);
- the transition probabilities between states;
- an initial state probability distribution.
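As a hypothetical illustration (the states and all numbers below are invented for this sketch, not taken from the original figure), a three-state model over the pitch differences {-1, 0, +2} could have the transition matrix

$$ A = \begin{pmatrix} 0.1 & 0.3 & 0.4 \\ 0.6 & 0.2 & 0.3 \\ 0.3 & 0.5 & 0.3 \end{pmatrix} $$

where the entry in row $i$, column $j$ is the probability of moving from the $j$-th state to the $i$-th state, so each column sums to 1. This column convention (column = previous state, row = next state) is the same one used in the tables and code below.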

Next, consider the hidden Markov model for melody matching. In the hidden Markov model of a song, no zero-probability transition exists, because every transition might happen. In this approach, every target song in the database, as well as the query, is treated as an observation sequence


$$ \bar{O} = (o_1, o_2, \dots, o_T) $$

where each $o_i$ is characterized by pitch. First, we construct the hidden Markov model of every song in the database. The probability of observation $o_i$ can then be estimated by counting the number of times $o_i$ follows a state $s$ and dividing by the total number of transitions out of $s$:

$$ P(o_i \mid s) = \frac{C(s \to o_i)}{\sum_{s'} C(s \to s')} $$

where $C(s \to o_i)$ is the number of times the transition from $s$ to $o_i$ is observed in the target song.

For the observations that do not occur in the training data, we need to assign a minimal probability $P_m$, since we cannot guarantee that they will never occur. The last step of building the hidden Markov model is to renormalize the transition probabilities. To illustrate, we take the example in Fig. 3.1 and assume that the set of all possible states is {α, β, γ, ω, ϵ}. The resulting transition tables are shown in Table 3.3 and Table 3.4, using $P_m = 0.05$ as an example. Table 3.3 is the result of assigning a small probability to the transitions that were not observed, and Table 3.4 is the result after renormalizing Table 3.3.
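Written out, the two steps above are (with $\tilde{P}$ for the smoothed and $\hat{P}$ for the renormalized probabilities):

$$ \tilde{P}(s' \mid s) = \begin{cases} P(s' \mid s), & \text{if } P(s' \mid s) > 0 \\ P_m, & \text{otherwise} \end{cases} \qquad \hat{P}(s' \mid s) = \frac{\tilde{P}(s' \mid s)}{\sum_{s''} \tilde{P}(s'' \mid s)} $$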



Then, each query computes its own probability from the query's pitch transitions by referring to the hidden Markov model. The similarity probability is calculated as:

$$ P(\bar{O} \mid H) = \prod_{t=1}^{T-1} P(o_{t+1} \mid o_t, H) $$

where H is the hidden Markov model.

Let's give an example to demonstrate how this method works. There are two queries, {α, ϵ, β, γ, ω, ω} and {α, β, ω, ϵ, ϵ, γ}, and we want to know which of them is more similar to the target {α, β, β, γ, ω, ω}. The transitions of the first query are α→ϵ, ϵ→β, β→γ, γ→ω, and ω→ω, where α→ϵ corresponds to the entry in column 1 and row 5 of the transition table, which is 0.0425. The matching probability of the first query is about 0.0025. By the same rule, the matching probability of the second query sequence {α, β, ω, ϵ, ϵ, γ} is about 0.00006. Hence, the first query sequence is judged to be more similar to the target.


Implementing the Markov model in Python

import numpy as np

def build_markov_model(target_diff, min_prob=0.05, state_range=(-3, 7)):
    """
    Build the transition probability matrix of a Markov model, making sure
    the state space covers the given range.

    Args:
        - target_diff: list[int], MIDI pitch differences of the target song
        - min_prob: float, minimal probability P_m assigned to unobserved transitions
        - state_range: tuple(int, int), (smallest state, largest state)

    Returns:
        - states: list[int], all MIDI states in ascending order
        - transition_matrix: numpy.ndarray, transition probability matrix
          (row = next state s', column = previous state s)
        - state_index: dict, mapping from MIDI number to matrix index
    """
    # Make sure the state space covers the whole requested range
    states = list(range(state_range[0], state_range[1] + 1))
    state_index = {midi: i for i, midi in enumerate(states)}
    n = len(states)

    # Initialize the transition count matrix
    transition_counts = np.zeros((n, n))

    # Count the transitions C(s' | s)
    for i in range(len(target_diff) - 1):
        s, s_next = target_diff[i], target_diff[i + 1]
        if s in state_index and s_next in state_index:
            transition_counts[state_index[s_next], state_index[s]] += 1

    print("transition_counts:\n", transition_counts)

    # Estimate the transition probability matrix column by column
    transition_matrix = np.zeros((n, n))
    for j in range(n):  # j indexes the previous state s
        col_sum = np.sum(transition_counts[:, j])  # Σ_{s'} C(s' | s)
        if col_sum > 0:
            transition_matrix[:, j] = transition_counts[:, j] / col_sum
        else:
            transition_matrix[:, j] = 0  # no outgoing transitions observed from s

    # Assign the minimal probability P_m to unobserved transitions
    transition_matrix[transition_matrix == 0] = min_prob

    # Renormalize so that every column sums to 1
    transition_matrix /= transition_matrix.sum(axis=0, keepdims=True)

    return states, transition_matrix, state_index
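
A minimal usage sketch (the pitch-difference sequence below is invented purely for illustration):

# Hypothetical MIDI pitch differences of a target song (illustrative values only)
target_diff = [0, 2, 2, -1, 0, 2, -3, 1, 0, 2]

states, transition_matrix, state_index = build_markov_model(target_diff)
print("states:", states)
print("column sums:", transition_matrix.sum(axis=0))  # each column should sum to 1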

Implementing HMM scoring in Python
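
A minimal sketch of the scoring step, consistent with the conventions of build_markov_model above (row = next state, column = previous state); the function name score_query and the example queries are hypothetical additions, not from the original text:

def score_query(query_diff, transition_matrix, state_index):
    """
    Compute the matching probability P(O | H) of a query pitch-difference
    sequence as the product of transition probabilities P(o_{t+1} | o_t).
    """
    prob = 1.0
    for i in range(len(query_diff) - 1):
        s, s_next = query_diff[i], query_diff[i + 1]
        # Skip transitions that fall outside the modeled state range
        if s not in state_index or s_next not in state_index:
            continue
        prob *= transition_matrix[state_index[s_next], state_index[s]]
    return prob

# Compare two hypothetical queries against the target model built above
query_1 = [0, 2, 1, -1, 0, 2]
query_2 = [0, 1, 2, -1, 0, 2]
p1 = score_query(query_1, transition_matrix, state_index)
p2 = score_query(query_2, transition_matrix, state_index)
print("query 1 probability:", p1)
print("query 2 probability:", p2)
print("query 1 matches better" if p1 > p2 else "query 2 matches better")

For long queries the product of many small probabilities underflows quickly, so summing log probabilities is a common alternative in practice.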