Segment gestalt

segment_gestalt

This module implements the `segment_gestalt` function, based on the temporal gestalt segmentation described by Tenney & Polansky (1980)

We can broadly divide the algorithm's limitations into two categories:

  1. Soft restrictions: musical features the algorithm simply does not model.

  2. Hard restrictions on what scores we can accept, either because the algorithm exhibits undefined behavior when given such scores, or because it was never designed for them.

With these categories in mind, we have the following limitations. Given a monophonic input, the algorithm does not consider:

  1. stream segregation (i.e. one stream of notes may be perceived as two or more separate, interleaved entities)

  2. harmony or shape (see the beginning of section 2 of the original paper for more details)

  3. semantic meaning (we are still left assigning arbitrary ideals to arbitrary things)

The algorithm places the following restriction on the score:

  • the score must be monophonic (due to differences in perception). If we want to handle polyphonic scores, we need a definition of what a substructure is for such a score (within this algorithm), i.e. how we carve up the note structures. Since this algorithm does not consider stream segregation or other features that require larger context clues, we can simply define a score substructure “temporally” as a contiguous subsequence of notes. It is therefore safe to treat the current algorithm as undefined for polyphonic music.
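The monophony restriction is easy to state operationally. Here is a minimal, library-independent sketch, assuming notes are modeled as plain (onset, duration) tuples (an assumption of this example, not the amads representation):

```python
def is_monophonic(notes):
    """Check that no note starts while the previous note is still sounding.

    notes: list of (onset, duration) pairs, sorted by onset.
    (This flat-tuple representation is illustrative, not the amads API.)
    """
    return all(onset + dur <= next_onset
               for (onset, dur), (next_onset, _) in zip(notes, notes[1:]))
```

For example, `is_monophonic([(0, 1), (1, 1)])` holds, while `is_monophonic([(0, 2), (1, 1)])` does not, because the second note starts before the first ends.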

Some thoughts (and questions): (1) Should our output preserve the internal structure of the score for segments and clangs? Probably not. Keep in mind we are dealing with monophonic score structures; we only need to provide enough information for a caller to verify the result and use it elsewhere, so we simply return two lists of separate scores.

I genuinely think a separate representation that can index into individual notes would be immensely helpful. But I am certain there has to be something I am missing that argues otherwise (if I had to guess, the ambiguity of how musical scores themselves are presented to the musician is chief among them, and maintaining that ambiguity in our internal representation is also paramount).

I also genuinely think we need well-defined rules for splitting and merging scores...
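For the splitting half, one rule consistent with the temporal definition above is to cut a flat, time-ordered note list at boundary indices. A sketch (the function name and flat-list representation are illustrative, not the amads API):

```python
def split_at(notes, boundaries):
    """Split a flat, time-ordered note list into contiguous chunks.

    boundaries: sorted interior indices at which new chunks begin.
    (Illustrative helper; not part of amads.)
    """
    # add the two sentinel cut points 0 and len(notes)
    cuts = [0] + list(boundaries) + [len(notes)]
    return [notes[a:b] for a, b in zip(cuts, cuts[1:])]
```

For example, `split_at(["a", "b", "c", "d"], [2])` yields `[["a", "b"], ["c", "d"]]`; merging is then just concatenation of adjacent chunks.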

On a completely separate and unrelated note, there are two `pitch_mean`s with the exact same implementation under two different filenames...

segment_gestalt

segment_gestalt(score: Score) -> tuple[list[float], list[float]]

Given a monophonic score, returns clang and segment boundary onsets

Parameters:

  • score (Score) –

    The score to be segmented

Returns:

  • tuple[list[float], list[float]]

    A 2-tuple of (sorted list of onsets denoting clang boundaries, sorted list of onsets denoting segment boundaries); both lists are empty when no clangs or segments can be formed

Raises:

  • Exception

    if the score is not monophonic

Source code in amads/melody/segment_gestalt.py
def segment_gestalt(score: Score) -> tuple[list[float], list[float]]:
    """
    Given a monophonic score, returns clang and segment boundary onsets

    Parameters
    ----------
    score: Score
        The score to be segmented

    Returns
    -------
    tuple[list[float], list[float]]
        2-tuple of:
        (sorted list of onsets denoting clang boundaries,
        sorted list of onsets denoting segment boundaries);
        both lists are empty when no clangs or segments can be formed


    Raises
    ------
    Exception
        if the score is not monophonic
    """
    if not score.ismonophonic():
        raise Exception("score not monophonic, input is not valid.")

    notes: List[Note] = cast(
        List[Note], score.flatten(collapse=True).list_all(Note)
    )

    if len(notes) <= 0:
        return ([], [])

    cl_values = []
    # calculate clang distances here
    for note_pair in zip(notes[:-1], notes[1:]):
        pitch_diff = note_pair[1].key_num - note_pair[0].key_num
        onset_diff = note_pair[1].onset - note_pair[0].onset
        cl_values.append(2 * onset_diff + abs(pitch_diff))

    # combines the boolean peak map and the scan step from the MATLAB version
    if len(cl_values) < 3:
        return ([], [])

    clang_soft_peaks = _find_peaks(cl_values)
    cl_indices = [0]
    # cl_indices holds clang start indices: note 0 starts the first clang;
    # a peak between notes idx and idx + 1 means note idx + 1 starts a new
    # clang; len(notes) is a one-past-the-end sentinel so the score list
    # is easier to construct
    cl_indices.extend([idx + 1 for idx in clang_soft_peaks])
    cl_indices.append(len(notes))

    clang_onsets = list(map(lambda i: (notes[i].onset), cl_indices[:-1]))

    if len(clang_onsets) <= 2:
        return (clang_onsets, [])

    # we can probably split the clangs here and organize them into scores
    clang_scores = _construct_score_list(
        notes, zip(cl_indices[:-1], cl_indices[1:])
    )
    # calculate segment boundaries
    # we need to basically follow segment_gestalt.m
    # (1) calculate individual clang pitch means
    mean_pitches = [pitch_mean(score, weighted=True) for score in clang_scores]

    # (2) calculate segment distances
    seg_dist_values = []
    # calculating segment distance...
    for i in range(len(clang_scores) - 1):
        local_seg_dist = 0.0
        # be careful of the indices when calculating segdist here
        local_seg_dist += abs(mean_pitches[i + 1] - mean_pitches[i])
        # onset distance between the first notes of the two clangs
        local_seg_dist += (
            notes[cl_indices[i + 1]].onset - notes[cl_indices[i]].onset
        )
        # pitch distance from the last note of this clang to the first
        # note of the next clang
        local_seg_dist += abs(
            notes[cl_indices[i + 1]].key_num
            - notes[cl_indices[i + 1] - 1].key_num
        )
        # weighted onset gap between the last note of this clang and the
        # first note of the next clang
        local_seg_dist += 2 * (
            notes[cl_indices[i + 1]].onset - notes[cl_indices[i + 1] - 1].onset
        )
        seg_dist_values.append(local_seg_dist)
    if len(seg_dist_values) < 3:
        return (clang_onsets, [])

    seg_soft_peaks = _find_peaks(seg_dist_values)
    assert seg_soft_peaks[-1] < len(cl_indices) - 1
    seg_indices = [0]
    # seg_soft_peaks indexes into seg_dist_values (one entry per adjacent
    # clang pair), so a peak at idx means the clang starting at
    # cl_indices[idx + 1] begins a new segment
    seg_indices.extend([cl_indices[idx + 1] for idx in seg_soft_peaks])
    seg_indices.append(len(notes))

    segment_onsets = list(map(lambda i: (notes[i].onset), seg_indices[:-1]))
    return (clang_onsets, segment_onsets)
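The clang-distance step above can be reproduced independently of amads. The sketch below models notes as plain (onset, key_num) pairs and assumes, as a simplification, that peaks are strict local maxima; the real `_find_peaks` may differ, and the names here are illustrative:

```python
def find_peaks(values):
    # indices where a value is strictly greater than both neighbours
    return [i for i in range(1, len(values) - 1)
            if values[i] > values[i - 1] and values[i] > values[i + 1]]

def clang_boundary_onsets(notes):
    """notes: list of (onset, key_num) pairs, sorted by onset.

    Illustrative stand-in for the clang step of segment_gestalt.
    """
    # clang distance between successive notes: 2 * onset gap + |pitch gap|
    cl = [2 * (b[0] - a[0]) + abs(b[1] - a[1])
          for a, b in zip(notes, notes[1:])]
    # a peak between notes i and i + 1 means note i + 1 starts a new clang
    return [notes[i + 1][0] for i in find_peaks(cl)]
```

For the toy melody `[(0, 60), (1, 62), (2, 64), (4, 65), (5, 67), (6, 69)]`, the larger onset gap before beat 4 produces the single boundary onset 4.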