Mtype tokenizer

MelodyTokenizer

MelodyTokenizer()

Abstract base class for tokenizing melodies into n-grams.

Source code in amads/algorithms/mtype_tokenizer.py
def __init__(self):
    """Initialize the tokenizer."""
    self.ioi_data = {}  # Dictionary to store IOI information for notes

Functions

tokenize

tokenize(score: Score) -> List[List]

Tokenize a melody into phrases. (Unimplemented abstract method.)

Parameters:

  • score (Score) –

    A Score object containing a melody

Returns:

  • list[list]

    List of tokenized phrases

Raises:

  • NotImplementedError

    if this method of MelodyTokenizer is called

Source code in amads/algorithms/mtype_tokenizer.py
def tokenize(self, score: Score) -> List[List]:
    """Tokenize a melody into phrases. (Unimplemented abstract method.)

    Parameters
    ----------
    score : Score
        A Score object containing a melody

    Returns
    -------
    list[list]
        List of tokenized phrases

    Raises
    ------
    NotImplementedError
        if this method of MelodyTokenizer is called
    """
    raise NotImplementedError
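
The abstract-method pattern above can be illustrated with a minimal, self-contained sketch. Both classes here are hypothetical stand-ins (they are not part of amads), and the subclass works on a plain list of MIDI pitches rather than a Score so that the example runs without the library.

```python
from typing import List


class MelodyTokenizerSketch:
    """Stand-in for the abstract base class (not the amads class itself)."""

    def tokenize(self, pitches: List[int]) -> List[List[str]]:
        # The base class only defines the interface.
        raise NotImplementedError


class ContourTokenizer(MelodyTokenizerSketch):
    """Hypothetical subclass: returns one phrase of contour tokens."""

    def tokenize(self, pitches: List[int]) -> List[List[str]]:
        phrase = []
        # Compare each pitch with the one before it.
        for previous, current in zip(pitches, pitches[1:]):
            if current > previous:
                phrase.append("up")
            elif current < previous:
                phrase.append("down")
            else:
                phrase.append("same")
        return [phrase]
```

Calling `tokenize` on the base class raises `NotImplementedError`, while the subclass returns a list containing a single tokenized phrase.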

FantasticTokenizer

FantasticTokenizer()

Bases: MelodyTokenizer

This tokenizer produces the M-Types as defined in the FANTASTIC toolbox [1].

An M-Type is a sequence of musical symbols (pitch intervals and duration ratios) that represents a melodic fragment, much as an n-gram represents a sequence of n consecutive items from a text. M-Types can vary in length, just as n-grams can (bigrams, trigrams, etc.).

The tokenizer takes a score as input. Its tokenize method returns a list of M-Type tokens, one per note after the first, from which counts of unique M-Types (n-grams) can be derived.
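
Counting unique M-Types of a given length can be sketched with `collections.Counter`. This is an illustration under assumptions: `count_ngrams` is a hypothetical helper, not an amads function, and tokens are represented here as plain `(pitch_class, ioi_class)` tuples rather than `MType` objects.

```python
from collections import Counter
from typing import Hashable, Sequence, Tuple


def count_ngrams(tokens: Sequence[Hashable], n: int) -> Counter:
    """Count every run of n consecutive tokens (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Slide a window of length n over the token sequence.
    windows = (tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return Counter(windows)


# Tokens as (pitch interval class, IOI ratio class) pairs.
step_up: Tuple[str, str] = ("u2", "e")
leap_down: Tuple[str, str] = ("d3", "l")
tokens = [step_up, step_up, leap_down, step_up, step_up]

bigram_counts = count_ngrams(tokens, 2)
```

A sequence shorter than `n` yields an empty counter, which matches the idea that a melody too short to contain an n-gram contributes none.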

Attributes:

  • phrase_gap (float) –

    Time gap in seconds that defines phrase boundaries

  • tokens (list) –

    List of tokens after tokenization

References

[1] Müllensiefen, D. (2009). Fantastic: Feature ANalysis Technology Accessing STatistics (In a Corpus): Technical Report v1.5

Source code in amads/algorithms/mtype_tokenizer.py
def __init__(self):
    super().__init__()
    self.tokens = []

Functions

tokenize

tokenize(score: Score) -> List

Tokenize a melody into M-Types.

Parameters:

  • score (Score) –

    Score object containing melody to tokenize

Raises:

  • ValueError

    If the score has more than one part, if the part has concurrent notes (IOI == 0), or if a Note in the part has a tie.

Returns:

  • list

    List of M-Type tokens

Source code in amads/algorithms/mtype_tokenizer.py
def tokenize(self, score: Score) -> List:
    """Tokenize a melody into M-Types.

    Parameters
    ----------
    score : Score
        Score object containing melody to tokenize

    Raises
    ------
    ValueError
        if score has more than one part, if the part has concurrent
        notes (IOI == 0) or if a Note in the part has a tie.

    Returns
    -------
    list
        List of M-Type tokens
    """
    # Extract notes with IOI ratios and pitch intervals via calc_differences
    notes = score.calc_differences(["ioi-ratio", "interval"])
    if len(notes) != 1:
        raise ValueError("score has more than one Part")
    notes = notes[0]
    tokens = []

    # Skip if phrase is too short
    if len(notes) < 2:
        return tokens

    for note in notes[1:]:
        pitch_interval = note.get("interval")
        pitch_interval_class: Optional[str] = self.classify_pitch_interval(
            pitch_interval
        )
        ioi_ratio_class = self.classify_ioi_ratio(note.get("ioi_ratio"))
        token = MType(pitch_interval_class, ioi_ratio_class)
        tokens.append(token)
    return tokens
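
The listing above relies on per-note differences and skips the first note. A minimal sketch of how such differences could be derived from raw onsets and pitches follows. `melody_differences` is a hypothetical helper, not the library's `calc_differences`, and it assumes that the IOI ratio of a note is its IOI divided by the previous IOI; the library's definition may differ.

```python
from typing import Dict, List, Optional


def melody_differences(
    onsets: List[float], pitches: List[int]
) -> List[Dict[str, Optional[float]]]:
    """Return one dict per note with 'interval' and 'ioi_ratio' (assumed layout)."""
    rows: List[Dict[str, Optional[float]]] = []
    previous_ioi: Optional[float] = None
    for i in range(len(onsets)):
        interval: Optional[float] = None
        ioi: Optional[float] = None
        ioi_ratio: Optional[float] = None
        if i > 0:
            # Semitone distance from the previous note.
            interval = pitches[i] - pitches[i - 1]
            ioi = onsets[i] - onsets[i - 1]
            if ioi <= 0:
                raise ValueError("concurrent notes (IOI == 0) are not allowed")
            # A ratio needs two IOIs, so it is first defined on the third note.
            if previous_ioi is not None:
                ioi_ratio = ioi / previous_ioi
        rows.append({"interval": interval, "ioi_ratio": ioi_ratio})
        previous_ioi = ioi
    return rows


rows = melody_differences([0.0, 1.0, 2.0, 2.5], [60, 62, 62, 55])
```

Under these assumptions the first note has no interval, which is why a tokenizer loop would start at the second note, and the second note has an interval but no IOI ratio, which is why the classifiers accept `None`.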

classify_pitch_interval

classify_pitch_interval(pitch_interval: Optional[int]) -> Optional[str]

Classify pitch interval according to Fantastic's interval class scheme.

Parameters:

  • pitch_interval (int or None) –

    Interval in semitones between consecutive notes

Returns:

  • Optional[str]

    Interval class label (e.g. 'd8', 'd7', 'u2'), where 'd' = downward interval, 'u' = upward interval, 's' = same pitch, and 't' = tritone. Returns None if the input is None.

Source code in amads/algorithms/mtype_tokenizer.py
def classify_pitch_interval(
    self, pitch_interval: Optional[int]
) -> Optional[str]:
    """Classify pitch interval according to Fantastic's interval class scheme.

    Parameters
    ----------
    pitch_interval : int or None
        Interval in semitones between consecutive notes

    Returns
    -------
    Optional[str]
        Interval class label (e.g. 'd8', 'd7', 'u2', etc.)
        'd' = downward interval,
        'u' = upward interval,
        's' = same pitch, and
        't' = tritone.
        Returns None if input is None
    """
    # Missing intervals stay unclassified
    if pitch_interval is None:
        return None

    # Clamp interval to [-12, 12] semitone range
    if pitch_interval < -12:
        pitch_interval = -12
    elif pitch_interval > 12:
        pitch_interval = 12

    # Map intervals to class labels based on Fantastic's scheme
    return self.interval_map[pitch_interval]
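
The clamp-then-look-up logic can be tried in isolation with the sketch below. The lookup table is an assumption: `interval_map` is not shown on this page, so `build_interval_map` and its semitone-to-label grouping are hypothetical and may not match the library's table.

```python
from typing import Dict, Optional

# Hypothetical grouping of semitone distances into interval degrees.
_DEGREE_BY_SEMITONES = {
    1: "2", 2: "2", 3: "3", 4: "3", 5: "4",
    7: "5", 8: "6", 9: "6", 10: "7", 11: "7", 12: "8",
}


def build_interval_map() -> Dict[int, str]:
    """Build an illustrative table covering every integer in [-12, 12]."""
    interval_map = {0: "s1", 6: "ut", -6: "dt"}  # same pitch and tritones
    for semitones, degree in _DEGREE_BY_SEMITONES.items():
        interval_map[semitones] = "u" + degree
        interval_map[-semitones] = "d" + degree
    return interval_map


def classify_pitch_interval(
    pitch_interval: Optional[int], interval_map: Dict[int, str]
) -> Optional[str]:
    if pitch_interval is None:
        return None
    # Intervals beyond an octave collapse onto the octave class.
    clamped = max(-12, min(12, pitch_interval))
    return interval_map[clamped]


table = build_interval_map()
```

Because of the clamp, any leap larger than an octave receives the same label as an exact octave in the same direction.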

classify_ioi_ratio

classify_ioi_ratio(ioi_ratio: Optional[float]) -> Optional[str]

Classify an IOI ratio into relative rhythm classes.

Parameters:

  • ioi_ratio (float or None) –

    Inter-onset interval ratio between consecutive notes

Returns:

  • str or None

    'q' for quicker (<0.8119), 'e' for equal (0.8119-1.4946), and 'l' for longer (>1.4946). Returns None if input is None.

Source code in amads/algorithms/mtype_tokenizer.py
def classify_ioi_ratio(self, ioi_ratio: Optional[float]) -> Optional[str]:
    """Classify an IOI ratio into relative rhythm classes.

    Parameters
    ----------
    ioi_ratio : float or None
        Inter-onset interval ratio between consecutive notes

    Returns
    -------
    str or None
        'q' for quicker (<0.8119),
        'e' for equal (0.8119-1.4946), and
        'l' for longer (>1.4946).
        Returns None if input is None.
    """
    if ioi_ratio is None:
        return None
    elif ioi_ratio < 0.8118987:
        return "q"
    elif ioi_ratio < 1.4945858:
        return "e"
    else:
        return "l"
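
The threshold behaviour can be checked with a standalone copy of the same comparisons. The function below reuses the two constants from the listing above; only the module-level names `QUICKER_BELOW` and `LONGER_FROM` are introduced here for illustration.

```python
from typing import Optional

# Thresholds copied from the listing above; the names are illustrative.
QUICKER_BELOW = 0.8118987
LONGER_FROM = 1.4945858


def classify_ioi_ratio(ioi_ratio: Optional[float]) -> Optional[str]:
    if ioi_ratio is None:
        return None
    if ioi_ratio < QUICKER_BELOW:
        return "q"
    if ioi_ratio < LONGER_FROM:
        return "e"
    return "l"


classes = [classify_ioi_ratio(r) for r in (0.5, 1.0, 2.0, None)]
```

Both comparisons are strict, so a ratio exactly equal to the lower threshold is classed as 'e' and a ratio exactly equal to the upper threshold is classed as 'l'.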

MType

MType(
    pitch_interval_class: Optional[str], ioi_ratio_class: Optional[str]
)

A class for representing M-Types.

Source code in amads/algorithms/mtype_tokenizer.py
def __init__(
    self,
    pitch_interval_class: Optional[str],
    ioi_ratio_class: Optional[str],
):
    self.pitch_interval_class = pitch_interval_class
    self.ioi_ratio_class = ioi_ratio_class
    self.integer = self.encode()
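
The `encode` method is not shown on this page. As a purely hypothetical illustration of how a pair of class labels could be packed into one integer, the sketch below uses a mixed-radix scheme over two small label lists. The label lists, `encode_pair`, and `decode_pair` are all assumptions and should not be read as the library's encoding.

```python
from typing import List, Optional, Tuple

# Hypothetical, deliberately truncated alphabets; None marks a missing class.
PITCH_LABELS: List[Optional[str]] = [None, "d2", "s1", "u2"]
IOI_LABELS: List[Optional[str]] = [None, "q", "e", "l"]


def encode_pair(pitch_class: Optional[str], ioi_class: Optional[str]) -> int:
    """Pack two labels into one integer (illustrative scheme only)."""
    pitch_index = PITCH_LABELS.index(pitch_class)
    ioi_index = IOI_LABELS.index(ioi_class)
    return pitch_index * len(IOI_LABELS) + ioi_index


def decode_pair(code: int) -> Tuple[Optional[str], Optional[str]]:
    """Invert encode_pair."""
    pitch_index, ioi_index = divmod(code, len(IOI_LABELS))
    return PITCH_LABELS[pitch_index], IOI_LABELS[ioi_index]


code = encode_pair("u2", "e")
```

An integer code of this kind makes tokens cheap to compare and hash, which is useful when counting n-grams.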

Functions

__repr__

__repr__()

Return a string representation of the MType.

Source code in amads/algorithms/mtype_tokenizer.py
def __repr__(self):
    """Return a string representation of the MType."""
    return f"MType {self.integer}"