Complexity
complexity
¶
This module provides functionality for measuring the complexity of discrete sequences using the LZ77 compression algorithm.
Author: Huw Cheston (2025)
References
-
Ziv, J., & Lempel, A. (1977). A universal algorithm for sequential data compression. IEEE Transactions on Information Theory, 23(3), 337–343. https://doi.org/10.1109/TIT.1977.1055714
-
Cheston, H., Schlichting, J. L., Cross, I., & Harrison, P. M. C. (2024). Rhythmic qualities of jazz improvisation predict performer identity and style in source-separated audio recordings. Royal Society Open Science, 11(11). https://doi.org/10.1098/rsos.240920
lz77_encode
¶
lz77_encode(input_list: list[Hashable]) -> list
Runs the LZ77 compression algorithm over the input data,
generating tuples of (distance, length, symbol)
Parameters:
-
input_list(list[Hashable]) –The input sequence to be compressed.
Returns:
-
list–A list of tuples of (distance, length, symbol) representing the LZ77 encoded data.
Source code in amads/algorithms/complexity.py
26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 | |
lz77_decode
¶
lz77_decode(encoded: list[tuple[int, int, Hashable]]) -> list[Hashable]
Decode a list of LZ77 tokens, each of which are 3-tuples with form (distance, length, symbol)
Parameters:
-
encoded(list[tuple[int, int, Hashable]]) –A list of LZ77 encoded tokens.
Returns:
-
list[Hashable]–The decoded sequence.
Source code in amads/algorithms/complexity.py
74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 | |
lz77_complexity
¶
lz77_complexity(
sequence: Iterable[Hashable], normalized: bool = False
) -> Union[int, float]
Compresses a discrete sequence using the LZ77 algorithm and returns the length of the compressed output.
This function applies LZ77 compression to a sequence of hashable elements (e.g., strings, floats, integers). Higher compression lengths indicate greater complexity in the sequence.
Parameters:
-
sequence(Iterable[Hashable]) –A discrete sequence, such as pitch classes or inter-onset intervals.
-
normalized(Optional(bool), default:False) –If True, the result is expressed relative to the input size, where 1.0 indicates maximum complexity (i.e., no compression is possible). Defaults to False.
Returns:
-
int | float:–The length of the compressed sequence, either as a raw integer or a normalized float.
Source code in amads/algorithms/complexity.py
106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 | |