Style#
Code style#
This document outlines the coding style guidelines for contributing to this project.
Code organization#
Modules should be organized in a logical hierarchy that reflects their purpose. For example, complexity algorithms go in:
algorithm/complexity/lz77.py
Note that functions will be importable in multiple ways:
from amads.harmony.root_finding.parncutt_1988 import root
from amads.all import root_parncutt_1988
The first style is more verbose, but it makes the logical organization of the package more explicit. The second style is more appropriate for interactive use.
To support the second style, we add import statements to the __init__.py file of each module.
For example, the __init__.py file for the root_finding module contains:
from .parncutt_1988 import root as root_parncutt_1988
Then the __init__.py file for the harmony module contains:
from .root_finding import *
Finally, the all.py file for the amads package contains:
from .harmony import *
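The re-export chain above can be tried out in isolation. The following is a minimal sketch, assuming a throwaway package named demo_pkg (a hypothetical stand-in for amads) written to a temporary directory; the body of root() is a placeholder and not the Parncutt (1988) algorithm.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package layout mirroring the structure described above.
FILES = {
    "demo_pkg/__init__.py": "",
    "demo_pkg/all.py": "from .harmony import *\n",
    "demo_pkg/harmony/__init__.py": "from .root_finding import *\n",
    "demo_pkg/harmony/root_finding/__init__.py": (
        "from .parncutt_1988 import root as root_parncutt_1988\n"
    ),
    "demo_pkg/harmony/root_finding/parncutt_1988.py": (
        "def root(chord):\n"
        "    return min(chord)  # placeholder, not the real algorithm\n"
    ),
}


def build_demo_package(base_dir):
    """Write the demo package files under base_dir."""
    for relative_path, source in FILES.items():
        path = Path(base_dir) / relative_path
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(source)


tmp_dir = tempfile.mkdtemp()
build_demo_package(tmp_dir)
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

# Verbose style: the import path spells out the package hierarchy.
from demo_pkg.harmony.root_finding.parncutt_1988 import root

# Interactive style: the name was re-exported through each __init__.py.
from demo_pkg.all import root_parncutt_1988

# Both names refer to the same function object.
assert root is root_parncutt_1988
```

Note that a star import skips names beginning with an underscore (unless the module defines `__all__`), so internal helpers are not re-exported by this chain.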
Function naming#
Be explicit about what functions return. Don’t make users guess:
# Good
lz77_size()
lz77_compression_ratio()
# Bad
lz77() # Unclear what this returns
Code structure#
Local function definitions should be avoided: the inner function is re-created on every call to the enclosing function, which can hurt performance in frequently called code. Instead, define functions at module level:
# Good
def helper_function(x):
    return x * 2

def main_function(x):
    return helper_function(x)

# Bad
def main_function(x):
    def helper_function(x):  # Defined locally - avoid this
        return x * 2
    return helper_function(x)
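To see why a nested helper costs more, the sketch below shows that each call to the enclosing function builds a fresh function object, and provides a small timing comparison. All function names here are illustrative, and the size of any timing difference depends on the Python version and machine, so treat the numbers as something to measure rather than assume.

```python
import timeit


def _double(x):
    return x * 2


def double_module_level(x):
    # The helper is created once, when the module is imported.
    return _double(x)


def double_nested(x):
    def double(y):  # a new function object is built on every call
        return y * 2

    return double(x)


def make_inner():
    """Return the nested function so its identity can be inspected."""

    def inner(y):
        return y * 2

    return inner


def compare_timings(number=200_000):
    """Return (module_level_seconds, nested_seconds) for `number` calls each."""
    t_module = timeit.timeit(lambda: double_module_level(3), number=number)
    t_nested = timeit.timeit(lambda: double_nested(3), number=number)
    return t_module, t_nested


if __name__ == "__main__":
    # Two calls to the enclosing function yield two distinct function objects.
    print(make_inner() is make_inner())  # False
    t_module, t_nested = compare_timings()
    print(f"module-level: {t_module:.3f}s, nested: {t_nested:.3f}s")
```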
We plan to implement a pipeline for standardizing code formatting using black. This will ensure consistent code style across the project.
Docstrings should use numpydoc formatting:
def calculate_entropy(pitches: list[int]) -> float:
    """Calculate the entropy of a pitch sequence.

    Parameters
    ----------
    pitches
        List of MIDI pitch numbers

    Returns
    -------
    float
        Shannon entropy of the pitch distribution, in bits

    Examples
    --------
    >>> round(calculate_entropy([60, 62, 64]), 3)
    1.585
    """
    pass
External package imports (except numpy) should be done locally within functions for efficiency. This avoids loading unused dependencies:
# Good
def plot_histogram(data):
    import matplotlib.pyplot as plt  # Import inside function
    plt.hist(data)
    plt.show()

# Bad
import matplotlib.pyplot as plt  # Global import - avoid this

def plot_histogram(data):
    plt.hist(data)
    plt.show()
Types#
Provide type hints for function parameters and return types.
If a function accepts either float or int, use float as the type hint; type checkers treat int as acceptable wherever float is annotated.
Functions should accept Python base types as inputs, but can optionally support numpy arrays.
Return Python base types by default; use numpy types only when necessary.
For internal computations, either base Python or numpy is fine.
Where possible, take a single simple input type and let users handle iteration.
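The guidelines above might look like this in practice. This is a minimal sketch using a hypothetical function, `midi_to_hz`, which is not part of the package: it annotates its inputs as float (so int is accepted too), returns a built-in float, and handles one value at a time, leaving iteration to the caller.

```python
def midi_to_hz(pitch: float, a4_hz: float = 440.0) -> float:
    """Convert a MIDI pitch number to a frequency in Hz (equal temperament).

    Hypothetical example function, used only to illustrate the type guidelines.
    """
    # float() ensures a built-in float is returned even if the caller
    # passes a numpy scalar.
    return float(a4_hz * 2.0 ** ((pitch - 69) / 12))


# The function takes a single value; the caller handles iteration.
frequencies = [midi_to_hz(p) for p in [57, 69, 81]]
print(frequencies)  # [220.0, 440.0, 880.0]
```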
Common patterns#
When implementing algorithms, we distinguish between internal and external functions. Internal functions implement the core algorithm or equation. External functions wrap these internal implementations, handling input validation, type checking, and any necessary data conversion. This separation of concerns helps keep the core algorithmic logic clean and focused while ensuring robust input handling at the API level.
For example:
import math
from collections import Counter

# External function
def calculate_entropy(pitches: list[int]) -> float:
    """Calculate the entropy of a pitch sequence.

    Handles input validation and conversion before calling _calculate_entropy().
    """
    if not pitches:
        raise ValueError("Input pitch list cannot be empty")
    # Convert pitches to counts
    counts = list(Counter(pitches).values())
    return _calculate_entropy(counts)

# Internal function
def _calculate_entropy(counts: list[int]) -> float:
    """Core entropy calculation from Shannon (1948).

    Internal function that implements the entropy formula.
    Assumes input has been validated.
    """
    total = sum(counts)
    probabilities = [c / total for c in counts]
    return -sum(p * math.log2(p) for p in probabilities if p > 0)
Put the external function at the beginning of the module, so that it’s the first thing the user sees. Note that we prefix the internal function with an underscore, to indicate that it’s not part of the public API.
References#
Include references with DOIs/URLs where possible. Here are some examples:
[1]: Ziv, J., & Lempel, A. (1977). A universal algorithm for sequential data compression.
IEEE Transactions on Information Theory. 23/3 (pp. 337–343).
https://doi.org/10.1109/TIT.1977.1055714
[2]: Cheston, H., Schlichting, J. L., Cross, I., & Harrison, P. M. C. (2024).
Rhythmic qualities of jazz improvisation predict performer identity and style
in source-separated audio recordings. Royal Society Open Science. 11/11.
https://doi.org/10.1098/rsos.231023