Skip to content

Refactor: split utils.py (4454 lines) into _validate.py / _color.py / _geometry.py and finish _datashader.py extraction #696

@timtreis

Description

@timtreis

Summary

src/spatialdata_plot/pl/utils.py is 4454 lines and mixes several unrelated concerns. Roughly ~3400 of those lines are two cohesive concerns (validation, color resolution) interleaved with everything else. Splitting them into sibling modules would massively improve navigability and shrink the blast radius of changes — with essentially no behavioral risk.

Why now

  • Any contributor touching one validation message or color helper must scroll a 4454-line file and risks editing adjacent unrelated code.
  • There is no import cycle in the extraction direction: utils.py imports only from render_params (a leaf) plus third-party libs — never from render.py/basic.py. The consumers import from utils.
  • There is an established precedent: pl/_palette.py and pl/_datashader.py are already self-contained sibling modules.
  • All symbols involved are private (_foo) — no public API surface to preserve.

Proposed split

New module Moves (approx lines) Examples
pl/_validate.py ~1400 _type_check_params (L2660), _validate_*_render_params ×5 (L3106–3835), _validate_show_parameters (L2483), _validate_col_for_column_table, uniqueness/shadow checks
pl/_color.py ~1370 _set_color_source_vec (L1265), _map_color_seg (L1486), _extract_colors_from_table_uns (L1689), categorical-mapping chain (L1616–1940), _color_vector_to_rgba, _get_linear_colormap, _set_outline
pl/_geometry.py ~460 _get_collection_shape (L601), _convert_shapes (L4273), pathpatch helpers
(finish) pl/_datashader.py ~234 the canvas/aggregate helpers still in utils.py (L3910–4451) that _datashader.py currently back-imports — the only near-cycle in the repo. Also fixes the _datshader_get_how_kw_for_spread typo.

After these moves, utils.py is left as a ~1000-line grab-bag of genuinely cross-cutting helpers (figure/axes, coord/extent, table extraction).

Risk / effort

Mechanical: each move is an import-path swap in render.py/basic.py/_datashader.py (their import blocks already group symbols by concern). No cycle, no public surface, covered by existing tests. Effort: medium · Risk: low. Best done as one module per PR.

Notes

  • The two mega-functions being relocated (_get_collection_shape, _convert_shapes, _type_check_params, _set_color_source_vec) are themselves decomposition candidates, but that is out of scope here — this issue is the relocation only, to keep diffs reviewable. Decomposition can follow once each lives in a smaller, single-concern file.

Part of a maintainability/refactor audit of main.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions