perf(shapes): build matplotlib patches once, share across fill/outline #691
Merged
The matplotlib branch of _render_shapes called _get_collection_shape separately for the outer outline, inner outline, and fill, each rebuilding the same patch geometry from scratch via GeoDataFrame.iterrows()/to_dict(). For shape elements this geometry construction is the dominant render cost and it ran 2-3x per plot. Factor the colour-independent geometry build into _build_shape_patches() and call it once in _render_shapes, passing the result to each collection via a new optional prebuilt_patches argument. Also drop the per-row iterrows()/Series construction in favour of columnar iteration and resolve the scale scalar once. Output is unchanged: RGBA buffers are byte-identical to main across 16 scenarios (plain/outline/fill+outline/scaled circles, polygons, multipolygons, categorical/continuous fill, groups, na_color). ~35% faster on a 2k-shape outline render (763 -> ~500 ms), scaling linearly with shape count.
Apply /simplify cleanups (output byte-identical, 16-scenario parity holds):
- expand fill colours via numpy fancy-indexing instead of .tolist() + a python loop (drops a dead hasattr fallback; fill_c is always an ndarray);
- normalize geometries with a single vectorized shapely.normalize call, falling back to per-geometry only if the bulk call rejects an input, and materialize the geometry array once;
- index the radius array with numpy boolean masking;
- pass the already-resolved scale scalar to _scale_pathpatch_around_centroid so the MultiPolygon branch doesn't re-extract it.
This was referenced Jun 6, 2026
What & why
The matplotlib branch of _render_shapes builds 2–3 PatchCollections (outer outline, optional inner outline, fill), and each call to _get_collection_shape rebuilt the same patch geometry from scratch via GeoDataFrame.iterrows() + row.to_dict(). For shape elements this geometry construction is the dominant render cost, and it ran once per layer.
This is the top item from the profiling write-up in #690 (the biggest single payoff).
Change
- Factor the colour-independent geometry build into a new _build_shape_patches() helper and call it once in _render_shapes, sharing the result across all collections via a new optional prebuilt_patches= argument on _get_collection_shape (back-compatible: it defaults to None, in which case it builds as before).
- Replace the per-row iterrows()/to_dict() loop with columnar iteration, and resolve the scale scalar once instead of per shape.
- _get_collection_shape's colour logic is unchanged; it now expands per-shape fill colours to per-patch (preserving MultiPolygon expansion and the single-colour broadcast) and assembles the PatchCollection.
Why sharing is safe: PatchCollection.set_paths does p.get_transform().transform_path(p.get_path()), i.e. it bakes a fresh, independent Path per collection, so building the patch list once and handing it to multiple collections (and the existing per-collection trans.transform vertex update) does not cross-contaminate. Verified empirically.
Correctness — byte-identical output
Rendered 16 scenarios on this branch and on main, comparing the Agg RGBA buffers exactly (np.array_equal): all identical.
This covers the tricky paths: MultiPolygon → multiple patches with replicated colour, groups filtering (fewer shapes), na_color, and centroid scaling.
Performance
render_shapes(..., outline_alpha=1.0) at 2000 shapes: 763 → ~500 ms (~35%), medium blobs, Agg, dpi 100. Scales linearly with shape count, so the absolute saving grows on large datasets (shape rendering was measured at ~3 s for 8k shapes on main).
The datashader branch (auto-selected for >10k shapes) is untouched.
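The byte-identical comparison described under Correctness can be approximated with a small standalone check. This is a sketch under assumptions, not the 16-scenario harness used for this PR: render_rgba is a hypothetical helper, the scene is three hand-placed circles, and only plain matplotlib and numpy calls are used. It draws an outline layer and a fill layer either from one shared patch list or from freshly rebuilt lists, then compares the Agg RGBA buffers exactly.

```python
import matplotlib

matplotlib.use("Agg")  # headless Agg backend, as in the comparison above

import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import PatchCollection
from matplotlib.patches import Circle

CENTERS = [(0.25, 0.25), (0.75, 0.5), (0.5, 0.8)]


def build_patches():
    """Hypothetical geometry build: one Circle patch per centre."""
    return [Circle(c, 0.1) for c in CENTERS]


def render_rgba(share):
    """Render outline + fill layers and return a copy of the RGBA buffer."""
    fig, ax = plt.subplots(figsize=(2, 2), dpi=100)
    shared = build_patches() if share else None
    layers = (
        dict(facecolor="none", edgecolor="black", linewidth=3),  # outline
        dict(facecolor="tab:blue", edgecolor="none"),            # fill
    )
    for style in layers:
        # either reuse the shared list or rebuild the geometry per layer
        patches = shared if share else build_patches()
        ax.add_collection(PatchCollection(patches, **style))
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)
    fig.canvas.draw()
    buf = np.asarray(fig.canvas.buffer_rgba()).copy()
    plt.close(fig)
    return buf


shared_buf = render_rgba(share=True)
rebuilt_buf = render_rgba(share=False)
identical = np.array_equal(shared_buf, rebuilt_buf)
```

Exact equality (np.array_equal) is deliberately stricter than a tolerance-based comparison: any change in patch geometry or colour assignment would alter at least one byte of the buffer.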