
Replace Array.FindIndex with for loop to avoid closure allocation #94

Open
FMsongX2 wants to merge 1 commit into Live2D:develop from FMsongX2:fix/renderer-findindex-alloc

Summary

Replaces Array.FindIndex(..., lambda) calls on the per-frame render path with allocation-free for loops. The lambdas capture a loop variable, so each call allocates a Predicate<T> delegate. Behavior is unchanged; only the per-frame garbage is removed.

Problem

In CubismRenderController.OnDynamicDrawableData — invoked from CubismModel.Update() every frame — the renderer for each drawable is located with:

var rendererIndex = Array.FindIndex(renderers, cubismRenderer => cubismRenderer.Drawable.UnmanagedIndex == dataIndex);

This method has three loops over all drawables (visibility/order/opacity, multiply color, screen color), and each runs this FindIndex once per drawable before any dirty check — so it executes 3 × drawableCount times every frame. The lambda captures the loop variable dataIndex, so the compiler cannot cache it as a static delegate (only non-capturing lambdas are cached); a Predicate<T> delegate is allocated on every call. CubismRenderingInterceptController.TryDraw has the same pattern, capturing currentCamera.
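To make the caching distinction concrete, here is a minimal sketch that is independent of the Cubism SDK. The helper names `MakeCapturing` and `MakeNonCapturing` are hypothetical and exist only for illustration. Each evaluation of a capturing lambda produces a new delegate instance, whereas current C# compilers typically reuse a single cached delegate for a non-capturing lambda.

```csharp
using System;

// Hypothetical helpers for illustration; not part of the Cubism SDK.

// Captures `target`, so every call builds a new closure and a new delegate.
static Predicate<int> MakeCapturing(int target) => value => value == target;

// Captures nothing; current C# compilers typically cache this delegate
// in a static field after the first evaluation.
static Predicate<int> MakeNonCapturing() => value => value == 42;

var first = MakeCapturing(42);
var second = MakeCapturing(42);

// Two evaluations of the capturing lambda give two distinct objects.
Console.WriteLine(ReferenceEquals(first, second));

// Whether the non-capturing delegate is reused is a compiler detail,
// so the result is printed rather than assumed.
Console.WriteLine(ReferenceEquals(MakeNonCapturing(), MakeNonCapturing()));
```

In `OnDynamicDrawableData` the captured value is the loop's `dataIndex`, so the cached form is not available and a delegate is created for each `FindIndex` call.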

Measurement

Compiled and run on .NET 10, Release (desktop CLR), measured with GC.GetAllocatedBytesForCurrentThread() over 100,000 calls:

| Pattern | Allocated (100k calls) | Per call |
| --- | --- | --- |
| Array.FindIndex + capturing lambda | 6.42 MB | 64.2 B |
| explicit for loop | 0 B | 0 B |
| Array.FindIndex + non-capturing lambda (control) | 88 B (cached once) | ≈ 0 B |

The non-capturing control confirms the cause: capturing the loop variable defeats the compiler's delegate caching. The same allocation is produced under C# 7.3, 9, and 13 (identical bytes), so this is language semantics, not a compiler-version optimization. In the same harness the explicit loop was also faster (no per-element delegate indirection), so this is not an allocation-for-speed trade-off.
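For anyone who wants to reproduce the comparison, the following is a rough sketch of such a harness, not the exact one used for the table. It assumes a plain `int[]` as a stand-in for the renderer array, and the names `values`, `Frames`, and `IndexOfValue` are hypothetical. The byte counts it prints depend on runtime and build configuration, so they may not match the figures above.

```csharp
using System;

// Hypothetical stand-in data: element i holds the value i, like a renderer
// array searched by unmanaged index.
var values = new int[100];
for (var i = 0; i < values.Length; i++) values[i] = i;

const int Frames = 1_000; // 1,000 frames x 100 lookups = 100,000 calls

// Allocation-free search with the same contract as Array.FindIndex.
static int IndexOfValue(int[] items, int key)
{
    for (var i = 0; i < items.Length; i++)
    {
        if (items[i] == key) return i;
    }
    return -1;
}

// Warm up both paths so one-time initialization costs are excluded.
Array.FindIndex(values, v => v == 0);
IndexOfValue(values, 0);

long findIndexSum = 0, loopSum = 0; // Keep results live and comparable.

var start = GC.GetAllocatedBytesForCurrentThread();
for (var frame = 0; frame < Frames; frame++)
{
    for (var dataIndex = 0; dataIndex < values.Length; dataIndex++)
    {
        // The lambda captures dataIndex, so a delegate is created per call.
        findIndexSum += Array.FindIndex(values, v => v == dataIndex);
    }
}
var capturingBytes = GC.GetAllocatedBytesForCurrentThread() - start;

start = GC.GetAllocatedBytesForCurrentThread();
for (var frame = 0; frame < Frames; frame++)
{
    for (var dataIndex = 0; dataIndex < values.Length; dataIndex++)
    {
        loopSum += IndexOfValue(values, dataIndex);
    }
}
var loopBytes = GC.GetAllocatedBytesForCurrentThread() - start;

Console.WriteLine($"FindIndex + capturing lambda: {capturingBytes} bytes");
Console.WriteLine($"explicit for loop:            {loopBytes} bytes");
Console.WriteLine($"same results: {findIndexSum == loopSum}");
```

Run it in a Release build; a Debug build can change what the JIT allocates and inlines.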

Runtime note: this was measured on desktop .NET, not inside Unity. Mono and IL2CPP were not directly measured, but the generated IL contains a per-call delegate newobj, so a per-call allocation is expected there too; only the byte size may differ.

Changes

  • CubismRenderController.OnDynamicDrawableData: the three loops perform the same search (by Drawable.UnmanagedIndex over the same renderers array), so they now call one shared, allocation-free IndexOfDrawable helper.
  • CubismRenderingInterceptController.TryDraw: a single search over a different array (_cameraDrawStatus, keyed by Camera), with no reuse — so it is inlined as a for loop rather than sharing the helper.
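A minimal sketch of what the shared helper could look like. The description names `IndexOfDrawable` but does not show its signature, so the parameter list below is an assumption. A tuple array stands in for the SDK's renderer array (where the comparison would be against `Drawable.UnmanagedIndex`) to keep the snippet self-contained.

```csharp
using System;

// Stand-in for the renderers array: in the SDK each element is a renderer
// whose Drawable.UnmanagedIndex is compared; here a tuple carries the index.
var renderers = new (int UnmanagedIndex, string Name)[]
{
    (2, "Hair"), (0, "Face"), (1, "Eye"),
};

// Assumed shape of the helper (the signature is hypothetical).
// Returns the first index whose UnmanagedIndex equals targetIndex, or -1.
static int IndexOfDrawable((int UnmanagedIndex, string Name)[] items, int targetIndex)
{
    for (var i = 0; i < items.Length; i++)
    {
        if (items[i].UnmanagedIndex == targetIndex) return i;
    }
    return -1;
}

for (var dataIndex = 0; dataIndex < 4; dataIndex++)
{
    var rendererIndex = IndexOfDrawable(renderers, dataIndex);
    if (rendererIndex < 0) continue; // Same "not found" handling as before.
    Console.WriteLine($"{dataIndex} -> {renderers[rendererIndex].Name}");
}
```

Because the helper takes the array and the index as plain arguments, no delegate or closure is involved at the call site.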

Why this is safe

  • Functionally identical to Array.FindIndex: returns the first matching index, or -1 when none matches. No change to iteration order or to the existing rendererIndex < 0 / statusIndex < 0 handling.
  • Empty array returns -1, same as before. (renderers and _cameraDrawStatus are always initialized on these paths, so the null-array case does not arise.)
  • Keeps the lookup by Drawable.UnmanagedIndex — it does not assume positional alignment between renderers and data, so the search behavior is unchanged.
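The equivalence claims above can be spot-checked with a small comparison. This is an illustrative sketch over a plain `int[]`, with the hypothetical helper `IndexOfValue` standing in for the loop; it is not a test from the PR.

```csharp
using System;

// Hypothetical loop-based search with the Array.FindIndex contract:
// first matching index, or -1 when nothing matches.
static int IndexOfValue(int[] items, int key)
{
    for (var i = 0; i < items.Length; i++)
    {
        if (items[i] == key) return i;
    }
    return -1;
}

var cases = new (int[] Items, int Key)[]
{
    (new[] { 5, 7, 7, 9 }, 7), // duplicate matches: the first one wins (1)
    (new[] { 5, 7, 9 }, 4),    // no match (-1)
    (Array.Empty<int>(), 1),   // empty array (-1)
};

foreach (var (array, wanted) in cases)
{
    var expected = Array.FindIndex(array, v => v == wanted);
    var actual = IndexOfValue(array, wanted);
    Console.WriteLine($"key {wanted}: FindIndex={expected}, loop={actual}");
}
```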
