Skip to content

Parallelize startup and harden playback/export A/V sync#1893

Merged
richiemcilroy merged 31 commits into
mainfrom
editor-performance
Jun 7, 2026
Merged

Parallelize startup and harden playback/export A/V sync#1893
richiemcilroy merged 31 commits into
mainfrom
editor-performance

Conversation

@richiemcilroy
Copy link
Copy Markdown
Member

@richiemcilroy richiemcilroy commented Jun 6, 2026

  • Parallelize editor startup (mic/system-audio load + GPU layer warmup overlap decoder creation)
    • Replace audio-timing-repair log-scraping with structured AudioGapSummary metadata (recorder → AudioMeta → editor)
    • Harden audio mixer against offset overflow (i128 intermediates) and skip decode work for sub-fallback frames
    • Synthesize a default timeline for un-edited recordings (decoded media duration) so raw/CLI exports render
    • Gapless export audio on zero-frame fallback; fragmented-recording remux on finalize/export
    • Tighten the audio-ready handshake timeout to prevent unsynced playback

Greptile Summary

This PR hardens A/V sync across playback and export through several coordinated changes: structured AudioGapSummary metadata replaces log-scraping for startup-offset repair, GPU layer warmup is parallelised with decoder creation, the video clock is now stamped only after the audio device fires its first callback, and a default timeline is synthesised from encoded frame counts at recording stop so un-edited recordings are immediately exportable.

  • A/V sync: start = Instant::now() moved after AudioPlayback::spawn() returns, which blocks on the first device callback via a new channel handshake; initial latency offset switched from initial_compensation_secs (2× safety multiplier) to raw initial_output_latency_secs; silent audio frame substituted for None renders during export to prevent gaps.
  • AudioGapSummary pipeline: structured startup_overlap_trimmed_ms / startup_overlap_drops counters are accumulated in AudioGapTracker, stored in a OnceLock on FinishedOutputPipeline, written into AudioMeta.gap_summary, and consumed by audio_timing_repair_offset in the editor—replacing fragile log-scraping while keeping a legacy fallback for old recordings.
  • Overflow hardening: audio mixer uses i128 intermediates and usize::try_from instead of isize and wrapping_add_signed to prevent index overflow on large negative track offsets.

Confidence Score: 4/5

Safe to merge; the changes are well-tested and correctness-focused, with one open concern around startup latency for very long recordings.

The audio timing-repair rewrite is thorough and well-covered by new unit tests. The start = Instant::now() move correctly ties video to the audio device's first callback, fixing the sync regression. The one remaining concern is that AudioPlayback::spawn() has no upper-bound timeout for the PrerenderedAudioBuffer::new() pre-render phase: for a multi-hour recording the polling loop can silently stall for tens of seconds before playback begins.

crates/editor/src/playback.rs — the spawn() polling loop should have an outer timeout covering the pre-render phase, not just AUDIO_FIRST_CALLBACK_TIMEOUT which starts after stream.play().

Important Files Changed

Filename Overview
crates/editor/src/playback.rs Moves start = Instant::now() after audio_playback.spawn() to synchronize A/V start; adds first-callback handshake with AUDIO_FIRST_CALLBACK_TIMEOUT; hardens prefetch_duration and duration against invalid timelines; changes initial latency from initial_compensation_secs (2× multiplier) to raw initial_output_latency_secs.
crates/editor/src/editor_instance.rs Replaces log-scraping A/V timing repair with structured AudioGapSummary metadata; adds LegacyAudioTimingRepair fallback for old recordings; parallelises audio I/O via spawn_audio_load; removes 5 s hardcoded fallback duration; minor redundant guard in audio_timing_repair_offset.
crates/recording/src/output_pipeline/core.rs Adds AudioGapSummary struct and per-event startup/total accounting to AudioGapTracker; passes frame_count to record_overlap; stores summary in OnceLock on OutputPipeline and surfaces it in FinishedOutputPipeline.
crates/recording/src/studio_recording.rs Synthesizes TimelineConfiguration and ClipConfiguration from encoded frame_count/fps at recording stop; propagates gap_summary into AudioMeta; persists populated project config so un-edited recordings have a valid timeline for export/playback.
crates/audio/src/renderer.rs Replaces i32/isize arithmetic with i128 intermediates and usize::try_from to prevent overflow on large offsets; fixes stereo base-index calculation; adds regression test for negative offset (leading-silence) behaviour.
crates/export/src/mp4.rs Emits a silent audio frame instead of skipping on zero-frame fallback, ensuring gapless export when the audio renderer returns None for a given time slot.
crates/editor/src/editor.rs Splits start_renderer_layers_creation into start + finish_renderer_layers_creation so GPU layer warmup overlaps decoder/segment creation; removes now-unused total_frames field.
crates/recording/src/feeds/microphone.rs Adds normalize_pending_timestamps to correct stale startup burst timestamps before they are enqueued, preventing phantom overlap events that would otherwise inflate the gap summary.
crates/rendering/src/decoder/avassetreader.rs Computes minimum_fallback_frame from pending requests and skips decode work for frames below that bound, reducing unnecessary decode overhead during seek.

Comments Outside Diff (1)

  1. crates/editor/src/editor_instance.rs, line 176-180 (link)

    P2 Hardcoded 5.0 s fallback creates incorrect timeline for any recording not ~5 s long

    When both Video::new and get_video_duration_fallback fail (e.g., partially written or corrupt display file), the synthesized TimelineSegment uses end: 5.0. A 30-second recording whose display file is unreadable would export with a 5-second timeline, silently truncating content. Since this fallback is exercised for real on first open (before recovery or remux), the project config is then written with the wrong end value and persisted. Returning None from the filter in the MultipleSegments arm (same as the explicit duration <= 0.0 guard just below) and warning the user would be safer than silently synthesising an arbitrary duration.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: crates/editor/src/editor_instance.rs
    Line: 176-180
    
    Comment:
    **Hardcoded 5.0 s fallback creates incorrect timeline for any recording not ~5 s long**
    
    When both `Video::new` and `get_video_duration_fallback` fail (e.g., partially written or corrupt display file), the synthesized `TimelineSegment` uses `end: 5.0`. A 30-second recording whose display file is unreadable would export with a 5-second timeline, silently truncating content. Since this fallback is exercised for real on first open (before recovery or remux), the project config is then written with the wrong `end` value and persisted. Returning `None` from the filter in the `MultipleSegments` arm (same as the explicit `duration <= 0.0` guard just below) and warning the user would be safer than silently synthesising an arbitrary duration.
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
crates/editor/src/playback.rs:1285-1356
**Pre-render phase has no upper-bound timeout in `spawn()`**

`AudioPlayback::spawn()` polls `ready_rx` every 50 ms and only exits when the audio thread sends a value or `stop_rx` fires. The audio thread sends `ready_tx` only _after_ `PrerenderedAudioBuffer::new()` completes **and** the first device callback arrives. `PrerenderedAudioBuffer::new()` is O(recording length), so for a multi-hour recording on modest hardware it can take well over 10–20 s. During that entire window the polling loop spins with no way out except user cancellation.

`AUDIO_FIRST_CALLBACK_TIMEOUT` does not help here: it only starts timing after `stream.play()`, which is called _after_ pre-render finishes. The net effect is that pressing play on a long recording can silently stall for tens of seconds before video begins—trading the previous async desync for a synchronous startup freeze.

### Issue 2 of 2
crates/editor/src/editor_instance.rs:759-762
The `overlap_dropped_frames < MIN_STALE_STARTUP_DROPS` guard is unreachable: `startup_overlap_drops ≤ overlap_dropped_frames` by construction, so if the `startup_overlap_drops` check passes (`>= 3`), `overlap_dropped_frames >= 3` is guaranteed. Removing the redundant condition avoids misleading future readers into thinking total drops could prevent repair when startup drops are sufficient.

```suggestion
    if summary.startup_overlap_drops < MIN_STALE_STARTUP_DROPS
        || !(MIN_STALE_STARTUP_TRIMMED_MS..=MAX_STALE_STARTUP_REPAIR_MS)
            .contains(&summary.startup_overlap_trimmed_ms)
```

Reviews (3): Last reviewed commit: "Add test for negative timing offset cons..." | Re-trigger Greptile

@polarityinc
Copy link
Copy Markdown

polarityinc Bot commented Jun 6, 2026

Paragon Review Skipped

Hi @richiemcilroy! Your Polarity credit balance is insufficient to complete this review.

Please visit https://app.paragon.run to finish your review.

self.total_overlap_trimmed = self.total_overlap_trimmed.saturating_add(overlap);
if dropped_whole_frame {
self.overlap_dropped_frames += 1;
if frame_count <= STARTUP_OVERLAP_DROP_FRAME_LIMIT {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor off-by-one risk: frame_count looks 0-based (incremented after processing), so <= STARTUP_OVERLAP_DROP_FRAME_LIMIT counts 4 frames when the limit is 3. If you meant “first 3 frames”, consider switching to < (or renaming the constant/param to make the indexing explicit).

Suggested change
if frame_count <= STARTUP_OVERLAP_DROP_FRAME_LIMIT {
if frame_count < STARTUP_OVERLAP_DROP_FRAME_LIMIT {
self.startup_overlap_drops += 1;
}

return 0.0;
};

if summary.startup_overlap_drops < MIN_STALE_STARTUP_DROPS
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

audio_timing_repair_offset applies total_overlap_trimmed_ms (whole recording). If there are any whole-frame overlap drops outside the startup window, this could over-shift audio. One cheap guard is to only apply when all drops were in startup.

Suggested change
if summary.startup_overlap_drops < MIN_STALE_STARTUP_DROPS
if summary.startup_overlap_drops < MIN_STALE_STARTUP_DROPS
|| summary.overlap_dropped_frames < MIN_STALE_STARTUP_DROPS
|| summary.startup_overlap_drops != summary.overlap_dropped_frames
|| !(MIN_STALE_STARTUP_TRIMMED_MS..=MAX_STALE_STARTUP_REPAIR_MS)
.contains(&summary.total_overlap_trimmed_ms)
{
return 0.0;
}

Comment thread crates/editor/src/editor_instance.rs Outdated
Comment on lines +646 to +659
fn audio_timing_repair_offset(summary: Option<&cap_project::AudioGapSummary>) -> f32 {
let Some(summary) = summary else {
return 0.0;
};

if summary.startup_overlap_drops < MIN_STALE_STARTUP_DROPS
|| summary.overlap_dropped_frames < MIN_STALE_STARTUP_DROPS
|| !(MIN_STALE_STARTUP_TRIMMED_MS..=MAX_STALE_STARTUP_REPAIR_MS)
.contains(&summary.total_overlap_trimmed_ms)
{
return 0.0;
}

-(summary.total_overlap_trimmed_ms as f32 / 1_000.0)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 total_overlap_trimmed_ms may include mid-recording trims, overcorrecting silence insertion

audio_timing_repair_offset uses summary.total_overlap_trimmed_ms — which accumulates ALL overlap trims throughout the entire recording — as the correction amount. The startup_overlap_drops >= 3 guard confirms that a stale-startup burst occurred, but the silence inserted equals the full lifetime total, not just the startup portion. A recording that has, say, 200 ms of startup drops and 800 ms of unrelated mid-recording trims would insert 1 000 ms of leading silence, shifting audio 800 ms too late and creating new A/V desync instead of fixing it. Adding a startup_overlap_trimmed_ms field alongside startup_overlap_drops in AudioGapSummary would allow the correction to be scoped to the actual startup trim.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/editor/src/editor_instance.rs
Line: 646-659

Comment:
**`total_overlap_trimmed_ms` may include mid-recording trims, overcorrecting silence insertion**

`audio_timing_repair_offset` uses `summary.total_overlap_trimmed_ms` — which accumulates ALL overlap trims throughout the entire recording — as the correction amount. The `startup_overlap_drops >= 3` guard confirms that a stale-startup burst occurred, but the silence inserted equals the full lifetime total, not just the startup portion. A recording that has, say, 200 ms of startup drops and 800 ms of unrelated mid-recording trims would insert 1 000 ms of leading silence, shifting audio 800 ms too late and creating new A/V desync instead of fixing it. Adding a `startup_overlap_trimmed_ms` field alongside `startup_overlap_drops` in `AudioGapSummary` would allow the correction to be scoped to the actual startup trim.

How can I resolve this? If you propose a fix, please make it concise.

Comment thread crates/editor/src/playback.rs Outdated
Comment on lines +1180 to +1187
/// How long the caller waits for the audio thread's ready verdict. This MUST stay strictly larger
/// than [`AUDIO_FIRST_CALLBACK_TIMEOUT`] plus the thread's setup time so the thread's verdict
/// always wins the race. The setup includes a synchronous full pre-render of the recording, which
/// scales with its length, hence the generous headroom. If the caller times out first it concludes
/// "no audio" and stops feeding the playhead while the audio thread may still start a now-unsynced
/// stream — the desync this guards against. The thread reports failures immediately, so this only
/// bounds a genuinely wedged device and is never waited out in normal operation.
const AUDIO_READY_TIMEOUT: Duration = Duration::from_secs(15);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Windows pre-render can exceed AUDIO_READY_TIMEOUT, causing UI freeze then unsynced audio

On Windows, create_stream_prerendered builds a PrerenderedAudioBuffer entirely in memory before the stream is created — this work is O(recording length). For a multi-hour recording on modest hardware, the pre-render can take well over 15 seconds. When ready_rx.recv_timeout(AUDIO_READY_TIMEOUT) fires first, spawn() returns false, the video starts playing with has_audio = false, and start = Instant::now() is set. The audio thread eventually finishes pre-rendering, builds and plays the stream, and calls handle.block_on(stop_rx.changed()) — by which point the video playhead has already advanced by the overshoot time, so audio plays unsynced for the rest of the session. The desync scenario described in the comment is live for Windows users on long recordings.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/editor/src/playback.rs
Line: 1180-1187

Comment:
**Windows pre-render can exceed `AUDIO_READY_TIMEOUT`, causing UI freeze then unsynced audio**

On Windows, `create_stream_prerendered` builds a `PrerenderedAudioBuffer` entirely in memory before the stream is created — this work is O(recording length). For a multi-hour recording on modest hardware, the pre-render can take well over 15 seconds. When `ready_rx.recv_timeout(AUDIO_READY_TIMEOUT)` fires first, `spawn()` returns `false`, the video starts playing with `has_audio = false`, and `start = Instant::now()` is set. The audio thread eventually finishes pre-rendering, builds and plays the stream, and calls `handle.block_on(stop_rx.changed())` — by which point the video playhead has already advanced by the overshoot time, so audio plays unsynced for the rest of the session. The desync scenario described in the comment is live for Windows users on long recordings.

How can I resolve this? If you propose a fix, please make it concise.

@richiemcilroy
Copy link
Copy Markdown
Member Author

hey @greptileai please re-review the PR

@richiemcilroy
Copy link
Copy Markdown
Member Author

hey @greptileai please re-review the PR

@richiemcilroy richiemcilroy merged commit 2deaf26 into main Jun 7, 2026
20 of 21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant