Parallelize startup and harden playback/export A/V sync#1893
Conversation
|
Paragon Review Skipped Hi @richiemcilroy! Your Polarity credit balance is insufficient to complete this review. Please visit https://app.paragon.run to finish your review. |
| self.total_overlap_trimmed = self.total_overlap_trimmed.saturating_add(overlap); | ||
| if dropped_whole_frame { | ||
| self.overlap_dropped_frames += 1; | ||
| if frame_count <= STARTUP_OVERLAP_DROP_FRAME_LIMIT { |
There was a problem hiding this comment.
Minor off-by-one risk: frame_count looks 0-based (incremented after processing), so <= STARTUP_OVERLAP_DROP_FRAME_LIMIT counts 4 frames when the limit is 3. If you meant “first 3 frames”, consider switching to < (or renaming the constant/param to make the indexing explicit).
| if frame_count <= STARTUP_OVERLAP_DROP_FRAME_LIMIT { | |
| if frame_count < STARTUP_OVERLAP_DROP_FRAME_LIMIT { | |
| self.startup_overlap_drops += 1; | |
| } |
| return 0.0; | ||
| }; | ||
|
|
||
| if summary.startup_overlap_drops < MIN_STALE_STARTUP_DROPS |
There was a problem hiding this comment.
audio_timing_repair_offset applies total_overlap_trimmed_ms (whole recording). If there are any whole-frame overlap drops outside the startup window, this could over-shift audio. One cheap guard is to only apply when all drops were in startup.
| if summary.startup_overlap_drops < MIN_STALE_STARTUP_DROPS | |
| if summary.startup_overlap_drops < MIN_STALE_STARTUP_DROPS | |
| || summary.overlap_dropped_frames < MIN_STALE_STARTUP_DROPS | |
| || summary.startup_overlap_drops != summary.overlap_dropped_frames | |
| || !(MIN_STALE_STARTUP_TRIMMED_MS..=MAX_STALE_STARTUP_REPAIR_MS) | |
| .contains(&summary.total_overlap_trimmed_ms) | |
| { | |
| return 0.0; | |
| } |
| fn audio_timing_repair_offset(summary: Option<&cap_project::AudioGapSummary>) -> f32 { | ||
| let Some(summary) = summary else { | ||
| return 0.0; | ||
| }; | ||
|
|
||
| if summary.startup_overlap_drops < MIN_STALE_STARTUP_DROPS | ||
| || summary.overlap_dropped_frames < MIN_STALE_STARTUP_DROPS | ||
| || !(MIN_STALE_STARTUP_TRIMMED_MS..=MAX_STALE_STARTUP_REPAIR_MS) | ||
| .contains(&summary.total_overlap_trimmed_ms) | ||
| { | ||
| return 0.0; | ||
| } | ||
|
|
||
| -(summary.total_overlap_trimmed_ms as f32 / 1_000.0) |
There was a problem hiding this comment.
total_overlap_trimmed_ms may include mid-recording trims, overcorrecting silence insertion
audio_timing_repair_offset uses summary.total_overlap_trimmed_ms — which accumulates ALL overlap trims throughout the entire recording — as the correction amount. The startup_overlap_drops >= 3 guard confirms that a stale-startup burst occurred, but the silence inserted equals the full lifetime total, not just the startup portion. A recording that has, say, 200 ms of startup drops and 800 ms of unrelated mid-recording trims would insert 1 000 ms of leading silence, shifting audio 800 ms too late and creating new A/V desync instead of fixing it. Adding a startup_overlap_trimmed_ms field alongside startup_overlap_drops in AudioGapSummary would allow the correction to be scoped to the actual startup trim.
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/editor/src/editor_instance.rs
Line: 646-659
Comment:
**`total_overlap_trimmed_ms` may include mid-recording trims, overcorrecting silence insertion**
`audio_timing_repair_offset` uses `summary.total_overlap_trimmed_ms` — which accumulates ALL overlap trims throughout the entire recording — as the correction amount. The `startup_overlap_drops >= 3` guard confirms that a stale-startup burst occurred, but the silence inserted equals the full lifetime total, not just the startup portion. A recording that has, say, 200 ms of startup drops and 800 ms of unrelated mid-recording trims would insert 1 000 ms of leading silence, shifting audio 800 ms too late and creating new A/V desync instead of fixing it. Adding a `startup_overlap_trimmed_ms` field alongside `startup_overlap_drops` in `AudioGapSummary` would allow the correction to be scoped to the actual startup trim.
How can I resolve this? If you propose a fix, please make it concise.| /// How long the caller waits for the audio thread's ready verdict. This MUST stay strictly larger | ||
| /// than [`AUDIO_FIRST_CALLBACK_TIMEOUT`] plus the thread's setup time so the thread's verdict | ||
| /// always wins the race. The setup includes a synchronous full pre-render of the recording, which | ||
| /// scales with its length, hence the generous headroom. If the caller times out first it concludes | ||
| /// "no audio" and stops feeding the playhead while the audio thread may still start a now-unsynced | ||
| /// stream — the desync this guards against. The thread reports failures immediately, so this only | ||
| /// bounds a genuinely wedged device and is never waited out in normal operation. | ||
| const AUDIO_READY_TIMEOUT: Duration = Duration::from_secs(15); |
There was a problem hiding this comment.
Windows pre-render can exceed
AUDIO_READY_TIMEOUT, causing UI freeze then unsynced audio
On Windows, create_stream_prerendered builds a PrerenderedAudioBuffer entirely in memory before the stream is created — this work is O(recording length). For a multi-hour recording on modest hardware, the pre-render can take well over 15 seconds. When ready_rx.recv_timeout(AUDIO_READY_TIMEOUT) fires first, spawn() returns false, the video starts playing with has_audio = false, and start = Instant::now() is set. The audio thread eventually finishes pre-rendering, builds and plays the stream, and calls handle.block_on(stop_rx.changed()) — by which point the video playhead has already advanced by the overshoot time, so audio plays unsynced for the rest of the session. The desync scenario described in the comment is live for Windows users on long recordings.
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/editor/src/playback.rs
Line: 1180-1187
Comment:
**Windows pre-render can exceed `AUDIO_READY_TIMEOUT`, causing UI freeze then unsynced audio**
On Windows, `create_stream_prerendered` builds a `PrerenderedAudioBuffer` entirely in memory before the stream is created — this work is O(recording length). For a multi-hour recording on modest hardware, the pre-render can take well over 15 seconds. When `ready_rx.recv_timeout(AUDIO_READY_TIMEOUT)` fires first, `spawn()` returns `false`, the video starts playing with `has_audio = false`, and `start = Instant::now()` is set. The audio thread eventually finishes pre-rendering, builds and plays the stream, and calls `handle.block_on(stop_rx.changed())` — by which point the video playhead has already advanced by the overshoot time, so audio plays unsynced for the rest of the session. The desync scenario described in the comment is live for Windows users on long recordings.
How can I resolve this? If you propose a fix, please make it concise.|
hey @greptileai please re-review the PR |
|
hey @greptileai please re-review the PR |
Greptile Summary
This PR hardens A/V sync across playback and export through several coordinated changes: structured
AudioGapSummarymetadata replaces log-scraping for startup-offset repair, GPU layer warmup is parallelised with decoder creation, the video clock is now stamped only after the audio device fires its first callback, and a default timeline is synthesised from encoded frame counts at recording stop so un-edited recordings are immediately exportable.start = Instant::now()moved afterAudioPlayback::spawn()returns, which blocks on the first device callback via a new channel handshake; initial latency offset switched frominitial_compensation_secs(2× safety multiplier) to rawinitial_output_latency_secs; silent audio frame substituted forNonerenders during export to prevent gaps.AudioGapSummarypipeline: structuredstartup_overlap_trimmed_ms/startup_overlap_dropscounters are accumulated inAudioGapTracker, stored in aOnceLockonFinishedOutputPipeline, written intoAudioMeta.gap_summary, and consumed byaudio_timing_repair_offsetin the editor—replacing fragile log-scraping while keeping a legacy fallback for old recordings.i128intermediates andusize::try_frominstead ofisizeandwrapping_add_signedto prevent index overflow on large negative track offsets.Confidence Score: 4/5
Safe to merge; the changes are well-tested and correctness-focused, with one open concern around startup latency for very long recordings.
The audio timing-repair rewrite is thorough and well-covered by new unit tests. The
start = Instant::now()move correctly ties video to the audio device's first callback, fixing the sync regression. The one remaining concern is thatAudioPlayback::spawn()has no upper-bound timeout for thePrerenderedAudioBuffer::new()pre-render phase: for a multi-hour recording the polling loop can silently stall for tens of seconds before playback begins.crates/editor/src/playback.rs — the spawn() polling loop should have an outer timeout covering the pre-render phase, not just AUDIO_FIRST_CALLBACK_TIMEOUT which starts after stream.play().
Important Files Changed
start = Instant::now()afteraudio_playback.spawn()to synchronize A/V start; adds first-callback handshake withAUDIO_FIRST_CALLBACK_TIMEOUT; hardensprefetch_durationanddurationagainst invalid timelines; changes initial latency frominitial_compensation_secs(2× multiplier) to rawinitial_output_latency_secs.AudioGapSummarymetadata; addsLegacyAudioTimingRepairfallback for old recordings; parallelises audio I/O viaspawn_audio_load; removes 5 s hardcoded fallback duration; minor redundant guard inaudio_timing_repair_offset.AudioGapSummarystruct and per-event startup/total accounting toAudioGapTracker; passesframe_counttorecord_overlap; stores summary inOnceLockonOutputPipelineand surfaces it inFinishedOutputPipeline.TimelineConfigurationandClipConfigurationfrom encoded frame_count/fps at recording stop; propagatesgap_summaryintoAudioMeta; persists populated project config so un-edited recordings have a valid timeline for export/playback.usize::try_fromto prevent overflow on large offsets; fixes stereo base-index calculation; adds regression test for negative offset (leading-silence) behaviour.start_renderer_layers_creationinto start +finish_renderer_layers_creationso GPU layer warmup overlaps decoder/segment creation; removes now-unusedtotal_framesfield.normalize_pending_timestampsto correct stale startup burst timestamps before they are enqueued, preventing phantom overlap events that would otherwise inflate the gap summary.minimum_fallback_framefrom pending requests and skips decode work for frames below that bound, reducing unnecessary decode overhead during seek.Comments Outside Diff (1)
crates/editor/src/editor_instance.rs, line 176-180 (link)When both
Video::newandget_video_duration_fallbackfail (e.g., partially written or corrupt display file), the synthesizedTimelineSegmentusesend: 5.0. A 30-second recording whose display file is unreadable would export with a 5-second timeline, silently truncating content. Since this fallback is exercised for real on first open (before recovery or remux), the project config is then written with the wrongendvalue and persisted. ReturningNonefrom the filter in theMultipleSegmentsarm (same as the explicitduration <= 0.0guard just below) and warning the user would be safer than silently synthesising an arbitrary duration.Prompt To Fix With AI
Prompt To Fix All With AI
Reviews (3): Last reviewed commit: "Add test for negative timing offset cons..." | Re-trigger Greptile