🌱 E2E Summary Output Fix#2751
Conversation
✅ Deploy Preview for olmv1 ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
There was a problem hiding this comment.
Pull request overview
This PR updates the E2E summary generator to tolerate Prometheus queries returning multiple time series (e.g., due to operator pod restarts during serial E2E runs), and reorganizes the summary logic/templates under test/internal/summary.
Changes:
- Relaxed
PerformanceQueryto accept multiple result streams instead of requiring exactly one. - Added new markdown templates for the overall summary, alerts, and mermaid charts.
- Switched E2E suite wiring to call the new
test/internal/summarypackage and removed the old test util helper.
Reviewed changes
Copilot reviewed 4 out of 7 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| test/internal/summary/templates/summary.md.tmpl | Adds the main E2E summary markdown template and performance section layout. |
| test/internal/summary/templates/mermaid_chart.md.tmpl | Adds a mermaid xychart-beta template used by performance queries. |
| test/internal/summary/templates/alert.md.tmpl | Adds alert rendering template for firing/pending alerts. |
| test/internal/summary/summary.go | Implements summary generation, Prometheus querying, and template execution with updated behavior for multiple streams. |
| test/internal/summary/artifacts.go | Renames package to align with the new summary package structure. |
| test/e2e/features_test.go | Calls summary.PrintSummary from the new package at the end of successful E2E runs. |
| internal/shared/util/test/utils.go | Removes the old FindK8sClient helper (now unused). |
Comments suppressed due to low confidence (2)
test/internal/summary/summary.go:87
- When all samples are <= 0 (e.g., negative
deriv(...)values), initializingchart.YMaxtomath.SmallestNonzeroFloat64preventsYMaxfrom ever updating (since0 > SmallestNonzeroFloat64is false). This produces an incorrect y-axis range in the rendered chart.
test/internal/summary/summary.go:96 PerformanceQuerynow allows multiple Prometheus result streams (e.g., when a pod restarts), but it concatenates each stream's samples in whatever order Prometheus returns them. That order is label-sorted, not time-sorted, so the finallinedata can be out of chronological order and render a misleading chart. Sort the streams by their first sample timestamp before appending values.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
The serial e2e tests require pod restarts which causes the metrics to scrape two sets of results, which the summary generator did not allow. Now, the results from the new pod will be aggregated to the final set. Signed-off-by: Daniel Franz <dfranz@redhat.com>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2751 +/- ##
==========================================
- Coverage 67.10% 67.06% -0.04%
==========================================
Files 149 148 -1
Lines 11341 11330 -11
==========================================
- Hits 7610 7599 -11
+ Misses 3178 3177 -1
- Partials 553 554 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 7 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (2)
test/internal/summary/summary.go:96
- The current loop appends samples from every returned series in
matrixwithout preserving timestamp order. When Prometheus returns multiple series (e.g., old+new pod after a restart), concatenating series in label-sorted order can scramble the timeline and produce a misleading chart. Merge samples from all series in timestamp order (and error if no samples are returned) before computing min/max and chart data.
test/internal/summary/summary.go:159 executeTemplatebuilds the template path fromos.Getwd(), which makes summary generation sensitive to the caller's working directory. If the tests are invoked from a different cwd (or the process chdirs), template parsing will fail at runtime. Prefer resolving templates relative to the source (e.g.,//go:embed+template.ParseFS, or deriving the directory viaruntime.Caller) instead ofGetwd().
| if err := summary.PrintSummary(path); err != nil { | ||
| // Alert but do not fail the run if alerts are found | ||
| fmt.Printf("%v", err) | ||
| os.Exit(1) | ||
| } |
There was a problem hiding this comment.
We've not had the performance results for a long time, and currently they're firing alerts pretty reliably because of the recent move to concurrent test runs which cause the pods to use far more resources.
Until we've had a chance to re-tune the alert thresholds, I'd like for this to fail silently but at least give us the feedback we need to make those adjustments.
|
/lgtm |
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: rashmigottipati, tmshort The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
23b7e52
into
operator-framework:main
The serial e2e tests require pod restarts which causes the metrics to scrape two sets of results, which the summary generator did not allow. Now, the results from the new pod will be aggregated to the final set.
Description
Reviewer Checklist