chore: add qwen perf yaml by tianmu-li · Pull Request #369 · mlcommons/endpoints

tianmu-li · 2026-06-22T18:18:49Z

What does this PR do?

Add a performance-run YAML file for Qwen3.6-35B-A3B agentic inference.

Type of change

Bug fix
New feature
Documentation update
Refactor/cleanup

Related issues

Testing

Tests added/updated
All tests pass locally
Manual testing completed

Checklist

Code follows project style
Pre-commit hooks pass
Documentation updated (if needed)

Signed-off-by: Li, Tianmu <tianmu.li@intel.com>

github-actions · 2026-06-22T18:19:03Z

MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅

gemini-code-assist

Code Review

This pull request introduces a new benchmark configuration file, qwen_agentic_benchmark.yaml, for Qwen agentic inference. The review feedback highlights missing configuration parameters required to comply with benchmark invariants, specifically recommending the addition of num_trajectories_to_issue and stop_issuing_on_first_user_complete under agentic_inference, as well as the settings.client configuration block.

gemini-code-assist · 2026-06-22T18:19:45Z

+    agentic_inference:
+      enable_salt: true # do not change.
+      inject_tool_delay: true # do not change.


To ensure compliance with the benchmark invariants and to make the configuration complete, please explicitly specify num_trajectories_to_issue and stop_issuing_on_first_user_complete under agentic_inference.

    agentic_inference:
      enable_salt: true # do not change.
      inject_tool_delay: true # do not change.
      num_trajectories_to_issue: 990 # Should be integer multiple of dataset trajectory count.
      stop_issuing_on_first_user_complete: false # required benchmark default.
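
A quick pre-submission check for the suggestion above could look like the following. This is only a sketch: the function name `check_agentic_inference` and the dataset trajectory count of 330 are hypothetical and not part of this repository, and the rules encoded here are just the ones stated in the review comment.

```python
def check_agentic_inference(cfg: dict, dataset_trajectory_count: int) -> list:
    """Return a list of problems found in an agentic_inference mapping.

    Hypothetical helper: it encodes only the rules quoted in the review
    comment, not the project's actual validation logic.
    """
    problems = []

    # Both flags are marked "do not change" in the PR's YAML.
    for flag in ("enable_salt", "inject_tool_delay"):
        if cfg.get(flag) is not True:
            problems.append(f"{flag} must be true")

    # The review asks for an integer multiple of the dataset trajectory count.
    n = cfg.get("num_trajectories_to_issue")
    if not isinstance(n, int) or isinstance(n, bool) or n <= 0:
        problems.append("num_trajectories_to_issue must be a positive integer")
    elif n % dataset_trajectory_count != 0:
        problems.append(
            "num_trajectories_to_issue must be a multiple of "
            f"{dataset_trajectory_count}"
        )

    # The review describes false as the required benchmark default.
    if cfg.get("stop_issuing_on_first_user_complete") is not False:
        problems.append("stop_issuing_on_first_user_complete must be false")

    return problems


if __name__ == "__main__":
    suggested = {
        "enable_salt": True,
        "inject_tool_delay": True,
        "num_trajectories_to_issue": 990,
        "stop_issuing_on_first_user_complete": False,
    }
    # 330 is a made-up dataset size used only to exercise the check.
    print(check_agentic_inference(suggested, dataset_trajectory_count=330))
```

The mapping passed in is the already-parsed `agentic_inference` block, so the sketch does not depend on any particular YAML loader.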

gemini-code-assist · 2026-06-22T18:19:45Z

+  load_pattern:
+    type: agentic_inference
+    target_concurrency: 8 # Submission-specific concurrency.


The settings.client configuration is missing. For official agentic benchmark runs, the client settings warmup_connections: 0 and max_idle_time: 0.5 are required invariants to ensure consistent and comparable performance results.

  load_pattern:
    type: agentic_inference
    target_concurrency: 8 # Submission-specific concurrency.
  client:
    warmup_connections: 0
    max_idle_time: 0.5
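
The client invariants named in this comment can be checked the same way. The sketch below assumes the `settings` block has already been parsed into a dict; `REQUIRED_CLIENT` and `check_client_settings` are hypothetical names, and the two required values are taken from the review comment rather than from the project's own validator.

```python
# Values the review comment describes as required for official agentic runs.
REQUIRED_CLIENT = {"warmup_connections": 0, "max_idle_time": 0.5}


def check_client_settings(settings: dict) -> list:
    """Return a list of problems with settings.client (hypothetical helper)."""
    client = settings.get("client")
    if client is None:
        return ["settings.client is missing"]

    problems = []
    for key, expected in REQUIRED_CLIENT.items():
        if key not in client:
            problems.append(f"settings.client.{key} is missing")
        elif client[key] != expected:
            problems.append(
                f"settings.client.{key} is {client[key]!r}, expected {expected!r}"
            )
    return problems


if __name__ == "__main__":
    # Mirrors the PR as submitted: load_pattern present, client absent.
    as_submitted = {
        "load_pattern": {"type": "agentic_inference", "target_concurrency": 8}
    }
    print(check_client_settings(as_submitted))
```

Run against the configuration as submitted, the check reports the missing `client` block, which is the gap this comment points out.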

Copilot

Pull request overview

Adds a runnable benchmark configuration YAML under examples/10_Agentic_Inference/ for running an online performance benchmark of Qwen/Qwen3.6-35B-A3B using the agentic inference load pattern.

Changes:

Add qwen_agentic_benchmark.yaml example config for agentic inference performance runs.

Add qwen perf yaml

185bc91

Signed-off-by: Li, Tianmu <tianmu.li@intel.com>

tianmu-li requested review from a team and Copilot June 22, 2026 18:18

Copilot started reviewing on behalf of tianmu-li June 22, 2026 18:19

Minor typo

b5ea6c2

Signed-off-by: Li, Tianmu <tianmu.li@intel.com>

gemini-code-assist Bot reviewed Jun 22, 2026

Copilot AI reviewed Jun 22, 2026

tianmu-li wants to merge 2 commits into mlcommons:main from tianmu-li:chore/add_qwen_perf_yaml

2 participants
