[WIP][NV] add glm5-fp4-gb200-dynamo-sglang by hshrivastava-droid · Pull Request #1780 · SemiAnalysisAI/InferenceX

hshrivastava-droid · 2026-06-15T18:58:14Z

Note

Low Risk
Benchmark and CI launcher/config only; no application runtime or auth changes. Main review surface is recipe/topology correctness and cluster resource assumptions.

Overview
Adds GLM-5 NVFP4 disaggregated Dynamo + SGLang benchmark coverage on GB200, mirroring the existing GB300 glm5 entry pattern.

nvidia-master.yaml introduces glm5-fp4-gb200-dynamo-sglang with fixed-seq-len scenarios for 8k1k and 1k1k: wide-EP decode (TP=32) max-throughput topologies (4p–10p prefill variants) and per-node TP=4 low-latency decode workers, each wired to a concrete CONFIG_FILE under recipes/sglang/glm5/gb200-fp4/.
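
As a rough illustration of the topology arithmetic above, the sketch below computes how many nodes a single worker spans at a given tensor-parallel size. It assumes 4 GPUs per GB200 node, which is consistent with the "per-node TP=4" wording but is not stated explicitly in this PR. The function name and constant are hypothetical and do not come from the repository.

```python
from math import ceil

# Assumption: one GB200 node exposes 4 GPUs (not stated explicitly in the PR).
GPUS_PER_NODE = 4


def nodes_for_worker(tp_size: int, gpus_per_node: int = GPUS_PER_NODE) -> int:
    """Return the number of nodes one worker spans at the given TP size."""
    if tp_size <= 0 or gpus_per_node <= 0:
        raise ValueError("sizes must be positive")
    return ceil(tp_size / gpus_per_node)


if __name__ == "__main__":
    # The two decode shapes named in the description.
    for label, tp in (("wide-EP decode", 32), ("low-latency decode", 4)):
        print(f"{label}: TP={tp} -> {nodes_for_worker(tp)} node(s)")
```

Under that assumption, the TP=32 decode worker spans 8 nodes and the TP=4 worker fits on one, which is the main cluster resource assumption a reviewer would want to check against each recipe's Slurm settings.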

New srt-slurm recipe YAMLs (ported from upstream gb200-fp4/glm5.yaml, one file per topology) live under benchmarks/multi_node/srt-slurm-recipes/sglang/glm5/gb200-fp4/ with Slurm resources, Dynamo frontend, nixl disagg, and tuned sglang_config / sa-bench concurrency per recipe.

runners/launch_gb200-nv.sh maps glm5 + fp4 to the GLM-5-NVFP4 model directory on Lustre and overlays the glm5 recipe tree onto NVIDIA/srt-slurm (sa-submission-q2-2026). perf-changelog.yaml documents the new config key.
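
The model-path mapping can be pictured as a lookup keyed on model and precision. The real launcher is a shell script whose contents are not shown here, so the following Python sketch is only an illustration: `MODEL_DIRS`, `resolve_model_path`, and the `lustre_root` argument are hypothetical, and only the directory name `GLM-5-NVFP4` is taken from the description.

```python
from pathlib import PurePosixPath

# Hypothetical table; only the directory name GLM-5-NVFP4 comes from the PR text.
MODEL_DIRS = {
    ("glm5", "fp4"): "GLM-5-NVFP4",
}


def resolve_model_path(model: str, precision: str, lustre_root: str) -> PurePosixPath:
    """Map (model, precision) to a model directory under a Lustre root."""
    key = (model.lower(), precision.lower())
    try:
        return PurePosixPath(lustre_root) / MODEL_DIRS[key]
    except KeyError:
        # Fail loudly so an unmapped combination never launches with a wrong path.
        raise ValueError(f"no model directory registered for {key}") from None
```

For example, `resolve_model_path("glm5", "fp4", "/lustre/example")` yields `/lustre/example/GLM-5-NVFP4`, where `/lustre/example` is a placeholder root and not the cluster's actual mount point.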

Reviewed by Cursor Bugbot for commit ba74df2. Bugbot is set up for automated code reviews on this repo.

github-actions · 2026-06-15T21:12:06Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=27575447726
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=27575447726

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit aa5f207.

github-actions · 2026-06-16T22:42:08Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=27652197300
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=27652197300

github-actions · 2026-06-16T22:50:28Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=27652967695
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=27652967695

github-actions · 2026-06-17T02:02:09Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=27653968669
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=27653968669

add glm5 gb200

5ac7d84

hshrivastava-droid requested a review from a team June 15, 2026 18:58

hshrivastava-droid requested review from jgangani and kedarpotdar-nv as code owners June 15, 2026 18:58

hshrivastava-droid added the full-sweep-enabled label Jun 15, 2026

github-project-automation Bot added this to InferenceMAX Board Jun 15, 2026

cursor Bot reviewed Jun 15, 2026

Comment thread .github/configs/nvidia-master.yaml

Merge branch 'main' into nv/gl5-gb200-fp4-v2

1805fb2

update nv cluster block

aa5f207

cursor Bot reviewed Jun 16, 2026

Comment thread runners/launch_gb200-nv.sh

Merge branch 'main' into nv/gl5-gb200-fp4-v2

ead0e94

hshrivastava-droid added full-sweep-enabled and removed full-sweep-enabled labels Jun 16, 2026

hshrivastava-droid added 2 commits June 16, 2026 16:04

add model path

f6d35f0

Merge remote-tracking branch 'origin/nv/gl5-gb200-fp4-v2' into nv/gl5-gb200-fp4-v2

ba74df2

hshrivastava-droid wants to merge 6 commits into main from nv/gl5-gb200-fp4-v2
