Pull requests: SemiAnalysisAI/InferenceX
DSR1-FP4 MI355x SGLang: Add EP configurations
#1811
opened Jun 17, 2026 by
ppalanga
Collaborator
Add Qwen3.5-FP8 GB200 SGLang disaggregated benchmark
full-sweep-enabled
#1810
opened Jun 16, 2026 by
RohitNagraj
Collaborator
[AMD] [MI300X] minimaxm3-fp8-mi300x-vllm: enable AITER kernels for MXFP8 on MI300X
full-sweep-enabled
#1808
opened Jun 16, 2026 by
JohnQinAMD
Collaborator
Fix for https://github.com/sgl-project/sglang/issues/22072
#1806
opened Jun 16, 2026 by
davzhuAMD
[NV]Add GLM-5 NVFP4 GB200 disagg non-mtp TRT-LLM benchmarks via Dynamo
full-sweep-enabled
#1803
opened Jun 16, 2026 by
xinli-sw
Collaborator
[NV]Add GLM-5 NVFP4 GB200 disagg-mtp TRT-LLM benchmarks via Dynamo
full-sweep-enabled
#1800
opened Jun 16, 2026 by
xinli-sw
Collaborator
[NV]Add GLM-5 NVFP4 GB300 disagg-mtp TRT-LLM benchmarks via Dynamo
full-sweep-enabled
#1799
opened Jun 16, 2026 by
xinli-sw
Collaborator
[NV]Add GLM-5 NVFP4 GB300 disagg-non-mtp TRT-LLM benchmarks via Dynamo
full-sweep-enabled
#1798
opened Jun 16, 2026 by
xinli-sw
Collaborator
[NV]Update Kimi K2.5 NVFP4 GB200 disaggregated TRT-LLM benchmarks via Dynamo
full-sweep-enabled
#1797
opened Jun 16, 2026 by
xinli-sw
Collaborator
[NV]Add Kimi K2.5 NVFP4 GB300 disaggregated TRT-LLM benchmarks via Dynamo
full-sweep-enabled
#1796
opened Jun 16, 2026 by
xinli-sw
Collaborator
chore(runners): add TensorWave MI300X docker runners (mi300x-tw)
#1793
opened Jun 16, 2026 by
cquil11
Collaborator
[NV]dsr1-fp4-b200-sglang: add DPA PDL lane
full-sweep-enabled
#1792
opened Jun 15, 2026 by
hshrivastava-droid
Collaborator
[DO NOT MERGE] Run-only: gb200 dsr1 measured power+temp (canonical NVIDIA)
sweep-enabled
#1791
opened Jun 15, 2026 by
arygupt
Collaborator
[NV] Add MiniMax M3 B300 Dynamo vLLM recipes
full-sweep-enabled
#1787
opened Jun 15, 2026 by
jasonlizhengjian
Collaborator
perf(vllm): compact MiniMax M3 EP decode routes on MI300X
#1782
opened Jun 15, 2026 by
Oseltamivir
Collaborator
[WIP][NV] add glm5-fp4-gb200-dynamo-sglang
full-sweep-enabled
#1780
opened Jun 15, 2026 by
hshrivastava-droid
Collaborator
[codex] perf: fuse MiniMax M3 allreduce and Gemma RMSNorm on MI300X
full-sweep-enabled
#1778
opened Jun 15, 2026 by
Oseltamivir
Collaborator
[AMD] refactor: engine-neutral aiperf plotter + fill sglang panels
#1774
opened Jun 15, 2026 by
AMD-yanfeiwang
2 of 3 tasks
[Klaud Cold][Experimental][DNM] minimaxm3-fp8-mi355x-vllm-disagg: day-zero MoRI-IO disagg smoke test (1P TP8 + 1D TP8, conc 1)
non-canary-full-sweep-enabled
Run the full sweep without the canary gate (full search space, no trim)
#1762
opened Jun 14, 2026 by
functionstackx
Collaborator
[Experimental][DNM till upstream PR merges][AMD] perf: hybrid MXFP8 MoE for MiniMax M3 on MI300X
full-sweep-enabled
#1753
opened Jun 14, 2026 by
Oseltamivir
Collaborator
Minimax m3 gb200 agg lowconc
full-sweep-enabled
#1752
opened Jun 14, 2026 by
Oseltamivir
Collaborator
MiniMax-M3 MXFP8 full sweep config for GB300
full-sweep-enabled
#1735
opened Jun 13, 2026 by
Oseltamivir
Collaborator
2 of 5 tasks
[blocked by vllm#45879] MiniMax-M3 MXFP8 full sweep config for GB200
full-sweep-enabled
#1734
opened Jun 13, 2026 by
Oseltamivir
Collaborator
1 of 2 tasks