issue/424 -Clean up unused code. by pengcheng888 · Pull Request #425 · InfiniTensor/InfiniLM

pengcheng888 · 2026-06-10T11:36:41Z

Summary

test_infer.py produces normal output for llama, 9g, qwen3, and glm4 in both single-GPU and dual-GPU runs.
`
models=(
    "/data-aisoft/mechdancer/models/TinyLlama-1.1B-Chat-v1.0"
    "/data-aisoft/mechdancer/models/9g_8b_thinking"
    "/data-aisoft/mechdancer/models/Qwen3-0.6B"
    "/data-aisoft/mechdancer/models/GLM-4-9B-0414"
)

for model in "${models[@]}"; do
    echo "========================================"
    echo "Running model: $model"
    echo "========================================"
    # Single-GPU run (tp=1) with paged attention.
    # The prompt means "Introduce yourself".
    python examples/test_infer.py \
        --device nvidia \
        --prompt "介绍下你自己" \
        --model="$model" \
        --tp=1 \
        --enable-paged-attn \
        --attn paged-attn \
        --max-new-tokens 256
    echo ""   # Print a blank line to separate the results of different models.

    # Dual-GPU run (tp=2) with flash attention.
    python examples/test_infer.py \
        --device nvidia \
        --prompt "介绍下你自己" \
        --model="$model" \
        --tp=2 \
        --enable-paged-attn \
        --attn flash-attn \
        --max-new-tokens 256
    echo ""   # Print a blank line to separate the results of different models.
done
`

Hygon platform
Qwen3-0.6B tp=2 static test_infer.py
python examples/test_infer.py --device nvidia --model=$MODEL --batch-size 2 --max-new-tokens 64

iluvatar platform

9g_8b_thinking tp=2 static test_benchmark.py

python test/bench/test_benchmark.py --device iluvatar --model=$MODEL --bench ceval --subject middle_school_mathematics --num-samples 100 --backend cpp --tp 2 --cache-dir "/data-aisoft/zenghua/scripts/lm_eval/cache/datasets"

maca platform

9G7B_MHA tp=1/2 bench.py

moore platform

GLM-4-9B-0414 tp=1/2 paged test_bench.py

Regular service

nvidia platform

tp=1/2 MiniCPM-V-2_6 service

python python/infinilm/server/inference_server.py \
    --device metax \
    --model=$MODEL \
    --temperature 1.0 \
    --top-p 0.8 \
    --top-k 1 \
    --port 9100 \
    --tp 2 \
    --block-size 256 \
    --max-new-tokens 1024 \
    --num-blocks 1000 \
    --max-batch-size 32 \
    --enable-graph \
    --enable-paged-attn \
    --attn flash-attn \
    --log-level INFO

Motivation

Closes #

Type of Change

feat — new feature / new model
fix — bug fix
perf — performance improvement (no behavioral change)
refactor — code restructuring without behavior change
test — adding or fixing tests only
docs — documentation only
build / ci — build system or CI configuration
chore — tooling, formatting, or other non-code changes
Breaking change

Test Results of Involved Models on Supported Platforms (Please attach screenshots)

Benchmark / Performance Impact

Notes for Reviewers

Checklist

Every contributor must verify every item below before requesting
review. Tick each box only after the check has actually been performed —
do not tick speculatively. If an item truly does not apply, replace the
checkbox with N/A and briefly explain why in an inline comment.

Title, Branch, and Commits

PR title follows Conventional Commits (e.g. feat(nvidia): …, fix(cuda/gemm): …).
Branch name follows <type>/xxx-yyyy-zzzz where <type> matches the PR title's Conventional Commits type and words are joined with hyphens (see CONTRIBUTING.md §Branches).
Each commit message follows Conventional Commits.
A small PR is a single squashable commit; in a large PR, every commit is meaningful, well-formed, and independently reviewable (see CONTRIBUTING.md §Pull Requests).
No stray merge commits from main — the branch is rebased cleanly on top of the current main.
No fixup! / squash! / wip commits remain.
Existing PR/branch/commit that followed the legacy issue format.
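
As a rough illustration of the title and branch items above, the sketch below checks a PR title and a branch name with regular expressions. It is not the project's official check: the helper name `check_title_and_branch`, the two patterns, and the restriction to lowercase alphanumeric words in branch names are assumptions made here. The accepted types are the ones listed under "Type of Change".

```python
import re

# Types taken from the "Type of Change" list in this template.
TYPES = ("feat", "fix", "perf", "refactor", "test", "docs", "build", "ci", "chore")
_TYPE_ALT = "|".join(TYPES)

# Assumed title shape: type, optional "(scope)", optional "!", then ": description".
TITLE_RE = re.compile(
    rf"^(?P<type>{_TYPE_ALT})(\((?P<scope>[^()]+)\))?(?P<breaking>!)?: \S.*$"
)
# Assumed branch shape: <type>/word-word-word, lowercase words joined by hyphens.
BRANCH_RE = re.compile(rf"^(?P<type>{_TYPE_ALT})/[a-z0-9]+(-[a-z0-9]+)*$")


def check_title_and_branch(title: str, branch: str) -> list[str]:
    """Return a list of problems; an empty list means both names look fine."""
    problems = []
    title_match = TITLE_RE.match(title)
    branch_match = BRANCH_RE.match(branch)
    if title_match is None:
        problems.append("Title does not look like a Conventional Commits header.")
    if branch_match is None:
        problems.append("Branch does not look like `<type>/xxx-yyyy-zzzz`.")
    # The type comparison only makes sense when both names parsed.
    if title_match and branch_match and title_match["type"] != branch_match["type"]:
        problems.append("Branch type does not match the title type.")
    return problems
```

For example, `check_title_and_branch("fix(cuda/gemm): handle empty input", "fix/gemm-empty-input")` returns an empty list, while a legacy-format pair such as `"issue/424 -Clean up unused code."` and `"issue/424"` is reported on both counts, which is the case the last item in this list covers.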

Scope and Design

Changes are minimal — nothing unrelated to the stated motivation was added (CONTRIBUTING.md §Code/General).
No dead code, commented-out blocks, debug prints, printf/std::cout/print(...) left behind, or TODO without an owner and issue link.
No unrelated formatting churn that would obscure the diff.
Public API changes (if any) are intentional, documented, and reflected in affected callers/tests.

General Code Hygiene (applies to all languages)

The code is self-explanatory; comments were added only where the why is non-obvious (CONTRIBUTING.md §Code/General).
Every modified or added file ends with a single trailing newline (CONTRIBUTING.md §Code/General).
No trailing whitespace, tab/space mixing, or stray BOMs.
Identifiers in comments and error messages are wrapped in backticks (e.g. the `seqlens_k` tensor) (CONTRIBUTING.md §Code/General).
All comments and error messages are in English (CONTRIBUTING.md §Code/General).
Comments and error messages are complete sentences — capitalized first letter, terminal punctuation — unless the language/framework convention says otherwise (CONTRIBUTING.md §Code/General; §Python).
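
The whitespace items above can be checked mechanically. The following is a minimal sketch, assuming the file content is available as bytes; the helper name `check_file_hygiene` and the message wording are hypothetical and not part of the repository, and scripts/format.py may already cover some of these cases.

```python
import codecs


def check_file_hygiene(data: bytes) -> list[str]:
    """Return a list of whitespace problems found in one file's content."""
    problems = []
    if data.startswith(codecs.BOM_UTF8):
        problems.append("File starts with a UTF-8 BOM.")
    if data and not data.endswith(b"\n"):
        problems.append("File does not end with a newline.")
    elif data.endswith(b"\n\n"):
        problems.append("File ends with more than one trailing newline.")
    for number, line in enumerate(data.split(b"\n"), start=1):
        line = line.rstrip(b"\r")  # Ignore the CR of CRLF line endings.
        if line != line.rstrip(b" \t"):
            problems.append(f"Line {number} has trailing whitespace.")
        # Leading run of spaces and tabs that forms the indentation.
        indent = line[: len(line) - len(line.lstrip(b" \t"))]
        if b" " in indent and b"\t" in indent:
            problems.append(f"Line {number} mixes tabs and spaces in its indentation.")
    return problems
```

A caller could read each changed file with `pathlib.Path(path).read_bytes()` and pass the result in; an empty list means none of these problems were found.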

C++ Specific (if C++ files changed)

Code follows the Google C++ Style Guide strictly.
Error and warning message wording follows the LLVM Coding Standards (CONTRIBUTING.md §C++).
Constructor initializer list order matches member declaration order (CONTRIBUTING.md §C++).
No raw new/delete; RAII / smart pointers / existing allocators are used.
Changed files are formatted by scripts/format.py.
No changes or references to csrc/models/llama_legacy/.

Python Specific (if Python files changed)

Code is PEP 8 compliant.
Comments are complete English sentences, starting with a capital letter and ending with punctuation; Markdown backticks are used for code references (CONTRIBUTING.md §Python).
Docstrings (if any) follow PEP 257 (CONTRIBUTING.md §Python).
Changed files are formatted by scripts/format.py.
No changes or references to python/infinilm/auto_config.py.

Testing

For any platform that could not be tested, an explicit reason is given in the table and a reviewer with access has been tagged.
Passed single request test (examples/test_infer.py), or specify the reason for skipping.
Passed offline performance test (examples/bench.py), or specify the reason for skipping.
Passed sanity test (test/bench/test_benchmark.py), or specify the reason for skipping.
Passed service test (python/infinilm/server/inference_server.py + scripts/test_perf.py), or specify the reason for skipping.

Build, CI, and Tooling

The project builds cleanly from a fresh directory on at least one affected platform.

Documentation

README.md, CONTRIBUTING.md, or inline docs updated when behavior, build flags, or developer workflow changed.
Any user-visible breaking change is called out explicitly under "Motivation" and in the commit/PR title with a ! or BREAKING CHANGE: footer.

Security and Safety

No secrets, access tokens, internal URLs, customer data, or personal hardware identifiers have been committed.
Third-party code is license-compatible and attributed.
No unsafe pointer arithmetic, uninitialized reads, or missing bounds checks were introduced.

pengcheng888 requested a review from a team June 10, 2026 11:36

pengcheng888 linked an issue Jun 10, 2026 that may be closed by this pull request

[DEV] Clean up unused code left over from previous versions #424

Open

pengcheng888 force-pushed the issue/424 branch from 0fa9dbc to f51bf29 June 11, 2026 02:27

pengcheng888 requested review from PanZezhong1725, qinyiqun and wooway777 June 11, 2026 03:26

issue/424 -Clean up unused code.

c7c2a44

pengcheng888 force-pushed the issue/424 branch from f51bf29 to c7c2a44 June 11, 2026 07:44

qinyiqun approved these changes Jun 11, 2026

wooway777 approved these changes Jun 11, 2026

pengcheng888 wants to merge 1 commit into main from issue/424

3 participants
