enable CPU benchmark for VLLM Perf Dashboard. #39
Conversation
Let me add you to the list of contributors so you do not need to wait for CI approval
Co-authored-by: Huy Do <[email protected]>
Force-pushed from 0671ad5 to 41fa9ce
Force-pushed from 41fa9ce to c253948
I have the PR to publish the docker image up at vllm-project/ci-infra#118, will ask the team for a review
```
2: [
    "linux.aws.h100.4",
    "linux.rocm.gpu.mi300.2",
    "intel-cpu-emr",
```
This line means that the intel-cpu-emr runner will only be used when tensor_parallel_size is 2. Is this the expected behavior? From your JSON files, it looks like this should be under 1 and 4?
The EMR machine from Chendi only has 2 NUMA nodes, so we put it under the TP 2 case to run only the TP 1 and TP 2 test cases. However, Chendi does plan to bring up a new EMR system, and we will try to have 4 NUMA nodes in that system.
Therefore, I moved it to the TP 4 case for now.
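For readers following along, here is a minimal sketch of the runner mapping being discussed, assuming a plain dict keyed by tensor_parallel_size (the actual structure in the workflow may differ, and only the runner names quoted in this review are real):

```python
# Hedged sketch of the tensor_parallel_size -> runner mapping discussed above.
# A runner listed under key N is assumed to handle TP cases up to N, which is
# why the 2-NUMA-node EMR box originally lived under 2 (TP 1 and TP 2 only).
RUNNER_MAPPING = {
    2: [
        "linux.aws.h100.4",
        "linux.rocm.gpu.mi300.2",
    ],
    4: [
        "intel-cpu-emr",  # moved from the TP 2 list; see the discussion above
    ],
}
```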
Force-pushed from e3a9885 to c253948
Oh, you should have it now; please let me know if it works
.github/workflows/vllm-benchmark.yml (outdated)
```diff
  --to-benchmark-configs-dir vllm-benchmarks/vllm/.buildkite/nightly-benchmarks/tests \
- --models "${MODELS}"
+ --models "${MODELS}" \
+ --device "${DEVICE_NAME// /_}"
```
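As a side note, "${DEVICE_NAME// /_}" is standard bash parameter expansion that replaces every space in DEVICE_NAME with an underscore; a rough Python equivalent, with a hypothetical input value:

```python
# Rough Python equivalent of bash's "${DEVICE_NAME// /_}" expansion,
# which replaces all spaces with underscores.
device_name = "intel cpu emr"  # hypothetical raw value
normalized = device_name.replace(" ", "_")
print(normalized)  # intel_cpu_emr
```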
There is a bug here where DEVICE_NAME is set to cuda or rocm for non-CPU cases. In these cases, the logic in .github/scripts/setup_vllm_benchmark.py will fail to find the JSON benchmark suite, because those files don't have a _cuda or _rocm suffix; only _cpu has one. DEVICE_NAME should just be empty in these cases.
You can see that https://github.com/pytorch/pytorch-integration-testing/actions/runs/16163751659/job/45620654542#step:13:71 found no JSON file.
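A minimal sketch of the failure mode, assuming the script matches benchmark configs by an optional device suffix (find_benchmark_configs and the file names are illustrative, not the actual setup_vllm_benchmark.py code):

```python
import glob
import os

def find_benchmark_configs(configs_dir: str, device_name: str) -> list[str]:
    """Illustrative only: match benchmark config files by an optional device suffix."""
    # Only CPU suites carry a suffix (e.g. serving-tests_cpu.json), so a
    # non-empty device_name of "cuda" or "rocm" yields a pattern like
    # "*_cuda.json" that matches nothing; for those devices, device_name
    # should be left empty so the bare file names match.
    suffix = f"_{device_name}" if device_name else ""
    return glob.glob(os.path.join(configs_dir, f"*{suffix}.json"))
```

With device_name="cuda" against files named like serving-tests.json, this returns an empty list, consistent with the empty run linked above.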
@huydhn You are right. Made a quick change; hopefully it fixes the issue.
Moved the work into #44. Closed this as a duplicate.
This PR depends on a vLLM PR: vllm-project/vllm#18444
Since I don't fully understand how the workflow works, this is just an early draft to start the work.