enable CPU benchmark for VLLM Perf Dashboard. #39
Conversation
Let me add you to the list of contributors so you do not need to wait for CI approval
Co-authored-by: Huy Do <[email protected]>
Force-pushed from 0671ad5 to 41fa9ce
Force-pushed from 41fa9ce to c253948
I have the PR to publish the docker image up at vllm-project/ci-infra#118, will ask the team for a review
```
2: [
    "linux.aws.h100.4",
    "linux.rocm.gpu.mi300.2",
    "intel-cpu-emr",
```
This line means that the intel-cpu-emr runner will only be used when tensor_parallel_size is 2. Is this the expected behavior? From your JSON files, it looks like this should be under 1 and 4?
The EMR machine from Chendi only has 2 NUMA nodes, so we put it under the TP 2 case to run only the TP 1 and TP 2 test cases. However, Chendi does plan to bring up a new EMR system, and we will try to have 4 NUMA nodes in that system.
Therefore, I moved it to the TP 4 case for now.
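For readers following along, here is a minimal sketch of the runner mapping being discussed, assuming a plain dict keyed by tensor_parallel_size (the actual structure in the workflow may differ, and only the runner names quoted in this review are real):

```python
# Hedged sketch of the tensor_parallel_size -> runner mapping discussed above.
# A runner listed under key N is assumed to handle TP cases up to N, which is
# why the 2-NUMA-node EMR box originally lived under 2 (TP 1 and TP 2 only).
RUNNER_MAPPING = {
    2: [
        "linux.aws.h100.4",
        "linux.rocm.gpu.mi300.2",
    ],
    4: [
        "intel-cpu-emr",  # moved from the TP 2 list; see the discussion above
    ],
}
```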
Force-pushed from e3a9885 to c253948
Oh, you should have it now; please let me know if it works
.github/workflows/vllm-benchmark.yml (outdated)
```diff
  --to-benchmark-configs-dir vllm-benchmarks/vllm/.buildkite/nightly-benchmarks/tests \
- --models "${MODELS}"
+ --models "${MODELS}" \
+ --device "${DEVICE_NAME// /_}"
```
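As a side note, "${DEVICE_NAME// /_}" is standard bash parameter expansion that replaces every space in DEVICE_NAME with an underscore; a rough Python equivalent, with a hypothetical input value:

```python
# Rough Python equivalent of bash's "${DEVICE_NAME// /_}" expansion,
# which replaces all spaces with underscores.
device_name = "intel cpu emr"  # hypothetical raw value
normalized = device_name.replace(" ", "_")
print(normalized)  # intel_cpu_emr
```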
There is a bug here where DEVICE_NAME is set to cuda or rocm for non-CPU cases. In these cases, the logic in .github/scripts/setup_vllm_benchmark.py will fail to find the JSON benchmark suite, because those files don't have a _cuda or _rocm suffix; only _cpu has one. DEVICE_NAME should just be empty in these cases.
You can see that https://github.com/pytorch/pytorch-integration-testing/actions/runs/16163751659/job/45620654542#step:13:71 found no JSON file.
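A minimal sketch of the failure mode, assuming the script matches benchmark configs by an optional device suffix (find_benchmark_configs and the file names are illustrative, not the actual setup_vllm_benchmark.py code):

```python
import glob
import os

def find_benchmark_configs(configs_dir: str, device_name: str) -> list[str]:
    """Illustrative only: match benchmark config files by an optional device suffix."""
    # Only CPU suites carry a suffix (e.g. serving-tests_cpu.json), so a
    # non-empty device_name of "cuda" or "rocm" yields a pattern like
    # "*_cuda.json" that matches nothing; for those devices, device_name
    # should be left empty so the bare file names match.
    suffix = f"_{device_name}" if device_name else ""
    return glob.glob(os.path.join(configs_dir, f"*{suffix}.json"))
```

With device_name="cuda" against files named like serving-tests.json, this returns an empty list, consistent with the empty run linked above.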
@huydhn You are right. Made a quick change; hopefully it fixes the issue.
Moved the work into #44. Closed this as a duplicate.
This PR depends on a vLLM PR: vllm-project/vllm#18444
Since I don't fully understand how the workflow works, this is just an early draft to start the work.