Skip to content

Conversation

@fems14
Copy link
Contributor

@fems14 fems14 commented Oct 23, 2025

What this PR does / why we need it?

On Arm systems, os.sched_yield() does not take effect, causing the GIL (Global Interpreter Lock) to remain unrelinquished and resulting in CPU bound issues. This PR applies a patch to sched_yield in vLLM, making the process execute time.sleep(0) instead to release the GIL.

Does this PR introduce any user-facing change?

How was this patch tested?

Signed-off-by: fems14 <[email protected]>
Signed-off-by: fems14 <[email protected]>
@github-actions
Copy link

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:‌‌

  • A PR should do only one thing, smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests ‌to ensure it works and is not broken by other future PRs.
  • Write the commit message by fulfilling the PR description to help reviewer and future developers understand.

If CI fails, you can run linting and testing checks locally according Contributing and Testing.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a patch to disable sched_yield in vllm by monkey-patching the USE_SCHED_YIELD flag in vllm.distributed.utils. While this may be necessary for performance on Ascend hardware, the current implementation is fragile. It directly assigns a value to an attribute of a dependency, which will cause the application to crash with an AttributeError if that attribute is ever renamed or removed in a future version of vllm. My review includes a critical suggestion to make this patch more robust by checking for the attribute's existence before modification, thus preventing potential crashes and improving maintainability.

Comment on lines 1 to 4
import vllm.distributed.utils


vllm.distributed.utils.USE_SCHED_YIELD = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

This monkey patch is fragile and can cause the application to crash on startup if the USE_SCHED_YIELD attribute is removed or renamed in a future version of the vllm library. To make the patch more robust, you should defensively handle the attribute assignment. This prevents an AttributeError and ensures the application can still run, while also providing a warning that the patch was not applied.

Suggested change
import vllm.distributed.utils
vllm.distributed.utils.USE_SCHED_YIELD = False
import logging
import vllm.distributed.utils
try:
vllm.distributed.utils.USE_SCHED_YIELD = False
except AttributeError:
logging.warning(
"Could not apply patch for sched_yield: 'vllm.distributed.utils.USE_SCHED_YIELD' not found. "
"This may be due to a vLLM version update.")

@fems14 fems14 changed the title patch sched_yield 【main】patch sched_yield Oct 23, 2025
Signed-off-by: fems14 <[email protected]>
@jianzs
Copy link
Collaborator

jianzs commented Oct 23, 2025

@fems14 Explain the reasoning behind your code changes in both comments and commit messages.

@wangxiyuan wangxiyuan added ready read for review ready-for-test start test by label for PR labels Oct 23, 2025
@@ -0,0 +1,3 @@
import vllm.distributed.utils

vllm.distributed.utils.USE_SCHED_YIELD = False
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add os.arch check? I assume this patch should only work on aarch64?

Signed-off-by: fems14 <[email protected]>
Signed-off-by: fems14 <[email protected]>
Signed-off-by: fems14 <[email protected]>
Signed-off-by: fems14 <[email protected]>
@wangxiyuan wangxiyuan merged commit 2bcadcb into vllm-project:main Oct 23, 2025
25 checks passed
wangxiyuan pushed a commit to wangxiyuan/vllm-ascend that referenced this pull request Oct 23, 2025
### What this PR does / why we need it?
On Arm systems, os.sched_yield() does not take effect, causing the GIL
(Global Interpreter Lock) to remain unrelinquished and resulting in CPU
bound issues. This PR applies a patch to sched_yield in vLLM, making the
process execute time.sleep(0) instead to release the GIL.
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?


- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: fems14 <[email protected]>
wangxiyuan added a commit that referenced this pull request Oct 23, 2025
### What this PR does / why we need it?
On Arm systems, os.sched_yield() does not take effect, causing the GIL
(Global Interpreter Lock) to remain unrelinquished and resulting in CPU
bound issues. This PR applies a patch to sched_yield in vLLM, making the
process execute time.sleep(0) instead to release the GIL. ### Does this
PR introduce _any_ user-facing change?

Signed-off-by: fems14 <[email protected]>
Co-authored-by: fems14 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready read for review ready-for-test start test by label for PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants