- 
                Notifications
    You must be signed in to change notification settings 
- Fork 528
【main】patch sched_yield #3648
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
【main】patch sched_yield #3648
Conversation
Signed-off-by: fems14 <[email protected]>
Signed-off-by: fems14 <[email protected]>
| 👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge: 
 If CI fails, you can run linting and testing checks locally according Contributing and Testing. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a patch to disable sched_yield in vllm by monkey-patching the USE_SCHED_YIELD flag in vllm.distributed.utils. While this may be necessary for performance on Ascend hardware, the current implementation is fragile. It directly assigns a value to an attribute of a dependency, which will cause the application to crash with an AttributeError if that attribute is ever renamed or removed in a future version of vllm. My review includes a critical suggestion to make this patch more robust by checking for the attribute's existence before modification, thus preventing potential crashes and improving maintainability.
| import vllm.distributed.utils | ||
|  | ||
|  | ||
| vllm.distributed.utils.USE_SCHED_YIELD = False | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This monkey patch is fragile and can cause the application to crash on startup if the USE_SCHED_YIELD attribute is removed or renamed in a future version of the vllm library. To make the patch more robust, you should defensively handle the attribute assignment. This prevents an AttributeError and ensures the application can still run, while also providing a warning that the patch was not applied.
| import vllm.distributed.utils | |
| vllm.distributed.utils.USE_SCHED_YIELD = False | |
| import logging | |
| import vllm.distributed.utils | |
| try: | |
| vllm.distributed.utils.USE_SCHED_YIELD = False | |
| except AttributeError: | |
| logging.warning( | |
| "Could not apply patch for sched_yield: 'vllm.distributed.utils.USE_SCHED_YIELD' not found. " | |
| "This may be due to a vLLM version update.") | 
Signed-off-by: fems14 <[email protected]>
| @fems14 Explain the reasoning behind your code changes in both comments and commit messages. | 
| @@ -0,0 +1,3 @@ | |||
| import vllm.distributed.utils | |||
|  | |||
| vllm.distributed.utils.USE_SCHED_YIELD = False | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add os.arch check? I assume this patch should only work on aarch64?
Signed-off-by: fems14 <[email protected]>
Signed-off-by: fems14 <[email protected]>
Signed-off-by: fems14 <[email protected]>
Signed-off-by: fems14 <[email protected]>
### What this PR does / why we need it? On Arm systems, os.sched_yield() does not take effect, causing the GIL (Global Interpreter Lock) to remain unrelinquished and resulting in CPU bound issues. This PR applies a patch to sched_yield in vLLM, making the process execute time.sleep(0) instead to release the GIL. ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.11.0rc3 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0 --------- Signed-off-by: fems14 <[email protected]>
### What this PR does / why we need it? On Arm systems, os.sched_yield() does not take effect, causing the GIL (Global Interpreter Lock) to remain unrelinquished and resulting in CPU bound issues. This PR applies a patch to sched_yield in vLLM, making the process execute time.sleep(0) instead to release the GIL. ### Does this PR introduce _any_ user-facing change? Signed-off-by: fems14 <[email protected]> Co-authored-by: fems14 <[email protected]>
What this PR does / why we need it?
On Arm systems, os.sched_yield() does not take effect, causing the GIL (Global Interpreter Lock) to remain unrelinquished and resulting in CPU bound issues. This PR applies a patch to sched_yield in vLLM, making the process execute time.sleep(0) instead to release the GIL.
Does this PR introduce any user-facing change?
How was this patch tested?