Commit b321e38

wangxiyuan and fems14 authored
[cherry-pick]【main】patch sched_yield (#3648) (#3687)
### What this PR does / why we need it?

On Arm systems, os.sched_yield() does not take effect, so the GIL (Global Interpreter Lock) is never relinquished and the process becomes CPU-bound. This PR patches sched_yield in vLLM so that the process calls time.sleep(0) instead, which releases the GIL.

### Does this PR introduce _any_ user-facing change?

Signed-off-by: fems14 <[email protected]>
Co-authored-by: fems14 <[email protected]>
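The decision described above can be sketched roughly as follows. This is a minimal illustration, not the vLLM implementation: `use_sched_yield` and `yield_cpu` are hypothetical names, and detecting Arm from a machine string such as the one returned by `platform.machine()` is an assumption (the commit itself uses `Platform.get_cpu_architecture()`).

```python
import os
import time


def use_sched_yield(machine: str, version: tuple) -> bool:
    """Hypothetical helper mirroring the flag computed in this commit."""
    # Same version gate as the patch: Python >= 3.11.1, or 3.10.x with x >= 8.
    new_enough = version[:3] >= (3, 11, 1) or (
        version[:2] == (3, 10) and version[2] >= 8)
    # Assumption: Arm machines report names such as "aarch64" or "arm64".
    is_arm = machine.lower().startswith(("arm", "aarch64"))
    return new_enough and not is_arm


def yield_cpu(flag: bool) -> str:
    """Yield the CPU once and report which mechanism was used."""
    if flag and hasattr(os, "sched_yield"):
        os.sched_yield()
        return "sched_yield"
    # Fallback named in the commit message: a zero-length sleep.
    time.sleep(0)
    return "sleep(0)"
```

For example, `yield_cpu(use_sched_yield("aarch64", (3, 11, 4)))` takes the `time.sleep(0)` branch, while the same call with `"x86_64"` would use `os.sched_yield()` where it is available.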
1 parent d0086d4 commit b321e38

3 files changed: +15 -0 lines changed


vllm_ascend/patch/platform/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -19,6 +19,7 @@
 import vllm_ascend.patch.platform.patch_config # noqa
 import vllm_ascend.patch.platform.patch_distributed # noqa
 import vllm_ascend.patch.platform.patch_mamba_config # noqa
+import vllm_ascend.patch.platform.patch_sched_yield # noqa
 
 if os.getenv("DYNAMIC_EPLB", "false") == "true" or os.getenv(
         "EXPERT_MAP_RECORD", "false") == "true":

vllm_ascend/patch/platform/patch_sched_yield.py

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+import sys
+
+import vllm.distributed.utils
+from vllm.platforms import CpuArchEnum, Platform
+
+is_arm = (Platform.get_cpu_architecture() == CpuArchEnum.ARM)
+
+USE_SCHED_YIELD = (
+    ((sys.version_info[:3] >= (3, 11, 1)) or
+     (sys.version_info[:2] == (3, 10) and sys.version_info[2] >= 8))
+    and not is_arm)
+
+vllm.distributed.utils.USE_SCHED_YIELD = USE_SCHED_YIELD
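
The last line of this file rebinds an attribute on an already-imported module. The sketch below shows how that kind of override generally behaves in Python. It is an illustration only: `lib`, `FLAG`, and `describe` are hypothetical stand-ins and do not reproduce how vLLM reads `USE_SCHED_YIELD`.

```python
import types


def demo():
    # Hypothetical stand-in for a library module that owns a module-level flag.
    lib = types.ModuleType("lib")
    exec(
        "FLAG = True\n"
        "def describe():\n"
        "    return 'sched_yield' if FLAG else 'sleep(0)'\n",
        lib.__dict__,
    )

    # A name bound before the patch keeps the old value
    # (this is what `from lib import FLAG` would give a caller).
    early_copy = lib.FLAG
    before = lib.describe()

    # The patch: rebind the attribute on the module object itself.
    lib.FLAG = False

    # Functions defined in the module look the global up at call time,
    # so they observe the new value.
    after = lib.describe()
    return before, after, early_copy
```

Under this model the override only affects code that looks the flag up through the module at call time, which is one reason a patch module like this would be imported before the code that relies on the flag runs.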

vllm_ascend/patch/worker/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -21,6 +21,7 @@
 import vllm_ascend.patch.worker.patch_triton
 
 # isort: off
+import vllm_ascend.patch.platform.patch_sched_yield # noqa
 import vllm_ascend.patch.worker.patch_distributed # noqa
 import vllm_ascend.patch.worker.patch_logits # noqa
 import vllm_ascend.patch.worker.patch_roberta # noqa
