
Conversation

@Shirley125 (Contributor) commented Oct 25, 2025

What this PR does / why we need it?

Fix a proxy decode bug when parsing non-UTF-8 characters in streamed chunks.
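
A minimal sketch of the failure mode being fixed, for context only: when a multi-byte UTF-8 character is split across two stream chunks, a naive per-chunk `bytes.decode("utf-8")` raises `UnicodeDecodeError`. This is not the PR's actual proxy code, and the incremental decoder at the end is just one possible mitigation.

```python
import codecs

# A three-byte UTF-8 character split across two chunks, as a streaming proxy might see it.
data = "词".encode("utf-8")            # b'\xe8\xaf\x8d'
chunk_a, chunk_b = data[:2], data[2:]

try:
    chunk_a.decode("utf-8")            # fails: incomplete byte sequence
except UnicodeDecodeError as exc:
    print(f"decode failed on partial chunk: {exc}")

# An incremental decoder buffers the trailing bytes and finishes on the next chunk.
dec = codecs.getincrementaldecoder("utf-8")()
print(dec.decode(chunk_a) + dec.decode(chunk_b))  # -> 词
```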

@github-actions

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling in the PR description to help reviewers and future developers understand.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

@gemini-code-assist bot left a comment

Code Review

This pull request introduces error handling for decoding stream chunks in two proxy server examples. While adding a try-except block is a good step to prevent crashes from decoding errors, the implementation can be improved for robustness. Specifically, catching the generic Exception is too broad and can mask other potential issues. It's a best practice to catch the specific UnicodeDecodeError to make the error handling more precise and the code more maintainable.
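
A short sketch of the narrower handling suggested here, assuming a hypothetical async relay loop (the iterator, function name, and logging are placeholders, not the actual proxy example code):

```python
from typing import AsyncIterator

async def relay_chunks(chunks: AsyncIterator[bytes]) -> AsyncIterator[str]:
    """Decode streamed byte chunks, dropping only the ones that fail to decode."""
    async for raw in chunks:
        try:
            yield raw.decode("utf-8")
        except UnicodeDecodeError as exc:
            # Catching the specific exception keeps unrelated bugs visible;
            # a bare `except Exception` would silently swallow them too.
            print(f"skipping non-UTF-8 chunk: {exc!r}")
```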

@Shirley125 changed the title from "fix proxy decode bug" to "[bugfix]fix proxy decode bug" on Oct 25, 2025
Pr0Wh1teGivee and others added 4 commits October 25, 2025 12:52
### What this PR does / why we need it?
1. Rename common_fused_moe.py to fused_moe.py.
2. Rename fused_moe_prepare_and_finalize.py / FusedMoEPrepareAndFinalize
to prepare_finalize.py / PrepareAndFinalize.
3. Rename vllm_ascend/ops/moe to vllm_ascend/ops/fused_moe.
4. Move vllm_ascend/ops/fused_moe.py to
vllm_ascend/ops/fused_moe/fused_moe.py
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
e2e & ut

- vLLM version: v0.11.0rc3
- vLLM main:
vllm-project/vllm@17c540a

Signed-off-by: Pr0Wh1teGivee <[email protected]>
Signed-off-by: CHEN <[email protected]>
### What this PR does / why we need it?
Caps the calculated maximum number of tokens at 512.

This prevents allocating an excessively large buffer when a cudagraph
capture size is not specified, mitigating the risk of out-of-memory
errors.

### Does this PR introduce _any_ user-facing change?
None.

### How was this patch tested?
None.

- vLLM version: v0.11.0rc3
- vLLM main:
vllm-project/vllm@17c540a

Signed-off-by: Yizhou Liu <[email protected]>
Signed-off-by: CHEN <[email protected]>
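
A rough sketch of the token cap described in the commit message above; the constant and function names are placeholders, not the actual vllm-ascend code:

```python
from typing import Optional

MAX_CAPTURE_TOKENS = 512  # upper bound when no capture size is specified

def resolve_max_num_tokens(requested: Optional[int], computed: int) -> int:
    """Pick the buffer size for graph capture, capping the computed value."""
    if requested is not None:
        return requested                      # an explicit capture size wins
    return min(computed, MAX_CAPTURE_TOKENS)  # avoid oversized buffers / OOM

print(resolve_max_num_tokens(None, 8192))  # -> 512
print(resolve_max_num_tokens(None, 256))   # -> 256
```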
