[Misc] Moved override for allreduce fusion thresholds from env var to config #23722

nvjullin · 2025-08-27T08:32:31Z

Purpose

Follow up on #23639.
Also cleaned up two competing/conflicting ways of tuning thresholds: number of tokens vs size.
Size is the relevant parameter (for perf), so we should only use that.
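
A minimal sketch of how a size threshold might translate into a token limit for a given model. The function name, the choice of MB as the unit, and the formula are assumptions for illustration and are not taken from this PR's diff; the cap mirrors the later "cap at max_num_batched_tokens" commit.

```python
def max_tokens_for_size(
    max_size_mb: float,
    hidden_size: int,
    dtype_bytes: int,
    max_num_batched_tokens: int,
) -> int:
    """Hypothetical helper: convert a size threshold into a token count."""
    # Assumption: the threshold is expressed in MB (1 MB = 1024 * 1024 bytes).
    max_size_bytes = int(max_size_mb * 1024 * 1024)
    # One token's activation occupies hidden_size elements of dtype_bytes each.
    bytes_per_token = hidden_size * dtype_bytes
    # Never exceed the scheduler's batch limit.
    return min(max_size_bytes // bytes_per_token, max_num_batched_tokens)
```

Under these assumptions the same size threshold yields different token limits for models with different hidden sizes or dtypes, which is why a fixed token count is the weaker tuning knob.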

Test Plan

Test Result


Signed-off-by: Julien Lin <[email protected]>

gemini-code-assist

Code Review

This pull request successfully refactors the configuration for allreduce fusion thresholds, moving them from an environment variable to a more structured configuration object. The cleanup of the logic for tuning thresholds is also a welcome improvement. I've found one potential performance issue in the calculation of max_token_num which appears to be overly conservative and could prevent the fused kernel from being used in some cases where it would be beneficial. Please see the detailed comment.

vllm/compilation/collective_fusion.py


ilmarkov · 2025-08-27T13:29:11Z

I am changing the constants and a bit of the logic in the other PR. But keeping the max size and cleaning up the other ways of tuning makes sense to me.
LGTM.

ProExpertProg

I think it would be nice if we could restructure this so that the defaults are also reflected in the config. So maybe the config asks the pass for defaults, but CLI values take precedence.

hmellor · 2025-08-27T16:17:06Z

I agree it would be nice if the defaults could be the default of the actual config field rather than living with the implementation.

nvjullin · 2025-08-28T05:53:45Z

> I agree it would be nice if the defaults could be the default of the actual config field

If the default is {"2": 64, "4": 1, "6": 1, "8": 1}, then if the user wants to override 8 only, the user will have to pass {"2": 64, "4": 1, "6": 1, "8": 8}. This is quite bad UI.

> I think if we could restructure this such that the defaults are also reflected in config that would be nice.

Right now, the comment explains the defaults, so it is indeed reflected in the config. The issue is that the comment has to be in sync with the implementation. It's not ideal, but otherwise we'll have to write a new dict-like class to handle the aforementioned UI problem which I think is overkill for a very niche config option.

Another option is to have a default of {"2": 64, "4": 1, "6": 1, "8": 1} in config and fall back to the one in flashinfer_max_size when the config is empty. This is essentially the same as the current situation where we have a comment explaining the default: we still have to keep them in sync.

ilmarkov · 2025-08-28T09:00:23Z

> If the default is {"2": 64, "4": 1, "6": 1, "8": 1}, then if the user wants to override 8 only, the user will have to pass {"2": 64, "4": 1, "6": 1, "8": 8}

@nvjullin I'd suggest updating the default config with the user-provided dictionary. I believe the user usually needs to specify only one key:value pair at initialization to override the default config.
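
The merge suggested here can be sketched as follows. The names `DEFAULT_MAX_SIZES` and `resolve_max_sizes` are hypothetical and not from the vLLM codebase; the default values are the ones quoted in this thread.

```python
from typing import Optional

# Defaults as quoted in the discussion above; the constant name is hypothetical.
DEFAULT_MAX_SIZES = {"2": 64, "4": 1, "6": 1, "8": 1}


def resolve_max_sizes(user_overrides: Optional[dict] = None) -> dict:
    """Return the defaults with any user-provided entries taking precedence."""
    resolved = dict(DEFAULT_MAX_SIZES)  # copy so the defaults are never mutated
    resolved.update(user_overrides or {})
    return resolved
```

With this approach a user who only wants to change the entry for 8 passes `{"8": 8}` and the remaining keys keep their defaults.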

moved env var to config

7c14004

Signed-off-by: Julien Lin <[email protected]>

nvjullin requested review from simon-mo, WoosukKwon, youkaichao, robertgshaw2-redhat, mgoin, tlrmchlsmth, houseroad, hmellor, yewentao256, ProExpertProg and zou3519 as code owners August 27, 2025 08:32

gemini-code-assist bot reviewed Aug 27, 2025


nvjullin added 2 commits August 27, 2025 17:25

Merge branch 'main' into ar-config

5db1b04

cap at max_num_batched_tokens

52f7e0c

Signed-off-by: Julien Lin <[email protected]>

ProExpertProg reviewed Aug 27, 2025


ProExpertProg added the torch.compile label Aug 28, 2025

github-project-automation bot added this to torch.compile integration Aug 28, 2025

github-project-automation bot moved this to To triage in torch.compile integration Aug 28, 2025
