Pull requests: vllm-project/vllm
Pull requests list
[Refactor] refactor freezing_value/cuda_event initialize outside try finally
v1
#23758
opened Aug 27, 2025 by
andyxning
5 tasks
[ci] breaks down V1 Test into 3 groups of approx 30 minutes runtime
ci/build
#23757
opened Aug 27, 2025 by
jeanschmidt
[Bugfix] when nixi port by bind, process cannot stop
#23756
opened Aug 27, 2025 by
lengrongfu
5 tasks
Support for NemotronH Nano VLM with an optimized vision model (vLLM native)
multi-modality
Related to multi-modality (#4194)
new-model
Requests to new models
#23753
opened Aug 27, 2025 by
danielafrimi
Draft
[Doc]: Spelling errors fixed in .md files
ci/build
documentation
Improvements or additions to documentation
needs-rebase
performance
Performance-related issues
#23751
opened Aug 27, 2025 by
didier-durand
1 task done
[ux] Switch a warning to debug about a pytorch fallback
v1
#23750
opened Aug 27, 2025 by
russellb
[Model] Merge SupportsMultiModalWithRawInput with SupportsMultiModal
multi-modality
Related to multi-modality (#4194)
new-model
Requests to new models
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#23749
opened Aug 27, 2025 by
DarkLight1337
5 tasks
Tune configs for triton block fp8 gemm H100/H200
performance
Performance-related issues
#23748
opened Aug 27, 2025 by
mgoin
5 tasks
[Feat] A novel static EPLB placement strategy for MoE models.
#23745
opened Aug 27, 2025 by
cboss6
[Docs] Fix warnings in mkdocs build (continued)
ready
ONLY add when PR is ready to merge/full CI is needed
structured-output
tpu
Related to Google TPUs
v1
#23743
opened Aug 27, 2025 by
Zerohertz
[Feature][Response API] Add streaming support for non-harmony
frontend
v1
#23741
opened Aug 27, 2025 by
kebe7jun
3 of 5 tasks
Adapting Qwen3-32B to Eagle3 mode to resolve head dimension mismatch issues
qwen
Related to Qwen models
v1
#23740
opened Aug 27, 2025 by
coder-fny
5 tasks
Disable torch.compile for dynamic rope models in Transformers backend
documentation
Improvements or additions to documentation
new-model
Requests to new models
ready
ONLY add when PR is ready to merge/full CI is needed
#23738
opened Aug 27, 2025 by
hmellor
[BugFix][FlashInfer] Fix potential race condition for paged_kv_indptr_cpu
v1
#23737
opened Aug 27, 2025 by
WoosukKwon
[Frontend] Gemma3n audio transcriptions/translations endpoint
frontend
#23735
opened Aug 27, 2025 by
NickLucche
[Feature] Support Decode Context Parallel for MLA
needs-rebase
v1
#23734
opened Aug 27, 2025 by
youzhedian
6 tasks
[Misc] Use CpuGpuBuffer for FlashInfer metadata builder
v1
#23731
opened Aug 27, 2025 by
WoosukKwon
[Misc] Extract common utils for nvfp4 kernel source files
#23727
opened Aug 27, 2025 by
elvischenv
5 tasks
[Misc] Removed force_fp8_e4m3fnuz from FP8LinearOp
#23725
opened Aug 27, 2025 by
nvjullin
5 tasks
[Misc] Moved override for allreduce fusion thresholds from env var to config
#23722
opened Aug 27, 2025 by
nvjullin
5 tasks
[Frontend] Pass API server count to each process
documentation
Improvements or additions to documentation
frontend
multi-modality
Related to multi-modality (#4194)
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#23717
opened Aug 27, 2025 by
DarkLight1337
1 of 5 tasks
[Bugfix] Fix for V1 priority scheduling crashes at preemption
v1
#23713
opened Aug 27, 2025 by
Hanchenli
5 tasks
[Bugfix] when set offline model running error
frontend
#23711
opened Aug 27, 2025 by
lengrongfu
5 tasks
[WIP] Adding int4 models for CPU benchmarking
ci/build
performance
Performance-related issues
#23709
opened Aug 27, 2025 by
louie-tsai
5 tasks