Description
Your current environment
The output of python collect_env.py
==============================
System Info
==============================
OS : CentOS Stream 9 (x86_64)
GCC version : (GCC) 11.5.0 20240719 (Red Hat 11.5.0-9)
Clang version : Could not collect
CMake version : Could not collect
Libc version : glibc-2.34
==============================
PyTorch Info
==============================
PyTorch version : 2.7.1+cpu
Is debug build : False
CUDA used to build PyTorch : None
ROCM used to build PyTorch : N/A
==============================
Python Environment
==============================
Python version : 3.12.11
==============================
CUDA / GPU Info
==============================
Is CUDA available : False
CUDA runtime version : No CUDA
CUDA_MODULE_LOADING set to : N/A
GPU models and configuration : No CUDA
Nvidia driver version : No CUDA
cuDNN version : No CUDA
HIP runtime version : N/A
MIOpen runtime version : N/A
Is XNNPACK available : True
==============================
Versions of relevant libraries
==============================
[pip3] numpy==2.1.2
[pip3] nvidia-cublas-cu12==12.6.4.1
[pip3] nvidia-cuda-cupti-cu12==12.6.80
[pip3] nvidia-cuda-nvrtc-cu12==12.6.77
[pip3] nvidia-cuda-runtime-cu12==12.6.77
[pip3] nvidia-cufile-cu12==1.11.1.6
[pip3] nvidia-curand-cu12==10.3.7.77
[pip3] nvidia-cusparselt-cu12==0.6.3
[pip3] nvidia-nccl-cu12==2.26.2
[pip3] nvidia-nvjitlink-cu12==12.6.85
[pip3] nvidia-nvtx-cu12==12.6.77
[pip3] pyzmq==27.0.2
[pip3] torch==2.7.1+cpu
[pip3] torchaudio==2.7.1+cpu
[pip3] torchvision==0.22.1+cpu
[pip3] transformers==4.55.4
[pip3] triton==3.3.1
[conda] Could not collect
==============================
vLLM Info
==============================
ROCM Version : Could not collect
Neuron SDK Version : N/A
vLLM Version : 0.10.1.1
vLLM Build Flags:
CUDA Archs: Not Set; ROCm: Disabled; Neuron: Disabled
GPU Topology:
Could not collect
==============================
Environment Variables
==============================
NCCL_CUMEM_ENABLE=0
PYTORCH_NVML_BASED_CUDA_CHECK=1
TORCHINDUCTOR_COMPILE_THREADS=1
🐛 Describe the bug
For offline inference, model loading from a local path is broken when HF_HUB_OFFLINE is set to 1.
I am trying to load a model from a local path for offline inference using code similar to the snippet below. /local/path/to/model
is the path to a directory containing a HuggingFace model snapshot.
from vllm import LLM, SamplingParams

prompts = [
    "Hello, my name is",
    "The president of the United States is",
    "The capital of France is",
    "The future of AI is",
]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

llm = LLM(model="/local/path/to/model")
outputs = llm.generate(prompts, sampling_params)

# Print the outputs.
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
When I try to run this code after setting HF_HUB_OFFLINE=1, a LocalEntryNotFoundError exception is raised with this error message:
"Cannot find an appropriate cached snapshot folder for the specified revision on the local disk and outgoing traffic has been disabled. To enable repo look-ups and downloads online, pass 'local_files_only=False' as input."
This workflow used to work without any issues until this PR was merged recently: #22526.
Upon further debugging, we found that #22526 and #21680 might be introducing conflicting changes. With #21680, vLLM tries to log non-default arguments, which works fine on its own. But together with #22526, it introduces an issue during the logging of non-default args:
- After specifying /local/path/to/model, the lookup of the local model works fine.
- Once the local model lookup is done, vLLM tries to log the non-default args.
- During logging of non-default args, vLLM creates a default instance of EngineArgs() when non-default args are present: https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/utils.py#L316
- In this default instance, model is set to "Qwen/Qwen3-0.6B": https://github.com/vllm-project/vllm/blob/main/vllm/config/__init__.py#L275
- With [Fix] fix offline env use local mode path (#22526), during __post_init__() of EngineArgs() vLLM tries to load "Qwen/Qwen3-0.6B" from disk here: https://github.com/vllm-project/vllm/blob/main/vllm/engine/arg_utils.py#L463
- When vLLM tries to load the model, it complains that "Qwen/Qwen3-0.6B" is not found in the local path. I do not have "Qwen/Qwen3-0.6B" in my local model cache, nor do I need it, since I'm loading an entirely different model for inference.
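The interaction can be boiled down to a short sketch (my simplification, not the actual vLLM logging code; it assumes vLLM 0.10.1.1, HF_HUB_OFFLINE=1, and no "Qwen/Qwen3-0.6B" in the local HF cache):

import os
os.environ["HF_HUB_OFFLINE"] = "1"

from vllm.engine.arg_utils import EngineArgs

# The non-default-args logging path constructs a default EngineArgs(), whose
# __post_init__ tries to resolve the default model "Qwen/Qwen3-0.6B" from the
# local HF cache and fails with LocalEntryNotFoundError.
default_args = EngineArgs()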
During logging of non-default args, vLLM should avoid loading any model; failing to resolve the default model should not prevent the non-default args from being logged. This issue is affecting our offline workflows with HF_HUB_OFFLINE=1. Please help take a look at this issue.
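One possible direction (just a sketch of the idea, not a proposed patch to the actual vllm/entrypoints/utils.py code) would be to compare against the declared dataclass defaults instead of instantiating a second, default EngineArgs(), so that __post_init__ and the model lookup are never triggered during logging:

from dataclasses import MISSING, fields

def non_default_args(args) -> dict:
    """Return {field_name: value} for every dataclass field whose current
    value differs from its declared default, without constructing a default
    instance of the dataclass (and thus without running __post_init__)."""
    diff = {}
    for f in fields(args):
        if f.default is not MISSING:
            default = f.default
        elif f.default_factory is not MISSING:
            default = f.default_factory()
        else:
            continue  # required field with no default; nothing to compare
        value = getattr(args, f.name)
        if value != default:
            diff[f.name] = value
    return diff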
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.