[AutoDeploy]: Unify Checkpoint and Graph-Based Quantization Detection

Currently, AutoDeploy handles quantized model detection through two separate mechanisms:

1. Config parsing (e.g. via _load_quantization_config() in the HF factory)
2. Quantized-node detection during graph transformation

This split introduces redundancy and complexity when onboarding new quantization formats from sources other than ModelOpt.

We want to introduce a base class to unify detection, transformation, and parameter resolution across all quant sources.

Goals:
- Support any quantization source (ModelOpt, native graph, etc.) via a unified handler.
- Remove scattered logic in HF factory for hf_quant_config.json.
- Make quantization transformation format-agnostic and modular.
- Reduce copy-paste code when adding new formats.
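
As a starting point for discussion, here is a minimal sketch of what the unified handler and registry could look like. Every name in it (QuantHandler, QuantHandlerRegistry, the two example handlers, the op names) is hypothetical and not existing AutoDeploy API. The graph is reduced to a plain list of op-name strings as a stand-in for the real graph, and the config layout with a `quantization.quant_algo` key is an assumption made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional, Type


class QuantHandler(ABC):
    """Hypothetical base class: one subclass per quantization source/format."""

    @abstractmethod
    def detect(self, config: Dict[str, Any], graph_ops: List[str]) -> bool:
        """Return True if this handler recognizes the checkpoint config or graph."""

    @abstractmethod
    def resolve_params(self, config: Dict[str, Any]) -> Dict[str, Any]:
        """Return the quantization parameters the transformation needs."""

    @abstractmethod
    def transform(self, graph_ops: List[str]) -> List[str]:
        """Return the graph with ops rewritten to their quantized form."""


class QuantHandlerRegistry:
    """Hypothetical registry mapping a format name to its handler class."""

    _handlers: Dict[str, Type[QuantHandler]] = {}

    @classmethod
    def register(cls, name: str):
        def decorator(handler_cls: Type[QuantHandler]) -> Type[QuantHandler]:
            cls._handlers[name] = handler_cls
            return handler_cls

        return decorator

    @classmethod
    def find(cls, config: Dict[str, Any], graph_ops: List[str]) -> Optional[QuantHandler]:
        # Handlers are tried in registration order; the first match wins.
        for handler_cls in cls._handlers.values():
            handler = handler_cls()
            if handler.detect(config, graph_ops):
                return handler
        return None


@QuantHandlerRegistry.register("checkpoint_fp8")
class CheckpointFP8Handler(QuantHandler):
    """Detects FP8 from a checkpoint config (assumed key layout)."""

    def detect(self, config, graph_ops):
        return config.get("quantization", {}).get("quant_algo") == "FP8"

    def resolve_params(self, config):
        return {"quant_algo": "FP8", "source": "checkpoint"}

    def transform(self, graph_ops):
        return ["fp8_linear" if op == "linear" else op for op in graph_ops]


@QuantHandlerRegistry.register("graph_fp8")
class GraphFP8Handler(QuantHandler):
    """Detects FP8 from quantize nodes already present in the graph."""

    def detect(self, config, graph_ops):
        return "fake_quant_fp8" in graph_ops

    def resolve_params(self, config):
        return {"quant_algo": "FP8", "source": "graph"}

    def transform(self, graph_ops):
        # Drop the placeholder quantize nodes and swap in the quantized linear op.
        kept = [op for op in graph_ops if op != "fake_quant_fp8"]
        return ["fp8_linear" if op == "linear" else op for op in kept]
```

With this shape, both the HF factory and the graph quantization pass would call `QuantHandlerRegistry.find(config, graph_ops)` and use whichever handler it returns. Adding a new format would then mean adding one registered subclass rather than editing both code paths.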

Subtasks:

- [ ] Introduce New Interface and Registry
- [ ] Migrate ModelOpt + FP8/NVFP4 Handler
- [ ] Refactor HF Factory Logic
- [ ] Refactor Graph Quantization Pass
- [ ] Validate with a new quantization format

[AutoDeploy]: Unify Checkpoint and Graph-Based Quantization Detection #5860
