[AutoDeploy]: Nemotron-6 8K/16K (ISL/OSL) on B200; BF16 #8436

@nzmora-nvidia

🚀 The feature, motivation and pitch

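Capture nsys traces for AutoDeploy (AD) running Nemotron-6 at 8K/16K (ISL/OSL) on B200 in BF16, compare them against vLLM, and identify any performance issues.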

  • Create and share a perf table covering the various configuration points (same as H100, plus 8K/16K).
  • Share instructions on how to set up the vLLM Docker container on B200.
  • Share the nsys traces.
  • Find the root causes of poor AD performance, if any.

Alternatives

No response

Metadata

Labels

AutoDeploy<NV> AutoDeploy Backend

Projects

Status

In progress
