SmoothQuant LLaMA builds not working on 0.8.0 release (#1267)

### System Info

GPU : NVIDIA A100 80GB

package version
tensorrt-9.2.0.post12.dev5-cp310-none-linux_x86_64.whl
[TensorRT-LLM] TensorRT-LLM version: 0.8.0

### Who can help?

@Tracin @byshiue

### Information

- [X] The official example scripts
- [ ] My own modified scripts

### Tasks

- [X] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)

### Reproduction

1. Installation
`python -m pip install tensorrt_llm==0.8.0 --extra-index-url https://pypi.nvidia.com`

2. Create a SmoothQuant checkpoint for LLaMA
`python ./examples/llama/convert_checkpoint.py --model_dir ~/Llama-2-13b-chat-hf --output_dir ~/fp16-tp4-sq5 --dtype float16 --tp_size 4 --smoothquant 0.5 --per_token --per_channel --workers 4`

### Expected behavior

Checkpoint should be created.

### actual behavior

The conversion fails at this line: https://github.com/NVIDIA/TensorRT-LLM/blob/v0.8.0/examples/llama/convert_checkpoint.py#L1502

ValueError: You are trying to save a non contiguous tensor: `transformer.layers.0.attention.qkv.weight` which is not allowed. It either means you are trying to save tensors which are reference of each other in which case it's recommended to save only the full tensors, and reslice at load time, or simply call `.contiguous()` on your tensor to pack it before saving.

### additional notes

The error does not occur on release 0.7.1.

My guess is that the function `get_tllm_linear_sq_weight` returns some non-contiguous tensors.
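
If that guess is right, a possible workaround (a minimal sketch, not a confirmed fix) is to pack every tensor before it is saved, as the error message itself suggests. The helper name `pack_contiguous` is hypothetical and not part of TensorRT-LLM. It only relies on the `is_contiguous()` and `contiguous()` methods that `torch.Tensor` provides, and assumes the weights are held in a plain name-to-tensor dict.

```python
from typing import Dict, Protocol, TypeVar


class SupportsContiguous(Protocol):
    """Structural type for the two torch.Tensor methods used below."""

    def is_contiguous(self) -> bool: ...

    def contiguous(self) -> "SupportsContiguous": ...


T = TypeVar("T", bound=SupportsContiguous)


def pack_contiguous(weights: Dict[str, T]) -> Dict[str, T]:
    """Return a new dict in which every tensor is contiguous in memory.

    Hypothetical helper: tensors that are already contiguous are passed
    through unchanged; the others are copied into a packed layout.
    """
    return {
        name: tensor if tensor.is_contiguous() else tensor.contiguous()
        for name, tensor in weights.items()
    }
```

Under the assumption that the failing line passes such a dict to `safetensors.torch.save_file`, calling `weights = pack_contiguous(weights)` just before the save should avoid the `ValueError`, at the cost of one extra copy for each non-contiguous tensor.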
