Bus error running t5 conversion script using the latest main #1538

### System Info

GPU: NVIDIA A10G. I have tried on both an AWS g5.2xlarge instance and an AWS g5.12xlarge instance.

### Who can help?

@byshiue

### Information

- [X] The official example scripts
- [ ] My own modified scripts

### Tasks

- [X] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)

### Reproduction

I followed the official installation steps fairly closely:
1. `docker run --shm-size=2g --rm --runtime=nvidia --gpus all --entrypoint /bin/bash -it nvidia/cuda:12.1.0-devel-ubuntu22.04`
2. `apt-get update && apt-get -y install python3.10 python3-pip openmpi-bin libopenmpi-dev git python-is-python3 vim`
3. `pip3 install tensorrt_llm -U --pre --extra-index-url https://pypi.nvidia.com`
4. `git clone https://github.com/NVIDIA/TensorRT-LLM.git` (05/02 version)
5. `cd TensorRT-LLM`
6. Run the conversion script:
```
export MODEL_TYPE="t5"
export MODEL_NAME="google/flan-t5-large"
export INFERENCE_PRECISION="float32"
export TP_SIZE=1
export PP_SIZE=1
export WORLD_SIZE=1

python examples/enc_dec/convert_checkpoint.py --model_type ${MODEL_TYPE} \
        --model_dir ${MODEL_NAME} \
        --output_dir tmp/trt_models/${MODEL_NAME}/${INFERENCE_PRECISION} \
        --tp_size ${TP_SIZE} \
        --pp_size ${PP_SIZE} \
        --weight_data_type float32 \
        --dtype ${INFERENCE_PRECISION}
```

### Expected behavior

The model is converted successfully.

### actual behavior

```
[TensorRT-LLM] TensorRT-LLM version: 0.10.0.dev2024043000
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/usr/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 2 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
Bus error (core dumped)
```
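
One thing that may be worth ruling out, since the container is started with `--shm-size=2g`: a bus error can occur when a process touches memory-mapped pages that a full `/dev/shm` cannot back. Whether that is the cause here is only an assumption, not something confirmed by the log above. A minimal sketch for checking shared-memory space from inside the container follows; `shm_usage_gib` is a hypothetical helper name, not part of TensorRT-LLM.

```python
import shutil


def shm_usage_gib(path="/dev/shm"):
    """Return (total, used, free) in GiB for the filesystem mounted at `path`.

    The default path is the usual shared-memory mount on Linux; pass a
    different path if the container mounts it elsewhere (assumption).
    """
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3
    return usage.total / gib, usage.used / gib, usage.free / gib
```

Calling `print(shm_usage_gib())` before and during the conversion would show whether the 2 GiB limit is being approached. If it is, rerunning the container with a larger `--shm-size` would be a cheap experiment.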

### additional notes

I also tried a BART model with the same script, and it exits successfully. The only change is `export MODEL_TYPE="bart"` and `export MODEL_NAME="facebook/bart-large-cnn"`. So this might be a problem specific to the T5 architecture, or it could be related to the GPU type I'm using (A10G).
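
To make the T5 vs. BART comparison easier to repeat, here is a rough sketch that runs a command and reports whether it exited normally or was killed by a signal such as SIGBUS. The helper names `describe_exit` and `run_and_describe` are hypothetical, and it assumes a POSIX system, where `subprocess` reports a signal-terminated child as a negative return code.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Translate a subprocess return code into a readable status."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed by that signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


def run_and_describe(cmd):
    """Run `cmd` (a list of arguments) and describe how it ended."""
    result = subprocess.run(cmd)
    return describe_exit(result.returncode)
```

For example, `run_and_describe(["python", "examples/enc_dec/convert_checkpoint.py", "--model_type", "t5", ...])` with the same arguments as in the reproduction step, then again with the BART values, gives a side-by-side record of which model type triggers the crash.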
