Encountered an error in forward function: slice 712 exceeds buffer size 471 #1480

@sleepwalker2017

Description

System Info

GPU: 2 × A30

TensorRT-LLM version: v0.9.0

Model: vicuna 13B

Who can help?

@byshiue

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

  1. Build the engine:
python convert_checkpoint.py --model_dir /data/weilong.yu/vicuna-13b/vicuna-13b-v1.5/ \
                              --output_dir ./tllm_checkpoint_2gpu_fp16 \
                              --dtype float16 --tp_size 2

trtllm-build --checkpoint_dir ./tllm_checkpoint_2gpu_fp16 \
            --output_dir ./tmp/llama/13B/trt_engines/fp16/2-gpu \
            --gemm_plugin float16 \
            --use_fused_mlp \
            --max_batch_size $1 \
            --max_input_len 2048 \
            --max_output_len 256 \
            --context_fmha enable \
            --paged_kv_cache enable \
            --use_paged_context_fmha enable \
            --remove_input_padding enable  --workers 2 \
            --use_fused_mlp
  2. Run the benchmark:
mpirun -n 2 --allow-run-as-root ./gptManagerBenchmark --engine_dir ../../../examples/llama/tmp/llama/13B/trt_engines/fp16/2-gpu/ --dataset ../../../benchmarks/cpp/token-norm-dist.json --kv_cache_free_gpu_mem_fraction 0.85 --enable_kv_cache_reuse -enable_chunked_context

Expected behavior

No error message.

Actual behavior

sh run.sh
[TensorRT-LLM][ERROR] Encountered an error in forward function: slice 712 exceeds buffer size 471
[TensorRT-LLM][WARNING] Step function failed, continuing.
[TensorRT-LLM][ERROR] Encountered an error in forward function: slice 712 exceeds buffer size 471
[TensorRT-LLM][WARNING] Step function failed, continuing.
[TensorRT-LLM][ERROR] Encountered an error in forward function: slice 1553 exceeds buffer size 927
[TensorRT-LLM][WARNING] Step function failed, continuing.
[TensorRT-LLM][ERROR] Encountered an error in forward function: slice 1553 exceeds buffer size 927
[TensorRT-LLM][WARNING] Step function failed, continuing.
[TensorRT-LLM][ERROR] Encountered an error in forward function: slice 884 exceeds buffer size 642
[TensorRT-LLM][WARNING] Step function failed, continuing.
[TensorRT-LLM][ERROR] Encountered an error in forward function: slice 884 exceeds buffer size 642
[TensorRT-LLM][WARNING] Step function failed, continuing.
[TensorRT-LLM][ERROR] Encountered an error in forward function: slice 1192 exceeds buffer size 951
[TensorRT-LLM][WARNING] Step function failed, continuing.
[TensorRT-LLM][ERROR] Encountered an error in forward function: slice 1192 exceeds buffer size 951
[TensorRT-LLM][WARNING] Step function failed, continuing.
[TensorRT-LLM][ERROR] Encountered an error in forward function: slice 1253 exceeds buffer size 1012
[TensorRT-LLM][ERROR] Encountered an error in forward function: slice 1253 exceeds buffer size 1012
[TensorRT-LLM][WARNING] Step function failed, continuing.
[TensorRT-LLM][WARNING] Step function failed, continuing.
[BENCHMARK] num_samples 200
[BENCHMARK] total_latency(ms) 71149.43
[BENCHMARK] seq_throughput(seq/sec) 2.81
[BENCHMARK] token_throughput(token/sec) 531.37
[BENCHMARK] avg_sequence_latency(ms) 22587.76
[BENCHMARK] p99_sequence_latency(ms) 50983.86
[BENCHMARK] p90_sequence_latency(ms) 45602.29
[BENCHMARK] p50_sequence_latency(ms) 14514.95
[TensorRT-LLM][INFO] Terminate signal received, worker thread exiting.
[TensorRT-LLM][INFO] Terminate signal received, worker thread exiting.
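
For triage, it may help to condense the repeated errors into one row per distinct slice/buffer pair. The following is a minimal sketch, not part of TensorRT-LLM: the helper name `summarize_slice_errors` is hypothetical, and the only assumption is that error lines keep the `slice N exceeds buffer size M` wording shown in the log.

```python
import re
from collections import Counter

# Matches the error wording seen in the log; this wording is taken from the
# output in this report and is assumed to be stable across lines.
ERROR_RE = re.compile(r"slice (\d+) exceeds buffer size (\d+)")


def summarize_slice_errors(log_text):
    """Hypothetical helper: group errors by (slice, buffer size).

    Returns one dict per distinct pair, sorted by slice, with the number of
    occurrences and the overshoot (slice minus buffer size).
    """
    pairs = Counter(
        (int(m.group(1)), int(m.group(2))) for m in ERROR_RE.finditer(log_text)
    )
    return [
        {"slice": s, "buffer": b, "overshoot": s - b, "count": n}
        for (s, b), n in sorted(pairs.items())
    ]
```

Applied to the log in this report, it gives five distinct pairs, each seen twice (once per rank), with overshoots of 241, 242, 241, 241 and 626 tokens for slices 712, 884, 1192, 1253 and 1553 respectively.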

Additional notes

None.
