Closed
Labels: Doc, triaged
Description
The README file gives the following commands:
python convert_checkpoint.py --model_version v1_13b \
--model_dir baichuan-inc/Baichuan-13B-Chat \
--dtype float16 \
--output_dir ./tmp/baichuan_v1_13b/trt_engines/fp16/1-gpu/
and
trtllm-build --checkpoint_dir ./trt_ckpt/baichuan_v1_13b/ \
--output_dir ./trt_engines/baichuan_v1_13b/ \
--gemm_plugin float16 \
--max_batch_size=32 \
--max_input_len=1024 \
--max_output_len=512
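Presumably the intent is for trtllm-build to build an engine from the checkpoint produced by convert_checkpoint.py. But the paths do not match: the first command writes to ./tmp/baichuan_v1_13b/trt_engines/fp16/1-gpu/, while the second reads from ./trt_ckpt/baichuan_v1_13b/.
Maybe it should be: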
python convert_checkpoint.py --model_version v1_13b \
--model_dir baichuan-inc/Baichuan-13B-Chat \
--dtype float16 \
--output_dir ./trt_ckpt/baichuan_v1_13b/
and
trtllm-build --checkpoint_dir ./trt_ckpt/baichuan_v1_13b/ \
--output_dir ./trt_engines/baichuan_v1_13b/ \
--gemm_plugin float16 \
--max_batch_size=32 \
--max_input_len=1024 \
--max_output_len=512
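
One way to keep the two steps from drifting apart again would be to hold the checkpoint directory in a single shell variable and pass it to both commands. The following is only a sketch under that assumption: the variable names CKPT_DIR, ENGINE_DIR and RUN are made up for illustration and are not part of TensorRT-LLM, and the flags are copied unchanged from the commands above. By default RUN is "echo", so the script only prints the two commands; setting RUN to an empty string executes them.

```shell
# Single source of truth for the paths shared by both steps (hypothetical names).
CKPT_DIR=./trt_ckpt/baichuan_v1_13b/
ENGINE_DIR=./trt_engines/baichuan_v1_13b/

# RUN is a hypothetical dry-run switch for this sketch, not a TensorRT-LLM option.
# Unset -> "echo" (print the commands); RUN= (empty) -> actually execute them.
RUN=${RUN-echo}

# Step 1: convert the Hugging Face weights into a TensorRT-LLM checkpoint.
$RUN python convert_checkpoint.py --model_version v1_13b \
    --model_dir baichuan-inc/Baichuan-13B-Chat \
    --dtype float16 \
    --output_dir "$CKPT_DIR"

# Step 2: build the engine from the same checkpoint directory.
$RUN trtllm-build --checkpoint_dir "$CKPT_DIR" \
    --output_dir "$ENGINE_DIR" \
    --gemm_plugin float16 \
    --max_batch_size=32 \
    --max_input_len=1024 \
    --max_output_len=512
```

Because --output_dir in step 1 and --checkpoint_dir in step 2 both expand from CKPT_DIR, the mismatch reported here cannot occur by editing only one of the two commands.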