Conversation

@XiaobingSuper (Contributor) commented May 8, 2024

For HF Tokenizer.encode, the add_special_tokens default value is True (https://huggingface.co/transformers/v2.11.0/main_classes/tokenizer.html#transformers.PreTrainedTokenizer.encode), but the tensorrtllm_backend default value is False. If the user doesn't set it in preprocessing/config.pbtxt, this can cause issues such as #445 and #434. I think we should align the behavior of tensorrtllm_backend with HF rather than expect the user to set it before launching a model server (and the same applies to Tokenizer.decode).
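
As a minimal sketch of the mismatch (bert-base-uncased and the printed token ids are illustrative examples only, not taken from this PR or the backend):

```python
from transformers import AutoTokenizer

# Any tokenizer that inserts special tokens shows the difference;
# bert-base-uncased adds [CLS]/[SEP] and is used here purely as an example.
tok = AutoTokenizer.from_pretrained("bert-base-uncased")

text = "hello world"

ids_default = tok.encode(text)                          # add_special_tokens=True is the HF default
ids_plain = tok.encode(text, add_special_tokens=False)  # the backend's old default

print(ids_default)  # [101, 7592, 2088, 102] -> [CLS] hello world [SEP]
print(ids_plain)    # [7592, 2088]           -> no special tokens

# decode exposes the analogous skip_special_tokens flag:
print(tok.decode(ids_default, skip_special_tokens=True))   # "hello world"
print(tok.decode(ids_default, skip_special_tokens=False))  # "[CLS] hello world [SEP]"
```

With the backend defaulting to False, prompts tokenized by the preprocessing model behave like ids_plain above, which is not what users coming from HF expect unless they remember to override the parameter in config.pbtxt.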

@XiaobingSuper force-pushed the xiaobing/add_special_tokens branch from 9530bf9 to a1e8233 on May 8, 2024 09:06
@XiaobingSuper (Contributor, Author) commented

@schetlur-nv

@XiaobingSuper force-pushed the xiaobing/add_special_tokens branch from a1e8233 to 84ff2d1 on May 8, 2024 09:52
@XiaobingSuper changed the title from "make add_special_tokens default value is true which align with hf setting" to "make add_special_tokens/skip_special_tokens default value is true which align with hf setting" on May 8, 2024
@kaiyux (Collaborator) commented May 14, 2024

Hi @XiaobingSuper, thanks a lot for your great contribution! We've integrated your changes in PR #454 and credited you as co-author, so I'm going to close this PR.

Let me know if you have any questions, and thanks again for helping make TensorRT-LLM better.
