
Conversation

@ypwhs (Contributor) commented Mar 24, 2023

tokenizer.eos_token_id outputs 20002, but the eos_token_id defined in ChatGLM-6B's config.json is 150005, so the hardcoded value needs to be changed to 150005.

[screenshot: tokenizer.eos_token_id returning 20002]

{
  "_name_or_path": "THUDM/chatglm-6b",
  "architectures": [
    "ChatGLMModel"
  ],
  "auto_map": {
    "AutoConfig": "configuration_chatglm.ChatGLMConfig",
    "AutoModel": "modeling_chatglm.ChatGLMForConditionalGeneration",
    "AutoModelForSeq2SeqLM": "modeling_chatglm.ChatGLMForConditionalGeneration"
  },
  "bos_token_id": 150004,
  "eos_token_id": 150005,
  "hidden_size": 4096,
  "inner_hidden_size": 16384,
  "layernorm_epsilon": 1e-05,
  "max_sequence_length": 2048,
  "model_type": "chatglm",
  "num_attention_heads": 32,
  "num_layers": 28,
  "position_encoding_2d": true,
  "torch_dtype": "float16",
  "transformers_version": "4.23.1",
  "use_cache": true,
  "vocab_size": 150528
}
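The config.json excerpt above can be checked directly. A minimal sketch, parsing the quoted excerpt with Python's json module instead of downloading the model (the string below is trimmed to the token-id fields):

```python
import json

# Token-id fields from the config.json quoted above.
config_json = """
{
  "bos_token_id": 150004,
  "eos_token_id": 150005,
  "vocab_size": 150528
}
"""

config = json.loads(config_json)
# The correct end-of-sequence id per ChatGLM-6B's config.json:
print(config["eos_token_id"])  # 150005
```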

@mymusise (Owner) commented

Thanks for this PR!

It looks like a bug in the ChatGLM tokenizer config; this may be why generation doesn't stop at the end of a sentence.
#55 #60

But I'd rather read eos_token_id from the config file than hardcode a magic number:

from transformers import AutoConfig

config = AutoConfig.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True, device_map='auto')
config.eos_token_id
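A minimal sketch of this config-over-magic-number pattern (`Config` here is a hypothetical stand-in for the object AutoConfig returns; real code would load it as shown above):

```python
class Config:
    """Stand-in for the loaded model config; only the field we need here."""
    eos_token_id = 150005  # value taken from ChatGLM-6B's config.json

def resolve_eos_token_id(config, fallback=150005):
    # Prefer the id carried by the model config; fall back only if it is absent.
    eos = getattr(config, "eos_token_id", None)
    return eos if eos is not None else fallback

print(resolve_eos_token_id(Config()))  # 150005
```

This way a future change to the model's config.json propagates automatically instead of silently diverging from a hardcoded constant.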

@ypwhs (Contributor, Author) commented Mar 24, 2023

> Thanks for this PR!
>
> It looks like a bug in the ChatGLM tokenizer config; this may be why generation doesn't stop at the end of a sentence. #55 #60
>
> But I'd rather read eos_token_id from the config file than hardcode a magic number:
>
> config = AutoConfig.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True, device_map='auto')
> config.eos_token_id

Your solution is better.

@ypwhs ypwhs closed this Mar 24, 2023
@ypwhs ypwhs mentioned this pull request Mar 24, 2023
@dumpmemory commented

[screenshot: Screen Shot 2023-03-24 at 6:40 PM]
It should be 150005. Please check the stream_chat function.
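A toy loop illustrates why the wrong id keeps generation from stopping (150005 is the real end-of-sequence id from config.json; the token stream below is made up for illustration):

```python
EOS_TOKEN_ID = 150005  # from ChatGLM-6B's config.json

def generate(token_stream, eos_token_id):
    # Emit tokens until the end-of-sequence id appears.
    out = []
    for tok in token_stream:
        if tok == eos_token_id:
            break  # stop at end of sequence
        out.append(tok)
    return out

print(generate([5, 7, 150005, 9], EOS_TOKEN_ID))  # [5, 7]
# With the wrong id (20002), the eos token is never matched and output runs on:
print(generate([5, 7, 150005, 9], 20002))  # [5, 7, 150005, 9]
```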

@ypwhs ypwhs deleted the patch-1 branch March 27, 2023 07:58