
[BUG/Help] tokenizer collapse? #1446

Open
1 task done
CoinCheung opened this issue Jan 12, 2024 · 0 comments

CoinCheung commented Jan 12, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

Two distinct token ids (1833 and 2893) decode to the same string, i.e. the tokenizer's decoding collapses different ids into identical text.

Expected Behavior

No response

Steps To Reproduce

from transformers import AutoConfig, AutoTokenizer

model_name = 'THUDM/chatglm3-6b-base'
config = AutoConfig.from_pretrained(model_name, trust_remote_code=True, use_fast=False)
tokenizer = AutoTokenizer.from_pretrained(model_name, config=config, trust_remote_code=True)
# Decode two distinct ids; both print the same string (see below).
print([tokenizer.decode(el) for el in [1833, 2893]])
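
One way to narrow this down (a sketch, not part of the original report; it assumes the custom ChatGLM tokenizer supports the standard convert_ids_to_tokens API) is to look at the raw vocabulary pieces behind the two ids:

# If the underlying pieces differ but decode() returns identical text,
# the collapse happens during decoding rather than in the vocabulary itself.
print(tokenizer.convert_ids_to_tokens([1833, 2893]))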

Environment

- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :

Anything else?

On my platform, both ids decode to the same output:
[screenshot: terminal output showing the identical decoded string for ids 1833 and 2893]
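
For anyone triaging, here is a minimal sketch (my own, not from the report) that scans the whole vocabulary for decode collisions like this one; it assumes the checkpoint is reachable and that the custom tokenizer exposes the usual vocab_size property:

from collections import defaultdict
from transformers import AutoTokenizer

model_name = 'THUDM/chatglm3-6b-base'
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# Group every id by its decoded string to surface collisions like 1833/2893.
# Caveat: single byte-fallback pieces can legitimately decode to the same
# replacement character, so treat hits as candidates rather than confirmed bugs.
decoded = defaultdict(list)
for token_id in range(tokenizer.vocab_size):
    decoded[tokenizer.decode([token_id])].append(token_id)

collisions = {text: ids for text, ids in decoded.items() if len(ids) > 1}
print(f"{len(collisions)} decoded strings map back to more than one id")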
