Newly uploaded image_t2x.json would return assertation error #82

Open
jwzhi opened this issue Mar 5, 2024 · 1 comment

Comments


jwzhi commented Mar 5, 2024

I am trying to train the text2image direction using the newly uploaded image_t2x dataset. However, it raises an AssertionError in anyToImageVideoAudio.py, line 410: assert e - s + 1 == num_gen_tokens, (s, e).

Looking closer, one image example in image_t2x.json gives e=124 and s=118, so e-s+1=7, while num_gen_tokens is a predefined hyperparameter set to 4. The mismatch raises the assertion error.

The mismatch originates at line 303 of utils.py under model/common. For the sentence 'Of course, I can assist you with that! Behold, a captivating vector illustration showcasing a hand delicately crafting a heart shape. The vibrant colors and meticulous details make this graphic both eye-catching and heartwarming. I hope you find it delightful.[IMG0] [IMG1] [IMG2] [IMG3]\n###', the tokenizer returns [4587, 3236, 29892, 306, 508, 6985, 366, 411, 393, 29991, 1522, 8948, 29892, 263, 4332, 440, 1218, 4608, 8632, 362, 1510, 29883, 5832, 263, 1361, 628, 293, 2486, 25554, 292, 263, 5192, 8267, 29889, 450, 325, 4626, 424, 11955, 322, 1539, 12906, 681, 4902, 1207, 445, 3983, 293, 1716, 10977, 29899, 12510, 292, 322, 5192, 29893, 2817, 292, 29889, 306, 4966, 366, 1284, 372, 15319, 1319, 29889, 32002, 259, 32003, 259, 32004, 259, 32005, 29871, 13, 2277, 29937]. The span from 32002 to 32005 therefore contains 7 tokens, because an extra 259 sits between each pair of image tokens, whereas only 4 are expected. This is what triggers the assertion error.
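The arithmetic can be reproduced on the token ids alone, without loading the model. The following is a minimal sketch that uses only the tail of the list quoted above; it assumes that [IMG0] to [IMG3] map to ids 32002 to 32005, as the output suggests. The indices differ from the s=118, e=124 reported earlier because the prompt is not included here, but the span length is the same.

```python
# Tail of the tokenizer output quoted above (prompt part omitted).
tail = [15319, 1319, 29889, 32002, 259, 32003, 259, 32004, 259, 32005,
        29871, 13, 2277, 29937]

# Assumption: [IMG0]..[IMG3] correspond to ids 32002..32005.
first_img_id, last_img_id = 32002, 32005
num_gen_tokens = 4

s = tail.index(first_img_id)   # position of [IMG0]
e = tail.index(last_img_id)    # position of [IMG3]
span = tail[s:e + 1]

# Ids inside the span that are not image signal tokens.
extra = [t for t in span if not first_img_id <= t <= last_img_id]

print("span length:", e - s + 1, "expected:", num_gen_tokens)
print("unexpected ids inside the span:", extra)
```

The three unexpected ids are all 259, one per space between consecutive image tokens in the sentence.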

Which part of this is going wrong, and how should I correct it? @ChocoWu

Thank you~

Collaborator

ChocoWu commented Mar 6, 2024

Hi, @jwzhi
I guess you are using an updated Vicuna (not v0). I ran into the same issue with the updated tokenizer: it tokenizes “[IMG0] [IMG1] [IMG2] [IMG3]” into “32002, 259, 32003, 259, 32004, 259, 32005” instead of the expected “32002, 32003, 32004, 32005”. Tokenizing “[IMG0]” on its own, however, gives the expected result, “32002”.
Unfortunately, I haven't been able to identify the cause of this issue yet.
For now, I recommend either reverting to Vicuna-v0 or modifying the code to handle the extra tokens.
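One possible shape for the "modify the code" route is to drop the stray ids between the signal tokens after tokenization. This is only a sketch, not the project's fix: squeeze_signal_span is a hypothetical helper, the signal ids 32002 to 32005 are taken from the output quoted in this thread, and it assumes a single image span per sequence. Any labels or attention mask built alongside input_ids would need the same filtering.

```python
def squeeze_signal_span(input_ids, signal_ids):
    """Remove non-signal ids lying between the first and last signal token.

    Hypothetical helper; assumes at most one signal span per sequence.
    """
    signal_set = set(signal_ids)
    positions = [i for i, t in enumerate(input_ids) if t in signal_set]
    if not positions:
        return list(input_ids)          # nothing to do
    s, e = positions[0], positions[-1]
    kept = [t for t in input_ids[s:e + 1] if t in signal_set]
    return list(input_ids[:s]) + kept + list(input_ids[e + 1:])


# Ids taken from the tokenizer output quoted in this thread.
signal_ids = [32002, 32003, 32004, 32005]
ids = [29889, 32002, 259, 32003, 259, 32004, 259, 32005, 29871, 13]

fixed = squeeze_signal_span(ids, signal_ids)
s, e = fixed.index(32002), fixed.index(32005)
print(fixed, e - s + 1)
```

With the extra ids removed, e - s + 1 equals 4 again, which is what the assertion on num_gen_tokens expects. If a sample contains several separate signal spans, this helper would also delete the text between them, so it would need to be applied per span.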
