
Loading a local .pt file with from_pretrained fails because of a domino chain of errors. #51

Open
gilkzxc opened this issue Mar 25, 2024 · 8 comments

Comments

@gilkzxc

gilkzxc commented Mar 25, 2024

Using "http://one-peace-shanghai.oss-accelerate.aliyuncs.com/one-peace.pt" with from_pretrained(), I failed to load the pretrained model: the process just prints "Killed".
So I traced deeper to locate the bug, which led me to try checkpoint_utils.load_model_ensemble_and_task() directly; that also failed.
Eventually, I found that an AssertionError is raised at line 43 in setup_task():
"task is not None"
[screenshot]
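For context, this assertion pattern comes from a task-registry lookup: the loader reads the task name stored in the checkpoint and looks it up in a registry of task classes, and if the matching task was never registered (e.g. because the wrong fairseq is installed), the lookup returns None and the assert fires. A minimal sketch of that pattern (the registry, helper names, and task name here are illustrative, not fairseq's actual internals):

```python
# Sketch of a task-registry lookup like the one behind the failing assert.
# Registry, decorator, and task names are illustrative, not fairseq's real ones.
TASK_REGISTRY = {}

def register_task(name):
    """Decorator that records a task class under a string name."""
    def wrapper(cls):
        TASK_REGISTRY[name] = cls
        return cls
    return wrapper

def setup_task(task_name):
    # Mirrors the failing check: an unregistered name yields None.
    task_cls = TASK_REGISTRY.get(task_name)
    assert task_cls is not None, "task is not None"
    return task_cls()

@register_task("image_text_pretrain")  # hypothetical task name
class ImageTextPretrainTask:
    pass
```

So the AssertionError usually means the checkpoint's task name was never registered in the running Python environment, not that the checkpoint itself is corrupt.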

@logicwong
Member

It might be because you haven't installed fairseq under this repo. Try:

pip uninstall fairseq
git clone https://github.com/OFA-Sys/ONE-PEACE
cd ONE-PEACE
pip install -r requirements.txt
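To confirm that Python is actually resolving fairseq from the ONE-PEACE clone rather than a leftover pip install, you can check where the module would be imported from. A small stdlib-only sketch (fairseq itself may not be importable in every environment, so the helper is generic):

```python
import importlib.util

def module_origin(name):
    """Return the filesystem path a module would be imported from, or None
    if the module cannot be found at all."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec else None

# After installing, verify the path points into the ONE-PEACE checkout:
# print(module_origin("fairseq"))
```

If the printed path points at a site-packages copy instead of the repo, the stale install is still shadowing the repo's fairseq.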

@gilkzxc
Author

gilkzxc commented Apr 3, 2024

But I did install fairseq in that same directory.

@logicwong
Member

Can you share your execution command?

@gilkzxc
Author

gilkzxc commented Apr 3, 2024

[screenshot]
That just shows "Killed". So I used the equivalent calls that from_pretrained() makes internally, then load_model_ensemble_and_task(), etc.

@logicwong
Member

Could it be because there's not enough memory? Try this:
model = from_pretrained("one-peace.pt", device=device, dtype="float16")
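On Linux, a bare "Killed" message typically means the kernel's OOM killer terminated the process, so memory is a plausible culprit. As a rough back-of-the-envelope check, a checkpoint stored as float32 implies about file_size / 4 parameters, and the weights alone then need roughly that many bytes times the per-parameter size of the target dtype (actual peak usage is higher, since the checkpoint and the model coexist in memory during loading). A stdlib-only sketch of that estimate, under the float32-checkpoint assumption:

```python
import os

# Rough per-parameter sizes in bytes for common dtypes.
BYTES_PER_DTYPE = {"float32": 4, "float16": 2}

def estimated_weight_bytes(checkpoint_path, dtype="float32"):
    """Rough lower bound on RAM needed for the model weights alone.

    Assumes the checkpoint stores float32 tensors, so the parameter count
    is approximately file_size / 4. Peak usage during loading is higher
    because the checkpoint dict and the model coexist in memory.
    """
    file_size = os.path.getsize(checkpoint_path)
    n_params = file_size // 4
    return n_params * BYTES_PER_DTYPE[dtype]
```

Comparing that estimate against the machine's free RAM can quickly confirm or rule out the OOM explanation.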

@gilkzxc
Author

gilkzxc commented Apr 4, 2024

Tried with dtype="float16" and "float8"; same result: "Killed".

@gilkzxc
Author

gilkzxc commented Apr 4, 2024

I even reinstalled omegaconf to be sure it's 2.0.6, as required.
[screenshot]

@logicwong
Member

Can you switch to a machine with larger memory, and then try using model = from_pretrained("one-peace.pt", device='cpu', dtype="float16")? I haven't encountered this situation before.
