
Smaller / Distilled model? #6

Closed
johnpaulbin opened this issue Nov 2, 2021 · 10 comments

Comments

@johnpaulbin

Will there be a smaller or a distilled model release? The problem with inference in Google Colab is the speed: 4:32 for one image on a P100, and 2+ hours for 3 images on a K80.

@shonenkov
Collaborator

shonenkov commented Nov 3, 2021

Yes, sure! We will publish small versions of rudalle (135M) and rudolph (350M) for the New Year :)

@shonenkov shonenkov pinned this issue Nov 3, 2021
@neverix
Contributor

neverix commented Nov 3, 2021

#12 fixes some of the speed issues

@HighCWu

HighCWu commented Jan 7, 2022

It's a new year 🎉.
I can't wait for the small version of the rudalle model. A smaller model should not only run inference faster but also fine-tune faster. Although it may not carry as rich prior information as the larger model, if there is enough fine-tuning data, the results should still be great.
Waiting for your good news!!!

@shonenkov
Collaborator

@HighCWu small version "RuDOLPH 🦌🎄☃️ " 350M is available here https://github.com/sberbank-ai/ru-dolph

@HighCWu

HighCWu commented Jan 8, 2022

Wow, I see a lot of new features in the new repo. The new model can do more. Great job👍

@HighCWu

HighCWu commented Jan 8, 2022

@shonenkov I noticed that the hidden_size of ru-dolph has been halved, the image resolution has dropped to 128px, and the codes produced by the vqgan are now 16x16. Although this is indeed much faster, the generated images lose more detail; compared with 32x32 codes, the details are less accurate.
Will there be a small model with a hidden size of 1024 whose input is still 32x32 codes in the future?
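The speed/detail trade-off above can be sketched with some back-of-the-envelope arithmetic. This is a hypothetical illustration, not code from either repo; it assumes the VQGAN downsamples by a factor of 8 (consistent with 128px producing a 16x16 code grid) and that self-attention cost over the image tokens scales quadratically with sequence length.

```python
# Rough estimate of why halving the code grid speeds inference up so much.
# Assumptions (not stated in the thread): 8x VQGAN downsampling, and
# attention cost scaling quadratically with the number of image tokens.

def image_tokens(resolution_px, downsample=8):
    """Number of VQGAN codes for a square image at the given resolution."""
    side = resolution_px // downsample
    return side * side

rudalle_tokens = image_tokens(256)  # 32x32 grid -> 1024 tokens
rudolph_tokens = image_tokens(128)  # 16x16 grid -> 256 tokens

# 4x fewer tokens -> roughly 16x cheaper attention over the image codes,
# but each code must now describe 4x as many pixels, so detail suffers.
attention_cost_ratio = (rudalle_tokens / rudolph_tokens) ** 2

print(rudalle_tokens, rudolph_tokens, attention_cost_ratio)  # 1024 256 16.0
```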

@johnpaulbin
Author

@HighCWu

HighCWu commented Jan 9, 2022

@johnpaulbin Should this issue be closed? In fact, ru-dolph is an extended version of ru-dalle with a medium size (350M parameters); it is not the small version (135M) of ru-dalle that @shonenkov mentioned.

@shonenkov
Collaborator

@HighCWu Sorry for the late reply. For a quick and easy fine-tuning setup and fast inference, the community can use the Medium (350M) version of RuDOLPH. The Small version is not capable enough to generate images in the RuDALLE setting, so it will not be published.

@HighCWu

HighCWu commented Jan 9, 2022

@shonenkov Got it. And I want to know why ru-dolph uses 16x16 vqgan codes instead of the 32x32 used in ru-dalle. If the vqgan encodes a face as 16x16 codes, then after decoding, the eyes in the picture become very strange.
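One hedged way to see why small facial features break down: with a coarser code grid, each discrete code has to reconstruct a larger pixel patch. The numbers below are illustrative assumptions (8x VQGAN downsampling, an eye spanning roughly 16px in a 128px portrait), not measurements from either model.

```python
# Illustration of detail loss at a coarser code grid.
# Assumption: 8x VQGAN downsampling, so each code decodes to an 8x8 patch.

def codes_per_side(region_px, downsample=8):
    """How many codes (per side) describe a square region of the image."""
    return max(1, region_px // downsample)

# An eye spanning ~16px in a 128px image is described by only a 2x2
# block of codes, so fine structure (iris, eyelid) must come from
# very few discrete tokens -- hence the strange decoded eyes.
eye_codes_128 = codes_per_side(16)  # 2 codes per side

# The same face at 256px with 32x32 codes: the eye spans ~32px,
# giving a 4x4 block of codes, i.e. 4x more tokens for that region.
eye_codes_256 = codes_per_side(32)  # 4 codes per side
```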
