
Smaller / Distilled model? #6

Closed
johnpaulbin opened this issue Nov 2, 2021 · 10 comments

Comments

@johnpaulbin

Will there be a smaller or a distilled model release? The problem with inference in Google Colab is the speed: 4:32 for one image on a P100, and 2+ hours for 3 images on a K80.

@shonenkov
Collaborator

shonenkov commented Nov 3, 2021

Yes, sure! We will publish small versions of rudalle (135M) and rudolph (350M) for the New Year :)

@shonenkov shonenkov pinned this issue Nov 3, 2021
@neverix
Contributor

neverix commented Nov 3, 2021

#12 fixes some of the speed issues

@HighCWu

HighCWu commented Jan 7, 2022

It's a new year 🎉.
I can't wait for the small version of the rudalle model. A smaller model should not only run inference faster but also fine-tune faster. Although it may not carry as rich prior information as the larger model, if there is enough fine-tuning data, the results should still be great.
Waiting for your good news!!!

@shonenkov
Collaborator

@HighCWu small version "RuDOLPH 🦌🎄☃️ " 350M is available here https://github.com/sberbank-ai/ru-dolph

@HighCWu

HighCWu commented Jan 8, 2022

Wow, I see a lot of new features in the new repo. The new model can do more. Great job👍

@HighCWu

HighCWu commented Jan 8, 2022

@shonenkov I noticed that the hidden_size of ru-dolph has been halved, the image resolution has dropped to 128px, and the codes produced by the vqgan are now 16x16. Although this is indeed much faster, the generated images lose more detail; compared with 32x32 codes, the details are less accurate.
Will there be a small model with a hidden size of 1024 whose input is still 32x32 codes in the future?
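The speed/detail trade-off above can be sketched with some back-of-the-envelope arithmetic. This is a hypothetical illustration, not code from either repo; it assumes the VQGAN downsamples by a factor of 8 (consistent with 128px producing a 16x16 code grid) and that self-attention cost over the image tokens scales quadratically with sequence length.

```python
# Rough estimate of why halving the code grid speeds inference up so much.
# Assumptions (not stated in the thread): 8x VQGAN downsampling, and
# attention cost scaling quadratically with the number of image tokens.

def image_tokens(resolution_px, downsample=8):
    """Number of VQGAN codes for a square image at the given resolution."""
    side = resolution_px // downsample
    return side * side

rudalle_tokens = image_tokens(256)  # 32x32 grid -> 1024 tokens
rudolph_tokens = image_tokens(128)  # 16x16 grid -> 256 tokens

# 4x fewer tokens -> roughly 16x cheaper attention over the image codes,
# but each code must now describe 4x as many pixels, so detail suffers.
attention_cost_ratio = (rudalle_tokens / rudolph_tokens) ** 2

print(rudalle_tokens, rudolph_tokens, attention_cost_ratio)  # 1024 256 16.0
```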

@johnpaulbin
Author

@HighCWu

HighCWu commented Jan 9, 2022

@johnpaulbin Should this issue be closed? In fact, ru-dolph is an extended version of ru-dalle with a medium size (350M parameters); it is not the small version (135M) of ru-dalle that @shonenkov mentioned.

@shonenkov
Collaborator

@HighCWu Sorry for the late reply. For a quick and easy fine-tuning setup and fast inference, the community can use the Medium (350M) version of RuDOLPH. The Small version is not capable enough to generate images in the RuDALLE setting, so it will not be published.

@HighCWu

HighCWu commented Jan 9, 2022

@shonenkov Got it. And I want to know why ru-dolph uses 16x16 vqgan codes instead of the 32x32 used in ru-dalle. If the vqgan encodes a face as 16x16 codes, then after decoding, the eyes in the picture become very strange.
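One hedged way to see why small facial features break down: with a coarser code grid, each discrete code has to reconstruct a larger pixel patch. The numbers below are illustrative assumptions (8x VQGAN downsampling, an eye spanning roughly 16px in a 128px portrait), not measurements from either model.

```python
# Illustration of detail loss at a coarser code grid.
# Assumption: 8x VQGAN downsampling, so each code decodes to an 8x8 patch.

def codes_per_side(region_px, downsample=8):
    """How many codes (per side) describe a square region of the image."""
    return max(1, region_px // downsample)

# An eye spanning ~16px in a 128px image is described by only a 2x2
# block of codes, so fine structure (iris, eyelid) must come from
# very few discrete tokens -- hence the strange decoded eyes.
eye_codes_128 = codes_per_side(16)  # 2 codes per side

# The same face at 256px with 32x32 codes: the eye spans ~32px,
# giving a 4x4 block of codes, i.e. 4x more tokens for that region.
eye_codes_256 = codes_per_side(32)  # 4 codes per side
```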
