Wrong result #414

Open
Yurains opened this issue Jan 7, 2024 · 3 comments
Comments

Yurains commented Jan 7, 2024

Thank you for your outstanding work.
I am training my own model with StyleGAN2-ADA PyTorch and importing other photos with PTI,
but I ran into this error: "AssertionError: Wrong size for dimension 1: got 18, expected 12".
It looks like a dimension mismatch, but I'm not sure how to resolve it.
Is there a way to make the necessary changes?

Contributor

PDillis commented Jan 14, 2024

I haven't used PTI, but I can tell you where that error comes from and how to find where the code fails. In StyleGAN1/2, the mapping network $f$ (G.mapping) takes a random latent z ($z\in\mathbb{R}^{512}$) and outputs a disentangled latent w ($w\in\mathbb{R}^{1\times n\times512}$); for unconditional models, you simply do w = G.mapping(z, None). This disentangled latent w is what you need to find in order to edit with DragGAN (using either simple inversion or PTI), and its dimension $n$ depends on the image resolution of your dataset, i.e. the size of the images that will be generated.

Concretely, StyleGAN expects two sections of the disentangled latent per resolution block in the synthesis network $g$ (G.synthesis), which starts at 4 and goes up in powers of 2 until your final output resolution; there is more info in the StyleGAN architecture. So, from the AssertionError you posted above, it seems that PTI is giving you a disentangled latent of shape [1, 18, 512], whereas the network you are training expects a disentangled latent of shape [1, 12, 512]. In other words, PTI has hard-coded an image resolution of 1024 ($n=18$), whereas your StyleGAN2 model has a resolution of 128 ($n=12$).
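To make the relation between resolution and $n$ concrete, here is a minimal sketch that only encodes the rule described above (blocks at 4, 8, ..., up to the output resolution, two latent slots per block). The function name `expected_num_ws` is made up for illustration and is not part of StyleGAN2-ADA, PTI, or DragGAN.

```python
def expected_num_ws(resolution: int) -> int:
    """Number of w vectors (n) for a square, power-of-two output resolution.

    Assumption: one synthesis block per resolution 4, 8, ..., `resolution`,
    and two latent slots per block, as described in the comment above.
    """
    is_power_of_two = resolution > 0 and (resolution & (resolution - 1)) == 0
    if resolution < 4 or not is_power_of_two:
        raise ValueError(f"resolution must be a power of two >= 4, got {resolution}")
    log2_res = resolution.bit_length() - 1  # exact log2 for powers of two
    num_blocks = log2_res - 1               # blocks at 4, 8, ..., resolution
    return 2 * num_blocks


if __name__ == "__main__":
    for res in (128, 256, 512, 1024):
        print(res, expected_num_ws(res))
```

With this rule, 128 gives 12 and 1024 gives 18, which matches the two numbers in the AssertionError.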

I could be wrong and it could be the other way around, so it's always helpful to tell us which code you ran and which line raised the AssertionError above; otherwise all we can do is guess.
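One way to narrow this down before sharing a traceback is to compare the shape of the latent you pass in with what the generator expects. The sketch below is plain Python and does not call any real library; `describe_latent_mismatch` is a hypothetical helper, and the default `w_dim=512` is taken from the shapes mentioned in this thread.

```python
def describe_latent_mismatch(w_shape, num_ws, w_dim=512):
    """Return None if w_shape looks like [batch, num_ws, w_dim], else a message.

    Hypothetical helper for illustration only; it mirrors the wording of the
    error quoted in this issue but is not the library's own check.
    """
    if len(w_shape) != 3:
        return f"Expected 3 dimensions [batch, num_ws, w_dim], got {len(w_shape)}"
    _, got_ws, got_dim = w_shape
    if got_ws != num_ws:
        return f"Wrong size for dimension 1: got {got_ws}, expected {num_ws}"
    if got_dim != w_dim:
        return f"Wrong size for dimension 2: got {got_dim}, expected {w_dim}"
    return None


if __name__ == "__main__":
    # Latent produced for a 1024 model, fed to a generator that expects n = 12.
    print(describe_latent_mismatch((1, 18, 512), num_ws=12))
    # Matching shapes produce no message.
    print(describe_latent_mismatch((1, 12, 512), num_ws=12))
```

In a real session you would pass the shape of the inverted latent and the number of w vectors reported by the loaded generator (assumed here to be available as something like G.num_ws), for both the model used during inversion and the one used for editing.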

Author

Yurains commented Mar 3, 2024

@PDillis
Sorry for the late reply, and thank you very much for your answer.
I tried to combine PTI with DragGAN and determined which default model it expects.
[Image: 1]

This is the official default model, and it works without any problem.

Here I made sure my dimensions are correct and used the specified [1, 18, 512], but the error still occurs.
[Screenshot 2024-03-04 035801]

Author

Yurains commented Mar 3, 2024

@PDillis If you need the code, I can email it to you. Thanks.
