
Generate image of arbitrary dimension #2

Open
amarzullo24 opened this issue Jan 30, 2017 · 5 comments

Comments

@amarzullo24

amarzullo24 commented Jan 30, 2017

Hello,
Is it possible to generate images of arbitrary dimensions? I would like to try the framework with images of arbitrary size, but I don't know what I have to change in the code. The images generated in the output grid are all square.

It would also be sufficient for me to generate rectangular images, as in the sky generator, but changing the parameters as in those files (i.e. doubling the scale parameter) results in an error when running train.lua.

Here is the stack trace:

Number of free parameters in D: 11904795
Number of free parameters in G: 5179396
Copying model to gpu...
Loading new training data...
/home/user/torch/install/bin/luajit: inconsistent tensor size at /home/user/torch/pkg/torch/lib/TH/generic/THTensorCopy.c:7
stack traceback:
        [C]: at 0x7fd5bfd1cf60
        [C]: in function '__newindex'
        ./utils/nn_utils.lua:159: in function 'visualizeProgress'
        train.lua:232: in function 'main'
        train.lua:263: in main chunk
        [C]: in function 'dofile'
        ...user/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
        [C]: at 0x00405d50

Thanks for your help

@aleju
Owner

aleju commented Jan 30, 2017

The models are adapted to 32x32 images. While D may technically handle other image sizes (it might still perform badly or slowly), G will simply continue to generate 32x32 images, no matter the parameter settings. The other parts of the scripts might be able to handle other image sizes (not tested).
You can try replacing the function models.create_G_decoder_upsampling32c in models.lua with the following (for 64x64 images):

function models.create_G_decoder_upsampling32c(dimensions, noiseDim)
    local model = nn.Sequential()
    -- 4x4
    model:add(nn.Linear(noiseDim, 512*4*4))
    --model:add(nn.BatchNormalization(512*4*4))
    model:add(nn.PReLU(nil, nil, true))
    model:add(nn.View(512, 4, 4))
    
    -- 4x4 -> 8x8
    model:add(nn.SpatialUpSamplingNearest(2))
    model:add(cudnn.SpatialConvolution(512, 512, 3, 3, 1, 1, (3-1)/2, (3-1)/2))
    model:add(nn.SpatialBatchNormalization(512))
    model:add(nn.PReLU(nil, nil, true))
    
    -- 8x8 -> 16x16
    model:add(nn.SpatialUpSamplingNearest(2))
    model:add(cudnn.SpatialConvolution(512, 256, 3, 3, 1, 1, (3-1)/2, (3-1)/2))
    model:add(nn.SpatialBatchNormalization(256))
    model:add(nn.PReLU(nil, nil, true))
    
    -- 16x16 -> 32x32
    model:add(nn.SpatialUpSamplingNearest(2))
    model:add(cudnn.SpatialConvolution(256, 128, 5, 5, 1, 1, (5-1)/2, (5-1)/2))
    model:add(nn.SpatialBatchNormalization(128))
    model:add(nn.PReLU(nil, nil, true))
    
    -- 32x32 -> 64x64
    model:add(nn.SpatialUpSamplingNearest(2))
    model:add(cudnn.SpatialConvolution(128, 128, 5, 5, 1, 1, (5-1)/2, (5-1)/2))
    model:add(nn.SpatialBatchNormalization(128))
    model:add(nn.PReLU(nil, nil, true))

    model:add(cudnn.SpatialConvolution(128, dimensions[1], 3, 3, 1, 1, (3-1)/2, (3-1)/2))
    model:add(nn.Sigmoid())

    model = require('weight-init')(model, 'heuristic')

    return model
end

which simply adds one more upsampling step to the generator model (from 32x32 to 64x64).
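To see the size arithmetic, here is a small sketch in plain Python (independent of Torch) that traces the spatial size through the modified generator:

```python
# Each SpatialUpSamplingNearest(2) doubles height and width; the stride-1
# convolutions with (kernel-1)/2 padding leave the spatial size unchanged.
h, w = 4, 4  # after nn.View(512, 4, 4)
for step in range(4):  # 4x4 -> 8x8 -> 16x16 -> 32x32 -> 64x64
    h, w = 2 * h, 2 * w
print(h, w)  # 64 64
```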

The V model is also not adapted to 64x64 images. It would be best to simply deactivate its usage by removing or commenting out the following lines (177 to 182) in utils/nn_utils.lua:

    local rndImagesRating = nn_utils.rateWithV(rndImages)
    local goodImagesRating = nn_utils.rateWithV(goodImages)
    local badImagesRating = nn_utils.rateWithV(badImages)
    table.insert(PLOT_DATA, {EPOCH, rndImagesRating, goodImagesRating, badImagesRating})
    print(string.format("<nnutils viz> [V] semiRandom: %.4f, goodImages: %.4f, badImages: %.4f", rndImagesRating, goodImagesRating, badImagesRating))
    DISP.plot(PLOT_DATA, {win=OPT.window+5, labels={'epoch', 'V(semiRandom)', 'V(goodImages)', 'V(badImages)'}, title='Rating by V'})

I think it was only used there.
If you pretrained a generator model according to the instructions, you will have to deactivate its usage by calling train.lua with the flag --G_pretrained_dir="NONE".

Then training with 64x64 images might work. The results may still be bad, as none of the networks were optimized for that resolution.

@amarzullo24
Author

Great, I will try.
And what about generating images of size 64x32, as in the sky-generator? Is that still possible?

@aleju
Owner

aleju commented Jan 30, 2017

D might still be able to handle this.
The G function posted above should then start with

    model:add(nn.Linear(noiseDim, 512*4*2))
    --model:add(nn.BatchNormalization(512*4*2))
    model:add(nn.PReLU(nil, nil, true))
    model:add(nn.View(512, 4, 2))

(The second 4 is decreased to 2 in each of these lines.)
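The same size arithmetic, sketched in plain Python (independent of Torch), assuming the four upsampling steps are kept unchanged:

```python
# Starting from nn.View(512, 4, 2), four nearest-neighbour 2x upsampling
# steps yield a 64x32 output.
h, w = 4, 2
for step in range(4):  # 4x2 -> 8x4 -> 16x8 -> 32x16 -> 64x32
    h, w = 2 * h, 2 * w
print(h, w)  # 64 32
```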

Not sure whether all of the visualization code can handle that image shape.

@amarzullo24
Author

Yes, indeed the error seems to be related to the visualization. I will have to look into this more carefully.

@aleju
Owner

aleju commented Jan 30, 2017

The error posted above (first post) is related to G. The visualization creates matrices according to the --width and --height parameters, then generates images via G and tries to place them in those matrices. But since G ignores the width/height parameters and always uses 32x32, the dimensions don't match, resulting in a crash.
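As an illustration of this failure mode, here is a sketch in Python/NumPy rather than Torch (the grid shape is an assumption for illustration, not the actual code's):

```python
# Copying a 32x32 generated image into a 64x64 slot of the output grid
# fails because the tensor sizes are inconsistent -- the same class of
# error as the "inconsistent tensor size" crash in the stack trace above.
import numpy as np

grid_cell = np.zeros((3, 64, 64))   # slot sized by --width/--height
generated = np.zeros((3, 32, 32))   # G's actual output size
try:
    grid_cell[:] = generated
except ValueError as e:
    print("copy failed:", e)
```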
