Question about data augmentation in your ResNet image classifier example #8

jossgillet opened this issue Jan 27, 2021 · 2 comments

@jossgillet
Hello Ben - how can we remove the data augmentation step in your ResNet example? I need to pass the entire image during training (no cropping).

I tried modifying the variables train_transforms and test_transforms to remove the rotation, horizontal flip and cropping, keeping only .Resize(), .ToTensor() and .Normalize(). So the only thing I've modified in your script is:

train_transforms = transforms.Compose([
                           transforms.Resize(pretrained_size),
                           transforms.ToTensor(),
                           transforms.Normalize(mean = pretrained_means, 
                                                std = pretrained_stds)
                       ])

test_transforms = transforms.Compose([
                           transforms.Resize(pretrained_size),
                           transforms.ToTensor(),
                           transforms.Normalize(mean = pretrained_means, 
                                                std = pretrained_stds)
                       ])

But then, when running the training loop, I get this error message:

invalid argument 0: Sizes of tensors must match except in dimension 0. Got 630 and 513 in dimension 3 at /opt/conda/conda-bld/pytorch_1579022060824/work/aten/src/TH/generic/THTensor.cpp:612

Any idea how to fix that?

Many thanks

@bentrevett
Owner

The reason for this is that if the argument to transforms.Resize is an integer, it rescales the image so that only the shorter edge becomes pretrained_size. The longer edge is scaled proportionally, and because the images in your dataset have different aspect ratios, the length of that longer edge varies from image to image, so the resulting tensors can't be stacked into a batch.
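To see this concretely, here's a minimal sketch (the image sizes are made up for illustration): two images with different aspect ratios come out of an integer Resize with different shapes, and stacking them into a batch then fails for the same reason as your traceback:

import torch
from PIL import Image
from torchvision import transforms

resize = transforms.Compose([transforms.Resize(224), transforms.ToTensor()])

wide = Image.new('RGB', (800, 500))  # W x H, wider than tall
tall = Image.new('RGB', (500, 800))  # taller than wide

a, b = resize(wide), resize(tall)
print(a.shape)  # torch.Size([3, 224, 358]) - shorter edge (height) -> 224
print(b.shape)  # torch.Size([3, 358, 224]) - shorter edge (width) -> 224

torch.stack([a, b])  # RuntimeError: sizes of tensors must match

This is exactly what the DataLoader's default collate function does under the hood, which is where your error comes from.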

If you really don't want to augment, the fix is to change the resize to transforms.Resize((pretrained_size, pretrained_size)). The downside is that if you have non-square images, the longer dimension will be squashed and the images will be distorted, potentially so much that your model won't be able to classify them effectively.
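If the distortion worries you, one alternative (not from the notebook, just a sketch using Pillow's ImageOps.pad, available in Pillow 8.0+) is to letterbox instead: resize so the image fits inside a pretrained_size square while keeping its aspect ratio, and pad the remainder, so the full image stays visible without squashing:

from PIL import ImageOps
from torchvision import transforms

def letterbox(img):
    # Resize keeping the aspect ratio, then pad out to a square of
    # pretrained_size x pretrained_size (padding defaults to black).
    return ImageOps.pad(img, (pretrained_size, pretrained_size))

train_transforms = transforms.Compose([
    transforms.Lambda(letterbox),
    transforms.ToTensor(),
    transforms.Normalize(mean=pretrained_means, std=pretrained_stds)
])

The padded borders carry no information, but the model at least sees the whole image undistorted.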

Why do you not want to crop your images?

@jossgillet
Author

Many thanks Ben for your reply - it's clear now. I'm trying to avoid the cropping step in the augmentation because I need the model to see the full image. The differences between classes are often at the edges of the images, and by center-cropping them the model might miss what distinguishes each class. So I'd like to ensure it learns from the entire image rather than a cropped zone.

Hope that makes sense.
