Question about data augmentation in your ResNet image classifier example #8

jossgillet opened this issue Jan 27, 2021 · 2 comments

@jossgillet
Hello Ben - how can we remove the data augmentation step in your ResNet example? I need to pass the entire image during training (no cropping).

I tried modifying the variables train_transforms and test_transforms to remove the rotation, horizontal flip and cropping, keeping only .Resize(), .ToTensor() and .Normalize(). So the only thing I've modified in your script is:

train_transforms = transforms.Compose([
                           transforms.Resize(pretrained_size),
                           transforms.ToTensor(),
                           transforms.Normalize(mean = pretrained_means, 
                                                std = pretrained_stds)
                       ])

test_transforms = transforms.Compose([
                           transforms.Resize(pretrained_size),
                           transforms.ToTensor(),
                           transforms.Normalize(mean = pretrained_means, 
                                                std = pretrained_stds)
                       ])

But then, when running the training loop, I get this error message:

invalid argument 0: Sizes of tensors must match except in dimension 0. Got 630 and 513 in dimension 3 at /opt/conda/conda-bld/pytorch_1579022060824/work/aten/src/TH/generic/THTensor.cpp:612

Any idea how to fix that?

Many thanks

@bentrevett
Owner

The reason for this is that if the argument to transforms.Resize is an integer, it rescales the image so that only the shorter edge becomes pretrained_size. The longer edge is scaled proportionally, and because the images in your dataset have different aspect ratios, the length of that longer edge varies from image to image, so the resulting tensors can't be stacked into a batch.
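To see this concretely, here's a minimal sketch (the image sizes are made up for illustration): two images with different aspect ratios come out of an integer Resize with different shapes, and stacking them into a batch then fails for the same reason as your traceback:

import torch
from PIL import Image
from torchvision import transforms

resize = transforms.Compose([transforms.Resize(224), transforms.ToTensor()])

wide = Image.new('RGB', (800, 500))  # W x H, wider than tall
tall = Image.new('RGB', (500, 800))  # taller than wide

a, b = resize(wide), resize(tall)
print(a.shape)  # torch.Size([3, 224, 358]) - shorter edge (height) -> 224
print(b.shape)  # torch.Size([3, 358, 224]) - shorter edge (width) -> 224

torch.stack([a, b])  # RuntimeError: sizes of tensors must match

This is exactly what the DataLoader's default collate function does under the hood, which is where your error comes from.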

If you really don't want to augment, the fix is to change the resize to transforms.Resize((pretrained_size, pretrained_size)). The downside is that if you have non-square images, the longer dimension will be squashed and the images will be distorted, potentially so much that your model won't be able to classify them effectively.
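If the distortion worries you, one alternative (not from the notebook, just a sketch using Pillow's ImageOps.pad, available in Pillow 8.0+) is to letterbox instead: resize so the image fits inside a pretrained_size square while keeping its aspect ratio, and pad the remainder, so the full image stays visible without squashing:

from PIL import ImageOps
from torchvision import transforms

def letterbox(img):
    # Resize keeping the aspect ratio, then pad out to a square of
    # pretrained_size x pretrained_size (padding defaults to black).
    return ImageOps.pad(img, (pretrained_size, pretrained_size))

train_transforms = transforms.Compose([
    transforms.Lambda(letterbox),
    transforms.ToTensor(),
    transforms.Normalize(mean=pretrained_means, std=pretrained_stds)
])

The padded borders carry no information, but the model at least sees the whole image undistorted.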

Why do you not want to crop your images?

@jossgillet
Author

Many thanks Ben for your reply - it's clear now. I'm trying to avoid the cropping step in the augmentation because I need the model to see the full image. The differences between classes are often at the edges of the images, and by center-cropping them the model might miss what distinguishes each class. So I'd like to ensure it learns from the entire image rather than a cropped zone.

Hope that makes sense.
