Question about data augmentation #5

Open · yuanshuai220 opened this issue Sep 28, 2017 · 13 comments

@yuanshuai220

Thanks for your code, it helps me a lot. But I have some questions about data augmentation. In generate_train_lap_pry.m you only used downsizing to make more training data, while in the paper the author augments the training data in three ways: scaling, rotation, and flipping. Your performance is better than the paper's, yet your training data has only 7,488 examples. I'm confused about this.

@ZhangDY827

ZhangDY827 commented Sep 28, 2017

@yuanshuai220 Hi, I am reproducing the paper's results at the moment. The training data provided here is only a tiny sample; you can collect BSD200, T91, and General100 (391 images in total) as your training dataset using generate_train_lap_pry.m. That gives me a training set of size (11712, 1, 32, 32). After 200 epochs I get an average PSNR of 31.32 on Set5 for 4x. After several tests, I find that the training dataset plays an important role in the results: the richer the training data, the better the result. Data augmentation also matters; you can add scaling, rotation, and flipping to the generate_train_lap_pry.m script yourself.
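For reference, Set5 PSNR is normally computed on the Y channel with a few border pixels shaved off. A minimal sketch; the shave width here is my assumption, not something from this repo:

```python
import numpy as np

def psnr_y(sr, hr, shave=4):
    """PSNR between two uint8 Y-channel images, shaving `shave`
    border pixels, as is common for Set5 x4 evaluation."""
    sr = sr[shave:-shave, shave:-shave].astype(np.float64)
    hr = hr[shave:-shave, shave:-shave].astype(np.float64)
    mse = np.mean((sr - hr) ** 2)
    return 10.0 * np.log10(255.0 ** 2 / mse)
```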

@yuanshuai220
Author

@CasdDesnDR I agree with you. If the training data is not enough, the neural network will overfit the training set, so the performance on the test set will not be good. I will add rotation and flipping to generate_train_lap_pry.m.

@twtygqyy
Owner

@yuanshuai220 @CasdDesnDR Please refer to https://github.com/twtygqyy/pytorch-SRResNet/blob/master/data/generate_train_srresnet.m for adding flipping and rotation.
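Flipping and rotation amount to generating up to eight flip/rotation variants of each patch. A rough Python sketch of the idea (the linked script itself is MATLAB and may differ in details):

```python
import numpy as np

def augment(patch):
    """Yield the 8 flip/rotation variants of an image patch
    (4 rotations x optional horizontal flip)."""
    for k in range(4):                 # 0, 90, 180, 270 degree rotations
        rotated = np.rot90(patch, k)
        yield rotated
        yield np.fliplr(rotated)       # plus the mirrored copy
```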

@baiyancheng20

@twtygqyy Hi, thank you for sharing your code. I want to know why you convert the RGB images into the YCbCr colour space and use only the Y-channel information. How do the results compare when all three RGB channels are used directly?

@twtygqyy
Owner

Hi @baiyancheng20, I followed the LapSRN paper for the implementation. You can also check https://github.com/twtygqyy/pytorch-SRResNet, where I used RGB images as inputs.
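For reference, a minimal Python sketch of the Y-only preprocessing (note: PIL's YCbCr conversion is the full-range JPEG variant, while MATLAB's rgb2ycbcr is the limited-range BT.601 one, so numbers can differ slightly; the filename is just an example):

```python
from PIL import Image
import numpy as np

img = Image.open('baby_GT.bmp').convert('YCbCr')  # any Set5 image
y, cb, cr = img.split()
y = np.asarray(y, dtype=np.float32) / 255.0       # Y channel fed to the network
```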

@sriprabhar

@twtygqyy Hi, thank you for sharing your LapSRN code. I took your PyTorch code from GitHub and executed it; it works only for grayscale images.
I modified lapsrn.py to add support for RGB color images. Then I took just one color image from the Urban100 dataset and applied the augmentations from your MATLAB code (generate_train_lap_pry.m), which gave me around 165 color image patches of size 128x128. Using these patches (as an h5 file) I trained the network for 100 epochs.
For testing, I modified test.py for color images and gave it a 32x32 image cropped from the original training image as input. The results are very poor and I'm not sure where I'm going wrong. I have attached my modified code and results; any technical advice on how to proceed would be much appreciated.

sourcefiles.zip

@twtygqyy
Owner

@sriprabhar Hi, I understand that you tried to overfit the network on a small dataset. What does the loss look like during training? Did it converge well?

@sriprabhar

Thanks for your response.
I took the building image (attached) and extracted several overlapping patches.

Training trial 1

- stride = 64
- number of patches = 15x11, each of size 128x128
- convergence noticed (attached plot no. 1)
- trained for 100 epochs

Training trial 2

- stride = 16
- number of patches = 41x57, each of size 128x128
- convergence noticed (attached plot no. 2)
- trained for 5 epochs using the trial 1 model as the pre-trained model

Test images

1. a 32x32 image cropped from the building image
2. an image patch taken from the building image used for training, downsampled to 32x32

I'm not sure how to solve this if it is an overfitting problem. Please have a look at the attachments.

Attachments: figure_trainingtrial1, figure_trainingtrial2, figure_building, figure_patch, building

@sriprabhar

Hi,
Also, for training we have to create the dataset in HDF5 format using the MATLAB code. When creating the h5 file from patches of a single image, the file size is huge: around 500 MB for 165 color patches, and around 3 GB for 57x41 patches.
If I have a folder containing around 50 images of size 1080x1080 and run the MATLAB code for RGB color images, the system hangs.
I'm not sure whether I'm following the correct method for dataset creation and training. Thanks for any help/suggestions.

@twtygqyy
Owner

@sriprabhar Hi, I think the way you generate the h5 file is correct, but you are probably getting too many small patches out of the 1080x1080 images because of their size. (3 GB is not that big, TBH :) )

A quick fix is to increase the stride when you run the MATLAB generation code.
The better fix is to generate multiple h5 files, write a generator that takes the folder containing the h5 files as input, and fetch data from one h5 file at a time.
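A minimal sketch of such a generator, assuming h5py and that each file stores 'data' and 'label_x4' datasets (those key names are an assumption, not necessarily what the MATLAB script writes):

```python
import glob
import h5py

def h5_folder_batches(folder, batch_size=64):
    """Iterate over a folder of .h5 files, keeping only one file
    open at a time so memory use stays bounded.
    Yields full batches only; the remainder of each file is dropped."""
    for path in sorted(glob.glob(folder + '/*.h5')):
        with h5py.File(path, 'r') as f:
            data, label = f['data'], f['label_x4']  # key names assumed
            for i in range(0, data.shape[0] - batch_size + 1, batch_size):
                yield data[i:i + batch_size], label[i:i + batch_size]
```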

@twtygqyy
Owner

@sriprabhar Also, the result you plotted makes sense to me, because the image you tested might not be exactly the same one you used in training. Grab an image from the h5 file you used for training and see if the result looks better.
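Concretely, that sanity check could look like the sketch below; the h5 key and the checkpoint layout are assumptions, not the repo's exact test.py:

```python
import h5py
import torch

# One patch the model actually saw during training.
with h5py.File('train.h5', 'r') as f:
    lr = torch.from_numpy(f['data'][0:1]).float()

checkpoint = torch.load('model_epoch_100.pth')
model = checkpoint['model']          # checkpoint layout assumed
model.eval()
with torch.no_grad():
    out = model(lr)                  # compare this against the stored label
```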

@sriprabhar

Thank you for your response; I will try with a training patch.
One more doubt: LapSRN works on the Y component alone. We combined the bicubic-interpolated Cb and Cr channels with the LapSRN super-resolved Y component, and the results were good. If the Y component is sufficient for training and for PSNR measurement, I would like to know why we have to train on RGB images (as in SRResNet).
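For completeness, that reconstruction step as a small sketch (function name and conventions are mine; sr_y is the super-resolved Y in [0, 1], lr_img a PIL RGB image):

```python
from PIL import Image
import numpy as np

def merge_with_bicubic_chroma(sr_y, lr_img, scale=4):
    """Combine a super-resolved Y channel with bicubically
    upscaled Cb/Cr taken from the low-resolution input."""
    w, h = lr_img.size
    _, cb, cr = lr_img.convert('YCbCr').split()
    cb = cb.resize((w * scale, h * scale), Image.BICUBIC)
    cr = cr.resize((w * scale, h * scale), Image.BICUBIC)
    y = Image.fromarray((np.clip(sr_y, 0, 1) * 255).astype(np.uint8))
    return Image.merge('YCbCr', (y, cb, cr)).convert('RGB')
```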

@twtygqyy
Owner

Hi @sriprabhar, have a look at Section 5.1 of the paper "Fast and Accurate Image Super-Resolution Using A Combined Loss". They compared the difference between training with Y and with RGB for SR.
