Imagenet-C loader can only return 5k images #92

Open
oripress opened this issue Jun 17, 2022 · 5 comments
Assignees
Labels
enhancement New feature or request

Comments

@oripress

When loading ImageNet-C with a requested batch size of more than 5k images, you always get only 5k images back.
No error is raised, which is confusing when you expect to receive more images than you actually get.

This behavior can be shown using this code snippet:

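from robustbench.data import load_imagenetc
# `path` is the directory that contains the ImageNet-C data
x_test, y_test = load_imagenetc(50000, 5, path, False, ['brightness'])
print(x_test.size())  # 50000 images requested, but only 5k are returned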
@fra31
Member

fra31 commented Jul 30, 2022

Hi,

sorry for the late reply. At the moment it is indeed only possible to load the 5k images for which the results are reported. With #96 I've added an error that is raised if more images are requested.
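For illustration, a minimal sketch of what such a check could look like. This is not the actual change made in #96: the function name `check_n_examples` and the constant `MAX_IMAGENETC_EXAMPLES` are hypothetical, and only the 5k limit is taken from this thread.

```python
# Hypothetical names; only the 5,000-image limit comes from the discussion.
MAX_IMAGENETC_EXAMPLES = 5000


def check_n_examples(n_examples: int, limit: int = MAX_IMAGENETC_EXAMPLES) -> int:
    """Return n_examples unchanged, or raise if it exceeds the available subset."""
    if n_examples > limit:
        raise ValueError(
            f"Requested {n_examples} examples, but only {limit} are available."
        )
    return n_examples
```

Failing loudly like this avoids the silent truncation described in the original report.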

For common corruptions, it might make sense to add the option of running the evaluation on the whole validation set. I think this would require some adjustment of the code, since at the moment all examples are loaded at once in

x_test, y_test, paths = next(iter(test_loader))
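To make the limitation concrete, here is a small sketch with a toy dataset (not the RobustBench loader itself; the dataset, shapes, and batch size are made up). `next(iter(loader))` yields exactly one batch, so everything requested has to fit into that single batch, whereas iterating over the loader visits the data chunk by chunk.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-in for the image dataset: 10 "images" with integer labels.
dataset = TensorDataset(torch.zeros(10, 3, 4, 4), torch.arange(10))
loader = DataLoader(dataset, batch_size=4, shuffle=False)

# Pattern from the line above: only the first batch is ever materialised.
x_first, y_first = next(iter(loader))

# Iterating over the loader instead visits every batch in turn.
n_seen = sum(x.shape[0] for x, _ in loader)
```

Here `x_first` holds only the first 4 examples, while the loop over `loader` sees all 10.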

@max-andr @dedeswim Thoughts?

@max-andr
Member

max-andr commented Aug 1, 2022

@fra31: agreed, throwing an error sounds good as a temporary solution. I also think it's important to preserve backward compatibility, since returning tensors directly has been useful for simplifying code around RobustBench and is now used in many scripts.

As for a better solution that preserves backward compatibility, we could add an optional parameter to each load_* function in data.py that makes it return either tensors (i.e., x_test, y_test, as it does now) or a loader. Perhaps we should then also make functions like clean_accuracy() compatible with data supplied via a loader.
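A rough sketch of how this proposal could look, under stated assumptions: `load_toy_data`, its `return_loader` flag, and `accuracy_from_loader` are hypothetical names invented for illustration and are not part of RobustBench, and random tensors stand in for real images.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def load_toy_data(n_examples, return_loader=False, batch_size=4):
    """Hypothetical load_* function: tensors by default, a DataLoader on request."""
    # Random stand-in data; a real loader would read images from disk.
    x = torch.rand(n_examples, 3, 8, 8)
    y = torch.randint(0, 10, (n_examples,))
    if return_loader:
        return DataLoader(TensorDataset(x, y), batch_size=batch_size, shuffle=False)
    return x, y  # default path keeps the existing (x_test, y_test) behaviour


def accuracy_from_loader(model, loader, device="cpu"):
    """Hypothetical loader-based accuracy, accumulated batch by batch."""
    model.eval()
    n_correct, n_total = 0, 0
    with torch.no_grad():
        for x, y in loader:
            preds = model(x.to(device)).argmax(dim=1)
            n_correct += (preds == y.to(device)).sum().item()
            n_total += y.shape[0]
    return n_correct / n_total


# Toy linear classifier, only used to exercise both code paths.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 10))
x_test, y_test = load_toy_data(10)                   # tensors, the default
test_loader = load_toy_data(10, return_loader=True)  # loader, opt-in
acc = accuracy_from_loader(model, test_loader)
```

Keeping the tensor return as the default means existing scripts would not need to change, while the opt-in loader avoids holding the whole dataset in memory at once.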

What do you think?

@fra31
Member

fra31 commented Aug 2, 2022

I agree, we should keep the current loading behavior as the default and add the option of using the full validation set for common corruptions evaluation.

@dedeswim
Member

dedeswim commented Aug 2, 2022

Hi! I agree with adding optional support for DataLoaders. I can open a PR and work on it.

@max-andr
Member

max-andr commented Aug 2, 2022

@dedeswim that would be fantastic!

@dedeswim dedeswim self-assigned this Oct 11, 2022
@dedeswim dedeswim added the enhancement New feature or request label Oct 11, 2022
4 participants