Blitz tutorial takes too long to run (GD instead of SGD?) #336

Open
gdalle opened this issue Feb 18, 2022 · 2 comments

Comments

@gdalle

gdalle commented Feb 18, 2022

Hi there!
In the 60-minute blitz tutorial (https://fluxml.ai/tutorials/2020/09/15/deep-learning-flux.html), the part where we train a network on CIFAR10 takes longer than expected. Could it be because we actually go through every minibatch in each epoch, instead of sampling only one?
I am specifically referring to this line. Because of it, I feel like we are actually doing non-stochastic gradient descent, which would explain the long runtime.

@darsnack
Member

train is already a vector of batches (see here), so iterating it in a for-loop will do mini-batch SGD.

But our mini-batches appear to not be so mini...the batch size is 1000! Fixing that should help.
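
To make the epoch/batch distinction concrete, here is a minimal sketch in plain Python (it is not the tutorial's Flux code, and the names `make_batches`, `updates_per_epoch`, and `NUM_SAMPLES` are made up for illustration). It assumes a dataset of 50,000 samples and counts how many parameter updates one epoch performs when the data is pre-split into batches and the loop visits every batch once. It only counts updates; it does not measure runtime.

```python
def make_batches(num_samples, batch_size):
    """Split sample indices 0..num_samples-1 into consecutive mini-batches."""
    return [
        range(start, min(start + batch_size, num_samples))
        for start in range(0, num_samples, batch_size)
    ]


def updates_per_epoch(num_samples, batch_size):
    """One epoch visits every batch once; each batch gives one gradient step."""
    return len(make_batches(num_samples, batch_size))


NUM_SAMPLES = 50_000  # assumed dataset size, for illustration only

# Full-batch GD, the batch size mentioned above, and a smaller batch size.
for batch_size in (NUM_SAMPLES, 1000, 64):
    print(batch_size, updates_per_epoch(NUM_SAMPLES, batch_size))
```

Under these assumptions, a single batch holding the whole dataset gives 1 update per epoch (plain gradient descent), a batch size of 1000 gives 50 updates, and a batch size of 64 gives 782. In every case an epoch still touches all samples, which is why looping over every batch is still mini-batch SGD.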

@gdalle
Author

gdalle commented Feb 18, 2022

My bad, I was wrong about the meaning of a training epoch! Thanks for your answer, I will try reducing the batch size.
