Add CIFAR10 dataset, dataloader and training scripts #76

Open
wants to merge 14 commits into
base: main

fnands
Contributor

@fnands fnands commented May 8, 2024

This PR adds the code required to train a CNN on the CIFAR10 dataset.

I adapted the existing DataLoader into one that looks more like a PyTorch DataLoader.

I would make it generic, but at the moment Mojo doesn't seem to support generic assignment.
I.e., I created a BaseDataset so that the DataLoader could accept any dataset, but I get a compiler error:
TODO: dynamic traits not supported yet, please use a compile time generic instead of 'BaseDataset'

Maybe one of you knows how to make the compile time generics work here?
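
For illustration, here is a minimal sketch of the shape being asked for: a DataLoader parameterized by the dataset type rather than holding a dynamically typed BaseDataset. It is written in Python, not Mojo, so the type parameter is only checked by a static type checker and not by a compiler. The ToyDataset class, the (image, label) item layout, and the batch_size argument are assumptions made for this example and are not taken from the PR. In Mojo the analogous idea would presumably be a trait-bounded struct parameter, which has not been verified against Basalt here.

```python
from typing import Generic, Iterator, List, Protocol, Tuple, TypeVar


class BaseDataset(Protocol):
    """Anything with a length and integer indexing that returns (image, label)."""

    def __len__(self) -> int: ...

    def __getitem__(self, idx: int) -> Tuple[List[float], int]: ...


# The loader is generic over the concrete dataset type D.
D = TypeVar("D", bound=BaseDataset)


class DataLoader(Generic[D]):
    """Yields (images, labels) batches from any dataset type D."""

    def __init__(self, dataset: D, batch_size: int) -> None:
        self.dataset = dataset
        self.batch_size = batch_size

    def __iter__(self) -> Iterator[Tuple[List[List[float]], List[int]]]:
        n = len(self.dataset)
        for start in range(0, n, self.batch_size):
            stop = min(start + self.batch_size, n)
            items = [self.dataset[i] for i in range(start, stop)]
            images = [image for image, _ in items]
            labels = [label for _, label in items]
            yield images, labels


class ToyDataset:
    """Hypothetical stand-in for a CIFAR10 dataset: 5 fake 2-pixel images."""

    def __len__(self) -> int:
        return 5

    def __getitem__(self, idx: int) -> Tuple[List[float], int]:
        return [float(idx), float(idx) + 0.5], idx % 10


loader = DataLoader(ToyDataset(), batch_size=2)
batches = list(loader)  # the last batch is smaller: 5 items in batches of 2
```

The loader never names ToyDataset; it only relies on `__len__` and `__getitem__`, which is the property a compile-time generic would need to express.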

I haven't fully checked how well the model trains (an accuracy metric could be added), but the loss is about the same for PyTorch and Basalt.
The model size could also be increased for CIFAR10.
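
As a rough sketch of the accuracy check mentioned above, top-1 accuracy can be computed by taking the argmax of each row of model outputs and comparing it to the label. This is plain Python for illustration only; the function name top1_accuracy and the list-of-lists input layout are assumptions, not part of Basalt or of this PR.

```python
from typing import List, Sequence


def top1_accuracy(logits: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Fraction of samples whose highest-scoring class equals the label."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    if not labels:
        return 0.0
    correct = 0
    for row, label in zip(logits, labels):
        # Index of the largest score in this row (first one wins on ties).
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


# Three samples, two classes: the first two predictions are right, the third is wrong.
example_logits: List[List[float]] = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
example_labels: List[int] = [1, 0, 0]
accuracy = top1_accuracy(example_logits, example_labels)
```

Running the same metric over the test set for both the PyTorch and the Basalt model would give a comparison that complements the loss values.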

The BasaltTensor constructor __init__(inout self, owned tensor: _Tensor[dtype]) results in some kind of double free that makes the program segfault, so for the moment it just does a simple copy.

@fnands fnands marked this pull request as draft May 8, 2024 16:19
@fnands
Contributor Author

fnands commented May 8, 2024

I added mimage as a submodule, but I can also just add the .mojopkg file to the repo instead.
Until Mojo gets a packaging system, I think those are the two options.

@fnands fnands marked this pull request as ready for review May 9, 2024 07:27
@StijnWoestenborghs
Collaborator

Thanks a lot for this. I looked at it and the example is solid. What I'm struggling with most is that it introduces a dependency for all of Basalt, while it is only used by an example that uses Basalt. That said, we have YoloV8 coming up, so we'll need the ability to read images for that as well. Maybe the right place for this (and all our examples) is a dedicated basalt-examples repo, since the number of examples is growing.

Right now I'm thinking of leaving this open and postponing a decision until we see how the other computer vision example turns out. The lack of an official package manager also makes things harder.
