Running out of CUDA/GPU spaces #14

gmegh · 2022-11-10T09:49:56Z

I have a GPU with 15GB of memory, and it seems to run out of memory when I try to train the network on 50 videos at a time. Do you think it would be better to compute the training loss video by video, instead of on all the videos at once?
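For reference, computing the loss on a few videos at a time and accumulating gradients before a single optimizer step is a common way to fit a large batch into limited GPU memory. Below is a minimal sketch, not this repository's training code. The names `train_with_accumulation`, `compute_loss`, and `micro_batch_size` are hypothetical, and `compute_loss(model, chunk)` is assumed to return the mean loss over the chunk.

```python
import torch
from torch import nn


def train_with_accumulation(model, videos, optimizer, compute_loss, micro_batch_size=2):
    """Take one optimizer step over `videos`, processed in small chunks.

    `compute_loss(model, chunk)` is a hypothetical callable assumed to return
    the mean loss over the chunk. Each chunk's loss is weighted by its share of
    the batch, so the accumulated gradient equals the full-batch gradient while
    only one chunk's activations are alive at a time.
    """
    model.train()
    optimizer.zero_grad()
    total = videos.shape[0]
    running_loss = 0.0
    for start in range(0, total, micro_batch_size):
        chunk = videos[start:start + micro_batch_size]
        weight = chunk.shape[0] / total
        loss = compute_loss(model, chunk)
        # backward() adds into .grad, so gradients accumulate across chunks
        (loss * weight).backward()
        running_loss += loss.item() * weight
    optimizer.step()
    return running_loss
```

The equivalence with a full-batch step holds for models without batch-dependent layers such as batch norm. Smaller values of `micro_batch_size` trade speed for lower peak memory.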

gmegh · 2022-11-10T09:54:11Z

Additionally, when training on 20 videos and text prompts the model output is still just noise, which I think is the expected result, given the lack of training, right?

lucidrains · 2022-11-10T16:26:28Z

@gmegh yea, training on video won't be a cakewalk

also, before the wip flag is removed, the network is still very alpha

i plan on making the network agnostic to image or video training, and start with images first. realistically, for this to be trained successfully outside of google, it would need to be pretrained on images

gmegh · 2022-11-10T18:50:04Z

Yes, that makes sense. Let me know if I can help. Do you know when you are planning on having the agnostic feature ready?

I wrote some short functions to be able to use .mp4 files instead of just GIFs, and to save the tensors to .mp4 as well. Let me know if you would like me to add them in a PR.

lucidrains · 2022-11-10T19:38:58Z

@gmegh so i have to add 3d continuous relative positional bias to the maskgit embedding to allow for generalization to different sizes. i think i should be able to get it done by tomorrow evening

re: mp4 - yes! that would be super helpful!

gmegh · 2022-11-10T21:01:51Z

Great! I will create a PR.

Also for reference, these guys are also working on implementing it: https://github.com/LAION-AI/phenaki

I think another nice to-do would be to allow saving the trained model and loading it back.
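For reference, saving and loading in plain PyTorch is usually done through `state_dict`. A minimal sketch follows; `save_checkpoint` and `load_checkpoint` are hypothetical helper names, not functions from this repository, and the checkpoint layout (keys `model`, `optimizer`, `step`) is an assumption.

```python
import torch
from torch import nn


def save_checkpoint(model, optimizer, step, path):
    """Write model weights, optimizer state, and the step counter to `path`."""
    torch.save(
        {
            "model": model.state_dict(),
            "optimizer": optimizer.state_dict(),
            "step": step,
        },
        path,
    )


def load_checkpoint(model, optimizer, path, device="cpu"):
    """Restore weights and optimizer state in place; return the saved step."""
    checkpoint = torch.load(path, map_location=device)
    model.load_state_dict(checkpoint["model"])
    optimizer.load_state_dict(checkpoint["optimizer"])
    return checkpoint["step"]
```

The model and optimizer must be constructed with the same configuration before calling `load_checkpoint`, since only tensors and optimizer state are stored, not the architecture.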

lucidrains · 2022-11-10T21:58:18Z

@gmegh yup, i've been chatting with Dominic

they are planning on straying a bit farther from the paper's implementation (for example, using all convolutions in the cvivit)

but this is a joint effort; anything i develop here they are free to use

lucidrains · 2022-11-10T21:59:18Z

@gmegh yea, i'll definitely get to the training code soon, once i add a few more bells and whistles to the attention networks

gmegh · 2022-11-12T06:21:00Z

Awesome! Happy to help if you want.

lucidrains · 2022-11-14T17:43:40Z

@gmegh yea definitely welcome any help!

do you know of any good packages for processing and loading video data?

gmegh · 2022-11-15T17:49:11Z

@lucidrains Yes! I think cv2 is a good package. I made some quick functions with it that I have added to the new PR. The crop_image() should probably be edited further

gmegh · 2022-11-15T17:50:30Z

What is the status of the code right now? I think the checkboxes in the readme are outdated, right?

lucidrains · 2022-11-15T17:59:00Z

@gmegh the code will be in a very good place by the end of the week, and by end of next week, all the training code will be there

lucidrains · 2022-11-15T17:59:42Z

@gmegh usually there is some back and forth and whittling away at bugs for about a month or so after i remove the wip, but that's usually a fast process as i like to iterate quickly

lucidrains · 2022-11-15T18:00:25Z

@gmegh for training on my end, i plan to get it to a place where the framework can produce unconditional (or text conditioned) images by end of the week

that part i know very well from my other works

lucidrains · 2022-11-15T18:00:44Z

@gmegh feel free to experiment in the meantime!

gmegh · 2022-11-22T01:04:45Z

Hi @lucidrains! Is the framework that can produce unconditional (or text conditioned) images ready? I am experimenting with the current version and I need a way to train in batches, because using 500 videos at a time already fills up my CUDA memory. Any idea on how to go about this?
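For reference, one common approach is to keep the videos on disk and let a `DataLoader` decode and move only one batch to the GPU at a time. The sketch below assumes a user-supplied `load_fn(path)` that returns a fixed-shape video tensor; `LazyVideoDataset`, `iterate_batches`, and `load_fn` are hypothetical names, not part of this repository.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class LazyVideoDataset(Dataset):
    """Holds only file paths; each video is decoded when it is requested."""

    def __init__(self, video_paths, load_fn):
        self.video_paths = list(video_paths)
        # load_fn is a hypothetical decoder: path -> tensor of a fixed shape
        self.load_fn = load_fn

    def __len__(self):
        return len(self.video_paths)

    def __getitem__(self, index):
        return self.load_fn(self.video_paths[index])


def iterate_batches(dataset, batch_size, device):
    """Yield shuffled batches, moving one batch at a time to `device`."""
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    for batch in loader:
        yield batch.to(device)
```

With this layout, peak GPU memory depends on `batch_size` rather than on the number of videos in the dataset.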

cyrilzakka · 2022-12-01T21:12:41Z

> @gmegh yea definitely welcome any help!
>
> do you know of any good packages for processing and loading video data?

@lucidrains I could take care of this. Any preferences as to whether you'd like to break down each video into frames, or sample from a video directly?
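For reference, either option needs a rule for choosing which frames to keep. A minimal sketch of evenly spaced frame selection is below; `uniform_frame_indices` is a hypothetical helper, and the resulting indices could be used either to pick pre-extracted frame files or to seek within a video.

```python
def uniform_frame_indices(total_frames, num_samples):
    """Return up to `num_samples` evenly spaced frame indices.

    The video is split into `num_samples` equal segments and the midpoint of
    each segment is taken. If the video is shorter than `num_samples`, every
    frame index is returned.
    """
    if total_frames <= 0 or num_samples <= 0:
        raise ValueError("total_frames and num_samples must be positive")
    if num_samples >= total_frames:
        return list(range(total_frames))
    # Integer form of floor((i + 0.5) * total_frames / num_samples)
    return [
        ((2 * i + 1) * total_frames) // (2 * num_samples)
        for i in range(num_samples)
    ]
```

For example, `uniform_frame_indices(100, 4)` returns `[12, 37, 62, 87]`.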
