Training Details: I trained on the DAVIS train-val dataset (90 videos of roughly 80 frames each) for 400k iterations per UNet, 800k iterations in total. I used ElucidatedImagen with Unet3D and no text prompt, trained on two Titan RTX GPUs with 24 GB of memory each. Both UNets have an embedding dimension of 64. The low- and "high"-resolution UNets operate at 64x64 and 128x128, and are trained on 12 and 3 frames, respectively, with a temporal downsampling factor of two for the first UNet. The batch sizes are 4 and 2, respectively.
Results: I am not sure what to think of the outcome. I am happy something happened 😄 but it's not an impressive result. I suspect the small embedding dimension may be the culprit. The final videos look somewhat memorized, the temporal consistency is not very good, and there is seemingly limited diversity. Below are example videos from 200k, 300k, and 400k iterations of training the second UNet:
200k
300k
400k
Checkpoints: I have checkpoint files, but I don't know how to share them. They are fairly large (1.5 GB), and I can't upload them to Google Drive. If someone is interested and has a recommended way of sharing the weights, I can do so.
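For anyone who ends up with the weights: loading them back should only require rebuilding the same architecture and using ImagenTrainer's built-in checkpointing. A sketch under the configuration described above; the checkpoint filename here is hypothetical.

```python
from imagen_pytorch import Unet3D, ElucidatedImagen, ImagenTrainer

# Rebuild the same architecture the checkpoint was trained with.
imagen = ElucidatedImagen(
    unets = (Unet3D(dim = 64), Unet3D(dim = 64)),
    image_sizes = (64, 128),
    condition_on_text = False,
    temporal_downsample_factor = (2, 1),
)
trainer = ImagenTrainer(imagen)

# Uncomment once you actually have the file:
# trainer.load('./davis_checkpoint.pt')  # hypothetical filename; restores weights + optimizer state

# Unconditional sampling from the cascade:
# videos = trainer.sample(batch_size = 1, video_frames = 12)
# videos should come out as (1, 3, 12, 128, 128): batch, channels, frames, height, width
```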