
running train_op took too long ?? #24

Open

auzyze opened this issue Apr 5, 2019 · 2 comments

auzyze commented Apr 5, 2019

Thanks for sharing this great work!

I ran into this issue when training ours_savp on the KTH dataset: training appears to proceed correctly, but it is very slow.

running train_op took too long (7.2s)
running train_op took too long (7.2s)
.....
progress  global step 100  epoch 0.5
          image/sec 1.1  remaining 37520m (625.3h) (26.1d)
d_loss 0.10482973
   discrim_video_sn_gan_loss (0.5204395, 0.1)
   discrim_video_sn_vae_gan_loss (0.5278577, 0.1)
g_loss 2.0725453
   gen_l1_loss (0.016228592, 100.0)
   gen_video_sn_gan_loss (0.32749984, 0.1)
   gen_video_sn_vae_gan_loss (0.35494953, 0.1)
   gen_video_sn_vae_gan_feature_cdist_loss (0.038144115, 10.0)
   gen_kl_loss (0.6190775, 0.0)
learning_rate 0.0002
running train_op took too long (7.2s)
running train_op took too long (7.2s)
running train_op took too long (7.3s)
......
......

My configuration:
tensorflow: 1.10.0
cuda: 9.0
cudnn: 7.3.0.29

I'm running the KTH dataset with the ours_savp model. With the default hparams I got an out-of-memory error, so I changed batch_size to 8.

My GPU appears to be working properly:
+-------------------------------+----------------------+----------------------+
| 1 Tesla K40c Off | 00000000:02:00.0 Off | 0 |
| 37% 73C P0 124W / 235W | 10963MiB / 11441MiB | 76% Default |
+-------------------------------+----------------------+----------------------+
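
(For completeness, here is a generic TF 1.x sanity check, not specific to this repo, to confirm that this TensorFlow build itself sees the GPU and places ops on it:)

```python
import tensorflow as tf

# Check that this TensorFlow build detects a CUDA-capable GPU.
print("GPU available:", tf.test.is_gpu_available())

# Run a tiny op pinned to the GPU with device placement logging,
# so the log shows whether it really lands on /device:GPU:0.
with tf.device("/device:GPU:0"):
    a = tf.random_normal([1000, 1000])
    b = tf.matmul(a, a)

with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    sess.run(b)
```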

TensorBoard refreshes when summary_freq is reached.

I'd appreciate any suggestions.
Regards,

@nishokkumars

@alexlee-gk, could you please help? I am facing the same issue.

@Berndinio

I am also facing the same issue. It seems to be just a print statement in train.py, line 267.
Since the message is only printed, nothing else happens inside that if block, and the measured time isn't used later on, I assume it is training correctly. The model was probably originally trained on faster GPUs or TPUs.
As you can see, the sess.run() call (which the timing is measured around) is always executed anyway. You can simply wrap the for-loop it sits in with tqdm to see the progress, as in the sketch below.
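
(A minimal, self-contained sketch of that pattern. The dummy train_step() stands in for the sess.run(fetches) call in scripts/train.py, and the 5-second threshold is only illustrative; the repo's actual loop and threshold may differ:)

```python
import time
from tqdm import tqdm  # pip install tqdm

def train_step():
    # Stand-in for sess.run(fetches) in scripts/train.py; replace with the real call.
    time.sleep(0.1)

num_steps = 1000
for step in tqdm(range(num_steps)):  # tqdm shows steps/sec and an ETA
    start = time.time()
    train_step()
    elapsed = time.time() - start
    # Mirrors the warning in train.py: it only prints, nothing else depends on it,
    # so training keeps going normally even when the message shows up.
    if elapsed > 5.0:
        print("running train_op took too long (%0.1fs)" % elapsed)
```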
