NaN in training #345
I have the same problem: OpenPose + ResNet-18.
Hello! @zpphigh @lengyuner For [openpose] + [VGG19]: For [openpose] + [Resnet 18]:
Loss is NaN in training based on [openpose] + [VGG19] after 6800 iterations.
NaN in training based on [openpose] + [Resnet18] after 74300 iterations, as below:
Train iteration 74300 / 1000000: Learning rate 9.999999747378752e-05 total_loss:51.99763870239258, conf_loss:25.99869155883789, paf_loss:74.33314514160156, l2_loss 1.8317127227783203 stage_num:6 time:0.0002014636993408203
stage_0 conf_loss:27.358610153198242 paf_loss:78.49413299560547
stage_1 conf_loss:26.15711784362793 paf_loss:74.4624252319336
stage_2 conf_loss:25.707374572753906 paf_loss:73.61676788330078
stage_3 conf_loss:25.627124786376953 paf_loss:73.40571594238281
stage_4 conf_loss:25.598114013671875 paf_loss:73.17063903808594
stage_5 conf_loss:25.543825149536133 paf_loss:72.84919738769531
Train iteration 74400 / 1000000: Learning rate 9.999999747378752e-05 total_loss:1148700065792.0, conf_loss:1743442149376.0, paf_loss:553957654528.0, l2_loss 1.8390066623687744 stage_num:6 time:0.00020194053649902344
stage_0 conf_loss:57.1938591003418 paf_loss:109.35887145996094
stage_1 conf_loss:1848.776611328125 paf_loss:3432.841552734375
stage_2 conf_loss:17413386.0 paf_loss:1569305.875
stage_3 conf_loss:135925504.0 paf_loss:4067570176.0
stage_4 conf_loss:2292520058880.0 paf_loss:300717211648.0
stage_5 conf_loss:8167979745280.0 paf_loss:3018960142336.0
Train iteration 74500 / 1000000: Learning rate 9.999999747378752e-05 total_loss:nan, conf_loss:nan, paf_loss:nan, l2_loss nan stage_num:6 time:0.0002295970916748047
stage_0 conf_loss:nan paf_loss:nan
stage_1 conf_loss:13212.9375 paf_loss:nan
stage_2 conf_loss:40226508.0 paf_loss:nan
stage_3 conf_loss:2792057995264.0 paf_loss:nan
stage_4 conf_loss:nan paf_loss:nan
stage_5 conf_loss:nan paf_loss:2.488148857507021e+16
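The logs above show the classic divergence pattern: the losses are stable at iteration 74300, explode by several orders of magnitude at 74400, and become NaN by 74500. A common mitigation (not part of this repo's code, so the function names below are hypothetical) is to clip gradients by global norm and skip any optimizer step whose loss is already non-finite. A minimal NumPy sketch of both checks:

```python
import numpy as np

def loss_is_finite(loss_value):
    """Return True if the scalar loss is a finite number (not NaN or Inf)."""
    return bool(np.isfinite(loss_value))

def clip_by_global_norm(grads, max_norm):
    """Rescale a list of gradient arrays so their global L2 norm is <= max_norm."""
    global_norm = np.sqrt(sum(np.sum(np.square(g)) for g in grads))
    if global_norm > max_norm:
        scale = max_norm / (global_norm + 1e-12)
        grads = [g * scale for g in grads]
    return grads

# Hypothetical training-step guard: skip the update when the loss has
# blown up, otherwise clip gradients before applying the optimizer.
loss = float("nan")
grads = [np.array([3.0, 4.0])]  # global L2 norm = 5.0
if loss_is_finite(loss):
    grads = clip_by_global_norm(grads, max_norm=1.0)
    # ... apply optimizer update here ...
```

Lowering the learning rate (here a constant 1e-4) around iteration 74000, or restarting from the last finite checkpoint with clipping enabled, are the usual next steps; the sketch only illustrates the guard logic, not this project's actual training loop.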