NaN error after running to 660900 #36
Same question here. Has this been solved?
I hit the same problem after running to 696100. Does anyone know how to fix it?
2020-02-11 16:21:58.843161: E tensorflow/core/kernels/check_numerics_op.cc:185] abnormal_detected_host @0x7f2456e15a00 = {0, 1} Found Inf or NaN global norm.
Traceback (most recent call last):
File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Found Inf or NaN global norm. : Tensor had Inf values
[[{{node VerifyFinite/CheckNumerics}} = CheckNumerics[T=DT_FLOAT, message="Found Inf or NaN global norm.", _device="/job:localhost/replica:0/task:0/device:GPU:0"](global_norm/global_norm)]]
[[{{node clip_by_global_norm/mul_1/_159}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2818_clip_by_global_norm/mul_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
During handling of the above exception, another exception occurred:
I have also run into this problem occasionally. The usual fix is to resume training from a checkpoint.
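To make the "resume from a checkpoint" idea concrete, here is a minimal, framework-free sketch of it. It is not this project's training code and does not use TensorFlow: `global_norm`, `train`, `grad_fn`, `save_every`, and `max_rollbacks` are hypothetical names, the in-memory `snapshot` stands in for a checkpoint on disk, and the toy loss is only an assumption for illustration. The loop computes the global gradient norm, and when it is Inf or NaN it drops that update and goes back to the last saved parameters.

```python
import math


def global_norm(grads):
    """L2 norm over all gradient values (the quantity clip-by-global-norm uses)."""
    return math.sqrt(sum(g * g for g in grads))


def train(params, batches, grad_fn, lr=0.1, save_every=2, max_rollbacks=3):
    """Toy SGD loop: on a non-finite global norm, restore the last snapshot."""
    snapshot = list(params)  # hypothetical stand-in for a checkpoint on disk
    rollbacks = 0
    for step, batch in enumerate(batches, start=1):
        grads = grad_fn(params, batch)
        if not math.isfinite(global_norm(grads)):
            rollbacks += 1
            if rollbacks > max_rollbacks:
                raise RuntimeError("too many non-finite steps; check data or learning rate")
            params = list(snapshot)  # resume from the checkpoint and skip this batch
            continue
        params = [p - lr * g for p, g in zip(params, grads)]
        if step % save_every == 0:
            snapshot = list(params)  # periodic checkpoint
    return params, rollbacks


def grad_fn(params, batch):
    # Gradient of the toy loss 0.5 * (p - batch)^2 for each parameter (assumption).
    return [p - batch for p in params]


# The third batch is deliberately bad and produces an infinite gradient.
batches = [1.0, 1.0, float("inf"), 1.0]
final_params, rollbacks = train([0.0], batches, grad_fn)
```

If the rollback count keeps growing, resuming is probably only hiding the cause, and it may be worth looking at the learning rate or the input data instead.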
2019-08-06 00:52:38.476421: E tensorflow/core/kernels/check_numerics_op.cc:185] abnormal_detected_host @0x7fb3e960d900 = {0, 1} Found Inf or NaN global norm.
Traceback (most recent call last):
File "/root/anaconda3/envs/fjpy36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/root/anaconda3/envs/fjpy36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/root/anaconda3/envs/fjpy36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Found Inf or NaN global norm. : Tensor had Inf values
[[{{node VerifyFinite/CheckNumerics}} = CheckNumerics[T=DT_FLOAT, message="Found Inf or NaN global norm.", _device="/job:localhost/replica:0/task:0/device:GPU:0"](global_norm/global_norm)]]
[[{{node clip_by_global_norm/mul_1/_301}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2818_clip_by_global_norm/mul_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]