Bug in the Quantile Huber loss? #1
Comments
Could you post the corrected solution here? When training IQN, the loss doesn't seem to converge.
Hi,
First of all, thanks for publicly sharing your implementations of the reinforcement learning algorithms. I find your repos very useful!
As I was playing around with the QR-DQN, I think I noticed a bug in your implementation of the Quantile Huber loss function. The code runs fine if batch_size == atoms. However, if the two differ, you get an error due to incompatible tensor shapes at line 75 of QR-DQN.py:
loss = tf.where(tf.less(error_loss, 0.0), inv_tau * huber_loss, tau * huber_loss)
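For reference, the asymmetric quantile weighting in that line only works if the Huber term keeps the elementwise shape of the TD error, since tf.where combines the two weighted branches element by element. Here is a minimal NumPy sketch of the intended computation (variable names chosen for illustration, not taken from the repo):

```python
import numpy as np

def huber(error, kappa=1.0):
    # Elementwise Huber loss: quadratic for |error| <= kappa, linear beyond.
    abs_error = np.abs(error)
    quadratic = np.minimum(abs_error, kappa)
    linear = abs_error - quadratic
    return 0.5 * quadratic ** 2 + kappa * linear

def quantile_huber(error, tau, kappa=1.0):
    # Asymmetric quantile weighting: negative errors get weight (1 - tau),
    # i.e. the inv_tau branch, non-negative errors get weight tau. This
    # only works if huber() preserves the elementwise shape of `error`.
    h = huber(error, kappa)
    return np.where(error < 0.0, (1.0 - tau) * h, tau * h)

# TD errors of shape (batch, atoms); tau broadcasts across the batch axis.
error = np.array([[0.5, -2.0], [-0.3, 1.5]])
tau = np.array([0.25, 0.75])
loss = quantile_huber(error, tau)  # same shape as `error`
```

If huber() instead collapsed its last axis (as the TF2 Keras Huber does), the np.where above would fail with exactly the kind of shape mismatch reported here.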
I think the error is related to the fact that the TF2 implementation of the Huber loss reduces the dimension of the output by 1 with respect to the inputs (docs), even when setting
reduction=tf.keras.losses.Reduction.NONE.
This is different from the behavior in TF1, where the output dimension matches that of the input (docs). Therefore, if I am not mistaken, one could fix this by changing the self.huber_loss
to tf.compat.v1.losses.huber_loss
? I am having a bit of a hard time working out the exact dimensions upon which the different operations act, so I would be happy to hear from your side whether my theory is correct :P
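To illustrate the rank reduction described above, here is a NumPy mock-up of what the TF2 Keras Huber loss does even with Reduction.NONE (the toy shapes batch_size=4, atoms=8 are assumptions for the example, not values from the repo):

```python
import numpy as np

def keras_style_huber(y_true, y_pred, delta=1.0):
    # Mimics tf.keras.losses.Huber in TF2: even with Reduction.NONE,
    # Keras losses average over the last axis, so the output has one
    # dimension fewer than the inputs (unlike TF1's huber_loss).
    error = y_true - y_pred
    abs_error = np.abs(error)
    quadratic = np.minimum(abs_error, delta)
    linear = abs_error - quadratic
    elementwise = 0.5 * quadratic ** 2 + delta * linear
    return elementwise.mean(axis=-1)  # the rank reduction in question

y_true = np.zeros((4, 8))  # e.g. batch_size = 4, atoms = 8
y_pred = np.ones((4, 8))
out = keras_style_huber(y_true, y_pred)  # shape (4,), not (4, 8)
```

With batch_size == atoms the reduced output happens to broadcast against the tau tensor without a hard error, which would explain why the bug only surfaces when the two differ.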