Bug in the Quantile Huber loss? #1

Open
mcshel opened this issue Feb 25, 2021 · 1 comment

Comments

mcshel commented Feb 25, 2021

Hi,

First of all, thanks for publicly sharing your implementations of the reinforcement learning algorithms. I find your repos very useful!

As I was playing around with QR-DQN, I think I noticed a bug in your implementation of the Quantile Huber loss function. The code runs fine if batch_size == atoms. However, if the two differ, you get an error due to incompatible tensor shapes in line 75 of QR-DQN.py:

loss = tf.where(tf.less(error_loss, 0.0), inv_tau * huber_loss, tau * huber_loss)

I think the error is related to the fact that the TF2 implementation of the Huber loss reduces the output dimension by 1 with respect to the inputs (see the tf.keras.losses.Huber documentation), even when setting reduction=tf.keras.losses.Reduction.NONE. This is different from the behavior in TF1, where the output shape matches that of the input (see the tf.compat.v1.losses.huber_loss documentation). Therefore, if I am not mistaken, one could fix this by changing self.huber_loss to tf.compat.v1.losses.huber_loss? I am having a bit of a hard time working out the exact dimensions that the different operations act upon, so I would be happy to hear from your side whether my theory is correct :P
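For illustration, here is a minimal sketch of the shape difference, assuming hypothetical shapes batch_size=4 and atoms=8 (the tensor layout below is an assumption for demonstration, not taken from QR-DQN.py):

```python
import tensorflow as tf

# Hypothetical shapes for illustration only -- not taken from QR-DQN.py.
batch_size, atoms = 4, 8
target = tf.random.normal([batch_size, atoms, atoms])
pred = tf.random.normal([batch_size, atoms, atoms])

# TF2 Keras Huber: even with Reduction.NONE, the loss is averaged over the
# last axis, so one dimension disappears from the output.
keras_huber = tf.keras.losses.Huber(reduction=tf.keras.losses.Reduction.NONE)
print(keras_huber(target, pred).shape)  # (4, 8) -- last axis reduced

# TF1-style Huber: Reduction.NONE gives an element-wise loss that keeps the
# input shape, which is what the per-quantile tau weighting expects.
v1_huber = tf.compat.v1.losses.huber_loss(
    target, pred, reduction=tf.compat.v1.losses.Reduction.NONE)
print(v1_huber.shape)  # (4, 8, 8) -- shape preserved

# Alternatively, an element-wise Huber loss can be written by hand (kappa is
# the Huber threshold), avoiding the compat module entirely:
def elementwise_huber(error, kappa=1.0):
    abs_error = tf.abs(error)
    quadratic = tf.minimum(abs_error, kappa)
    return 0.5 * quadratic ** 2 + kappa * (abs_error - quadratic)

print(elementwise_huber(target - pred).shape)  # (4, 8, 8)
```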

mcshel changed the title from "Bug in the Quantile Hubler loss?" to "Bug in the Quantile Huber loss?" on Feb 25, 2021
yubobao27 commented:

Could you post the corrected solution here? When training IQN, the loss doesn't seem to converge.
