PPO implementation: why take the log of the probabilities? #147
Comments
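My understanding is that it turns the division into a subtraction: the ratio p_new / p_old is computed as exp(log p_new - log p_old).
Yes.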
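To make the point above concrete, here is a minimal sketch in plain Python. It is not taken from this repository. The names `p_old_factor`, `p_new_factor`, and `n_factors` are hypothetical, and the setup (a probability that is the product of 400 independent factors, such as many action dimensions) is an assumption chosen to make the effect visible. It shows that subtracting log-probabilities and exponentiating gives the same ratio as dividing, and that it still works when the raw probabilities underflow to zero.

```python
import math

# Hypothetical setup: the probability of an action is a product of many
# independent factors (for example, one factor per action dimension).
n_factors = 400
p_old_factor = 0.10   # per-factor probability under the old policy
p_new_factor = 0.11   # per-factor probability under the new policy

# Direct approach: multiply the factors, then divide.
p_old = p_old_factor ** n_factors   # underflows to 0.0 in float64
p_new = p_new_factor ** n_factors   # also underflows to 0.0
direct_ratio_usable = p_old != 0.0  # dividing here would fail (0.0 / 0.0)

# Log approach: sum the log-factors, subtract, then exponentiate once.
logp_old = n_factors * math.log(p_old_factor)
logp_new = n_factors * math.log(p_new_factor)
ratio = math.exp(logp_new - logp_old)  # equals (p_new_factor / p_old_factor) ** n_factors

print("direct ratio usable:", direct_ratio_usable)
print("ratio from log-probs:", ratio)
```

With small, well-behaved probabilities the two approaches agree, so the log form changes nothing in the formula itself. It only replaces a product and a division with a sum and a subtraction, which is the numerically safer way to get the same number.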
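As the title says: nothing in the formula requires a logarithm, and the loss does not use one either (apart from the KL-divergence term). I don't understand why the code takes the roundabout route of taking the log and then exponentiating to get the probability ratio. Any explanation would be appreciated.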