Two possible bugs in DuelingDQN.ipynb #140

Open
libermeng opened this issue Aug 3, 2023 · 0 comments

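  1. In the model definition, the `forward` function's `return value + advantage - advantage.mean()` may be wrong; it should be `return value + advantage - advantage.mean(dim=1, keepdim=True)`.
    By definition, the advantage stream's output should sum to zero over the action dimension, so the mean to subtract is each sample's mean over the action dimension, not the mean over the whole tensor (which also averages across the batch).
  2. In the algorithm definition, the initializer's `self.policy_net = model.to(self.device)` and `self.target_net = model.to(self.device)` are wrong; they should be `self.policy_net = DuelingNet(cfg.n_states, cfg.n_actions, hidden_dim=cfg.hidden_dim).to(self.device)` and `self.target_net = DuelingNet(cfg.n_states, cfg.n_actions, hidden_dim=cfg.hidden_dim).to(self.device)`.
    The original code makes `policy_net` and `target_net` point to the same object at the same memory address, because calling `.to()` on an `nn.Module` returns that same module. The modified code creates two distinct objects at different memory addresses.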
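
Both points can be checked with a small PyTorch sketch. This is not the notebook's code: the layer layout inside `DuelingNet`, the names `value_layer` and `advantage_layer`, and the sizes used are assumptions for illustration; only the constructor signature and the two corrected expressions come from the issue. The final `load_state_dict` call reflects common DQN practice of starting both networks from the same weights and is likewise an assumption.

```python
import torch
import torch.nn as nn


class DuelingNet(nn.Module):
    """Minimal dueling network; the layer layout is an illustrative assumption."""

    def __init__(self, n_states, n_actions, hidden_dim=128):
        super().__init__()
        self.hidden = nn.Sequential(nn.Linear(n_states, hidden_dim), nn.ReLU())
        self.value_layer = nn.Linear(hidden_dim, 1)              # V(s)
        self.advantage_layer = nn.Linear(hidden_dim, n_actions)  # A(s, a)

    def forward(self, state):
        x = self.hidden(state)
        value = self.value_layer(x)          # shape: (batch, 1)
        advantage = self.advantage_layer(x)  # shape: (batch, n_actions)
        # Subtract each sample's mean over the action dimension only.
        return value + advantage - advantage.mean(dim=1, keepdim=True)


# Point 1: global mean vs. per-sample mean over actions.
advantage = torch.tensor([[1.0, 3.0], [10.0, 30.0]])  # batch of 2, 2 actions
global_centered = advantage - advantage.mean()                  # mean over all 4 entries
row_centered = advantage - advantage.mean(dim=1, keepdim=True)  # mean per row
print(global_centered.sum(dim=1))  # rows do not sum to zero
print(row_centered.sum(dim=1))     # every row sums to zero

# Point 2: nn.Module.to() returns the module itself, so both names alias one network.
device = torch.device("cpu")
model = DuelingNet(n_states=4, n_actions=2, hidden_dim=16)
shared_policy = model.to(device)
shared_target = model.to(device)
print(shared_policy is shared_target)  # same object

# Suggested fix: build two separate networks.
policy_net = DuelingNet(4, 2, hidden_dim=16).to(device)
target_net = DuelingNet(4, 2, hidden_dim=16).to(device)
# Assumption: copy the weights so both networks start from identical parameters.
target_net.load_state_dict(policy_net.state_dict())
print(policy_net is target_net)  # distinct objects
```

With the aliased version, every gradient step on the policy network also changes the "target" network, so the target is never held fixed between synchronizations.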