In the function `explore_vec_env` of `AgentPPO`, the variable `actions` is shaped `[horizon_len, self.num_envs, 1]`, but the expression `convert(action)` returns a 1-dim tensor of shape `[num_envs]`, when it should be `[num_envs, 1]`, as it is in `explore_vec_env` of `AgentD3QN`. This indeed fails the demo `examples/demo_A2C_PPO.py`.

The following change works for me:
```python
# ActorDiscretePPO of net.py
def get_action(self, state: Tensor) -> (Tensor, Tensor):
    state = self.state_norm(state)
    a_prob = self.soft_max(self.net(state))
    a_dist = self.ActionDist(a_prob)
    action = a_dist.sample()
    logprob = a_dist.log_prob(action)
    return action.unsqueeze(1), logprob  # unsqueeze the action to [num_envs, 1]
```
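The shape mismatch can be reproduced outside the agent code. A minimal sketch (the `num_envs` and `action_dim` values here are illustrative, not taken from the repo): sampling from `torch.distributions.Categorical` over a batch of probability rows yields a 1-dim tensor of shape `[num_envs]`, so an explicit `unsqueeze(1)` is needed to match the `[num_envs, 1]` layout the replay buffer expects.

```python
import torch
from torch.distributions import Categorical

num_envs, action_dim = 4, 3  # illustrative sizes

# batch of per-env action probabilities, shape [num_envs, action_dim]
a_prob = torch.softmax(torch.randn(num_envs, action_dim), dim=1)

dist = Categorical(a_prob)
action = dist.sample()        # 1-dim: shape [num_envs]
logprob = dist.log_prob(action)

print(action.shape)               # torch.Size([4])
print(action.unsqueeze(1).shape)  # torch.Size([4, 1]) -- matches the buffer layout
```

This is why the discrete actor's `get_action` needs the `unsqueeze(1)` before returning, while continuous actors (whose samples already carry an action dimension) do not.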