"MulBackward0 returned nan values" error when launch HATRPO after HAPPO #219

Open
TheShenk opened this issue Feb 8, 2024 · 0 comments

TheShenk commented Feb 8, 2024

Hello, and thanks for your work. I am using MARLlib to train in my custom environment with different algorithms. If training with HATRPO starts immediately after training with HAPPO in the same script, it fails with "RuntimeError: Function 'MulBackward0' returned nan values in its 0th output". The following code reproduces the error:

from marllib import marl

# First run: HAPPO on simple_spread; this completes without errors.
env = marl.make_env("gymnasium_mpe", "simple_spread")
algo = marl.algos.happo(hyperparam_source="common")
model = marl.build_model(env, algo, {"core_arch": "mlp"})
algo.fit(env, model, stop={'timesteps_total': 1000})

# Second run in the same process: HATRPO; the RuntimeError is raised here.
env = marl.make_env("gymnasium_mpe", "simple_spread")
algo = marl.algos.hatrpo(hyperparam_source="common")
model = marl.build_model(env, algo, {"core_arch": "mlp"})
algo.fit(env, model, stop={'timesteps_total': 1000})

I am also attaching the full log. MARLlib was installed with conda, no GPU was used, and the run was launched in local mode. The error does not reproduce with local_mode=False.
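Since the error only shows up when both runs share one process in local mode, a possible workaround is to start each algorithm in its own Python interpreter so that nothing carries over from the HAPPO run. I have not verified this, and it is not a confirmed fix. The sketch below uses only the standard library to launch the runs; the helper names `build_script`, `run_in_fresh_interpreter`, and `run_sequence` are made up for illustration and are not part of MARLlib. The MARLlib calls inside the template are the same ones as in the reproduction script above.

```python
import subprocess
import sys

# Training script template; the MARLlib calls mirror the reproduction script.
# Doubled braces are literal braces after str.format().
TRAIN_TEMPLATE = """
from marllib import marl

env = marl.make_env("gymnasium_mpe", "simple_spread")
algo = marl.algos.{algo}(hyperparam_source="common")
model = marl.build_model(env, algo, {{"core_arch": "mlp"}})
algo.fit(env, model, stop={{"timesteps_total": 1000}})
"""


def build_script(algo_name):
    """Return the training source for one algorithm (hypothetical helper)."""
    if not algo_name.isidentifier():
        raise ValueError(f"not a valid algorithm name: {algo_name!r}")
    return TRAIN_TEMPLATE.format(algo=algo_name)


def run_in_fresh_interpreter(source):
    """Run Python source in a new interpreter and return its exit code."""
    return subprocess.run([sys.executable, "-c", source]).returncode


def run_sequence(algo_names):
    """Train each algorithm in its own process; map name -> exit code."""
    return {
        name: run_in_fresh_interpreter(build_script(name))
        for name in algo_names
    }


if __name__ == "__main__":
    # A non-zero exit code means that run failed; its traceback goes to stderr.
    print(run_sequence(["happo", "hatrpo"]))
```

If HATRPO succeeds when launched this way, that would suggest some state left over from the HAPPO run in the shared process is involved, but it would not pinpoint the cause.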
