"MulBackward0 returned nan values" error when launch HATRPO after HAPPO #219

TheShenk · 2024-02-08T11:06:05Z

Hello, and thanks for your work. I am using MARLlib to train on my custom environment with different algorithms. If HATRPO training is started immediately after HAPPO training, the error "RuntimeError: Function 'MulBackward0' returned nan values in its 0th output" occurs. The following code reproduces the error:

from marllib import marl

# First run: train HAPPO on simple_spread
env = marl.make_env("gymnasium_mpe", "simple_spread")
algo = marl.algos.happo(hyperparam_source="common")
model = marl.build_model(env, algo, {"core_arch": "mlp"})
algo.fit(env, model, stop={'timesteps_total': 1000})

# Second run: train HATRPO right after HAPPO; the error is raised here
env = marl.make_env("gymnasium_mpe", "simple_spread")
algo = marl.algos.hatrpo(hyperparam_source="common")
model = marl.build_model(env, algo, {"core_arch": "mlp"})
algo.fit(env, model, stop={'timesteps_total': 1000})

I am also attaching the full log. MARLlib was installed with conda, and no GPU was used. The error occurs when training is launched in local mode and is not reproduced with local_mode=False.
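For context on the message itself: this wording is what PyTorch's autograd anomaly detection reports when a backward function produces a NaN gradient. The sketch below is not MARLlib code and does not reproduce the HAPPO/HATRPO interaction. It only shows, in plain PyTorch, how a NaN in an intermediate value leads to this kind of error. The `trigger_nan_backward` helper and the `bad` tensor are hypothetical names used for illustration.

```python
import torch


def trigger_nan_backward() -> str:
    """Return the error message raised when a backward pass yields NaN."""
    x = torch.tensor([1.0], requires_grad=True)
    # Hypothetical stand-in for a corrupted intermediate value.
    bad = torch.tensor([float("nan")])

    # Anomaly detection checks every backward function's outputs for NaN.
    with torch.autograd.detect_anomaly():
        loss = (x * bad).sum()
        try:
            # d(loss)/dx = bad = NaN, so the multiply's backward is flagged.
            loss.backward()
        except RuntimeError as err:
            return str(err)
    return ""


message = trigger_nan_backward()
print(message)
```

If the same check is active during training, the error would point at the first backward function that emitted a NaN, which may be downstream of where the bad value was originally introduced.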
