Motivation
Currently we must create an instance of the environment we wish to solve before creating a MultiSyncDataCollector object. This is because we can't create a policy without knowing the environment's action and observation specs, and the MultiSyncDataCollector requires us to pass a policy to its constructor. Creating and discarding a dummy environment is wasteful in general, and it may become a tangible problem for environments that are particularly large or slow to instantiate.
Ideally the MultiSyncDataCollector would allow us to access the observation and action specs from one of its sub-processes before we provide it a policy.
Solution
Construct the collector and query the environment specs before constructing a policy, like so:
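A minimal, self-contained sketch of what this could look like. It does not use the real MultiSyncDataCollector: Spec, ToyEnv, ToyCollector, and the observation_spec / action_spec properties are hypothetical stand-ins, and the environment is built in-process rather than in a sub-process to keep the example short.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Spec:
    """Toy stand-in for a spec: only records a shape."""
    shape: tuple


class ToyEnv:
    """Hypothetical environment; pretend construction is expensive."""
    instances = 0  # counts how many environments have been built

    def __init__(self):
        ToyEnv.instances += 1
        self.observation_spec = Spec(shape=(4,))
        self.action_spec = Spec(shape=(2,))


class ToyCollector:
    """Hypothetical collector whose policy is optional at construction."""

    def __init__(self, create_env_fn: Callable[[], ToyEnv], policy=None):
        # The collector builds the environment it will later collect from,
        # so no separate dummy environment is needed to read the specs.
        self._env = create_env_fn()
        self.policy = policy

    @property
    def observation_spec(self) -> Spec:
        return self._env.observation_spec

    @property
    def action_spec(self) -> Spec:
        return self._env.action_spec


# Construct the collector without a policy, then query the specs.
collector = ToyCollector(create_env_fn=ToyEnv)
obs_spec = collector.observation_spec
act_spec = collector.action_spec
```

The point of the sketch is that only one environment is ever built, and it is the one the collector keeps using.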
Then instantiate a policy and pass that to the collector:
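One way this second step might look, again as a toy sketch rather than an existing API. ToyCollector, set_policy, collect, and make_policy are all hypothetical names, and the shapes are hard-coded where a real collector would read them from its environment.

```python
class ToyCollector:
    """Hypothetical collector that accepts its policy after construction."""

    def __init__(self, observation_shape, action_shape):
        # In a real collector these would come from the environment.
        self.observation_shape = observation_shape
        self.action_shape = action_shape
        self._policy = None

    def set_policy(self, policy):
        """Attach the policy once it has been built from the specs."""
        self._policy = policy

    def collect(self, observations):
        # Refuse to collect until a policy has been provided.
        if self._policy is None:
            raise RuntimeError("call set_policy() before collecting")
        return [self._policy(obs) for obs in observations]


def make_policy(observation_shape, action_shape):
    """Toy policy factory: the policy always returns a zero action."""

    def policy(obs):
        if len(obs) != observation_shape[0]:
            raise ValueError("observation has the wrong size")
        return [0.0] * action_shape[0]

    return policy


collector = ToyCollector(observation_shape=(4,), action_shape=(2,))
# Build the policy from the shapes reported by the collector, then hand it over.
policy = make_policy(collector.observation_shape, collector.action_shape)
collector.set_policy(policy)
actions = collector.collect([[0.1, 0.2, 0.3, 0.4]])
```

Raising on a missing policy, instead of silently falling back to a default, keeps the two-step construction explicit.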
Alternatives
Pass a reference to the policy instantiation callable to the collector, then retrieve the policy object later:
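A rough sketch of this alternative, under the assumption that the collector calls the factory itself once its environment exists. ToyEnv, ToyCollector, the policy_factory argument, and make_policy are hypothetical names used only for illustration.

```python
class ToyEnv:
    """Hypothetical environment exposing its shapes as attributes."""

    def __init__(self):
        self.observation_shape = (4,)
        self.action_shape = (2,)


class ToyCollector:
    """Hypothetical collector that builds the policy from a callable."""

    def __init__(self, create_env_fn, policy_factory):
        self._env = create_env_fn()
        # The factory only runs after the environment exists, so it can
        # depend on the shapes without a separate dummy environment.
        self.policy = policy_factory(
            self._env.observation_shape, self._env.action_shape
        )


def make_policy(observation_shape, action_shape):
    """Toy policy factory: the policy always returns a zero action."""

    def policy(obs):
        return [0.0] * action_shape[0]

    return policy


collector = ToyCollector(create_env_fn=ToyEnv, policy_factory=make_policy)
# Retrieve the policy object later, e.g. to hand its parameters to an optimizer.
policy = collector.policy
action = policy([0.1, 0.2, 0.3, 0.4])
```

Compared with setting the policy after construction, this keeps the collector fully initialized after a single call, at the cost of the caller having to fetch the policy back from it.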
Or we might prefer some sort of LUT mapping environment names to specs which does not actually instantiate the environment (this would require an instantiation once when the env is registered, but never again afterwards):
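The lookup-table idea could be sketched as follows. The registry, register_env, lookup_specs, and the environment name "toy-v0" are hypothetical; the sketch only shows that the environment is built once at registration and never again for spec lookups.

```python
class ToyEnv:
    """Hypothetical environment; pretend construction is expensive."""
    instances = 0  # counts how many environments have been built

    def __init__(self):
        ToyEnv.instances += 1
        self.observation_shape = (4,)
        self.action_shape = (2,)


# Maps an environment name to its (observation_shape, action_shape) pair.
_SPEC_REGISTRY = {}


def register_env(name, create_env_fn):
    """Build the environment once and cache its shapes under `name`."""
    env = create_env_fn()
    _SPEC_REGISTRY[name] = (env.observation_shape, env.action_shape)


def lookup_specs(name):
    """Return the cached shapes without building an environment."""
    return _SPEC_REGISTRY[name]


register_env("toy-v0", ToyEnv)
# Repeated lookups read from the table only.
for _ in range(3):
    observation_shape, action_shape = lookup_specs("toy-v0")
```

A table like this would need to be kept in sync if an environment's specs depend on constructor arguments, which the sketch ignores.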
Additional context
This isn't a huge problem, but it would be nice at some point.