Use of tf.data.Dataset within tf.function #298
Update: I found a workaround by empirically tuning the parameters used to create the TensorFlow Dataset, and by checking for data availability before calling next(). Even though it does not rely on min_replay_size, this quick fix works like a charm!
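The availability check described in this workaround might be sketched as follows. This is a hypothetical, pure-Python version: `wait_for_data` and `current_size_fn` are made-up names, and in a real Reverb setup `current_size_fn` would wrap something like `client.server_info()[table_name].current_size`.

```python
import time

def wait_for_data(current_size_fn, min_replay_size, poll_interval=0.01):
    """Block until the replay buffer reports at least min_replay_size items.

    current_size_fn is a hypothetical callable; with Reverb it could wrap
    reverb.Client(...).server_info()[table_name].current_size.
    """
    while current_size_fn() < min_replay_size:
        time.sleep(poll_interval)

# Simulated buffer that grows by one item per poll, standing in for a
# Reverb table that is slowly filled by a slow environment.
sizes = iter(range(10))
wait_for_data(lambda: next(sizes), min_replay_size=5)
# After this returns, calling next() on the dataset iterator is safe.
```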
Hey,
First of all, thanks for this repo, which I find to be an amazing contribution to the RL community!
Recently, an error has started appearing during the execution of my code, something that had not happened before. The agent I use is based on the provided example of the distributed MPO agent, modified to work with sequences and the corresponding adder.
The environment I use is relatively slow (1 s per iteration and 5 s for a complete reset), so I rely on the "wait" operation on the Reverb side to make sure the learner blocks until enough experience is available in the buffer. I intended to achieve this through the _min_replay_size parameter given to the SampleToInsertRatio object I use:
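For reference, that setup might look like the sketch below. The table name, capacity, and the concrete values are assumptions; note that in Reverb's API the minimum-size argument to SampleToInsertRatio is named min_size_to_sample.

```python
import reverb

MIN_REPLAY_SIZE = 1_000    # assumed value of the _min_replay_size parameter
SAMPLES_PER_INSERT = 32.0  # assumed insert/sample ratio

# The rate limiter makes sample() block until min_size_to_sample items
# have been inserted -- the "wait" behaviour relied on above.
limiter = reverb.rate_limiters.SampleToInsertRatio(
    samples_per_insert=SAMPLES_PER_INSERT,
    min_size_to_sample=MIN_REPLAY_SIZE,
    error_buffer=SAMPLES_PER_INSERT * 2.0,
)

replay_table = reverb.Table(
    name='replay_table',                 # assumed table name
    sampler=reverb.selectors.Uniform(),
    remover=reverb.selectors.Fifo(),
    max_size=1_000_000,                  # assumed capacity
    rate_limiter=limiter,
)
```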
make_reverb_dataset(...) is then called against this table to create the Dataset object used in the learner, written as follows in my code, with batch_size and prefetch_size properly set in __init__():
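The dataset creation might look roughly like this sketch; the function and parameter names below follow acme.datasets.make_reverb_dataset, but the server-address variable and the helper wrapping it are hypothetical.

```python
from acme import datasets

def make_learner_dataset(replay_server_address, batch_size, prefetch_size):
    # Hypothetical helper mirroring the __init__() code described above:
    # batch_size and prefetch_size are the values set at construction time.
    dataset = datasets.make_reverb_dataset(
        server_address=replay_server_address,
        batch_size=batch_size,
        prefetch_size=prefetch_size,
    )
    return dataset  # iterated over inside the learner's _step()
```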
Once in graph mode, during subsequent calls to the learner's _step() function, next() on the dataset iterator raises an error; the most recent call in the traceback is:
Following the advice from #152, google-deepmind/reverb#70 and #293 (even though the latter concerns a JAX setup while I work with a TF one), I still get the same error. As the traceback shows, I already do what those issues advise: I pass the tf.data.Dataset object directly to the tf.function, instead of the tf.data.Iterator obtained from iter() as is done in the MPO template example.
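As a minimal, self-contained illustration of that advice (plain tf.data, no Reverb involved), the pattern is to hand the Dataset itself to the tf.function and let iteration happen inside the traced function, rather than capturing an eager iterator created by iter() at the Python level:

```python
import tensorflow as tf

dataset = tf.data.Dataset.range(10).batch(2)

@tf.function
def train_steps(ds, num_steps):
    # Iteration happens inside the traced function, so the dataset is
    # handled as a graph-level input rather than a captured Python-side
    # iterator.
    total = tf.constant(0, dtype=tf.int64)
    for batch in ds.take(num_steps):
        total += tf.reduce_sum(batch)
    return total

result = train_steps(dataset, 3)  # sums batches [0,1], [2,3], [4,5]
```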
I checked whether data is actually sent to the replay buffer, and everything seems fine on that front. Not enough elements are available to the learner when next() is called, which seems logical since the environment takes its time to fill even one sequence (no more than 5 iterations have run when the problematic line is reached). Hence, I suppose the error comes either from the tf.function not honouring the wait operation, or from a misunderstanding of the Reverb Table setup on my side.
Did I miss something regarding buffer control and its min_replay_size parameter? Or is it indeed something related to the tf.function?
Thanks in advance!