Support Hooks ? #505

Closed
NeroBlackstone opened this issue Jun 12, 2023 · 8 comments

@NeroBlackstone
Contributor

Is there any counterpart of hooks in POMDPs.jl?
As far as I know, the only way to get information out of a POMDPs.jl solver is to copy and edit the solver source code (that's my current solution...).
It's hard to plot if we don't have hooks.

I don't know if the maintainer of POMDPs.jl also thinks this is a useful feature.

And I wonder how to implement it.

@zsunberg
Member

So far, we have left the question of accessing additional information up to solver writers. The solve_info and action_info functions in POMDPTools sometimes output additional information.

Can you describe what information you are trying to get out of what solver, and perhaps we can think about generalizing from that point.

@NeroBlackstone
Contributor Author

NeroBlackstone commented Jun 12, 2023

Thanks for your reply.
For example, the solvers in TabularTDLearning.jl evaluate the trained policy every eval_every episodes. We want to get the average reward of the trained policy's trajectories while the algorithm is running.

@NeroBlackstone
Contributor Author

The render function is a similar concept, but it is for the problem rather than the solver.

@zsunberg
Member

I think it's best to try adding some hooks to that particular package and then generalize from that if we can find a way.

In general, one challenge is that we have fairly different types of solvers in the POMDPs.jl ecosystem:

  1. Offline optimization solvers like SARSOP
  2. Online tree search solvers like POMCP and DESPOT
  3. Reinforcement learning solvers like tabular TD learning

The hooks for these different types might be very different.

@NeroBlackstone
Contributor Author

NeroBlackstone commented Aug 9, 2023

Yes, I agree with @zsunberg. Since different types of solvers exist, maybe we will never have a unified solution.

But when I was going to bed last night, I got some inspiration.

We could pass a callback function to the solve function.

like:

function solve(f::Function, solver::QLearningSolver, mdp::MDP)
    # ... training code ...

    f(episode, average_reward)  # report progress to the caller

    # ... more training code ...
end

solve(qsolver, mdp) do episode, average_reward
    # collect data!
end
# plot!

Unfortunately, it's a breaking change. Maybe we could make the callback an optional argument.

But I still think at least we could propose a "hook convention".
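To make the callback idea concrete, here is a minimal, self-contained sketch. Everything in it is a hypothetical stand-in: `ToyQLearningSolver`, `ToyMDP`, and `toy_solve` are not part of POMDPs.jl or TabularTDLearning.jl, and the "average reward" is a placeholder formula rather than a real policy evaluation. It only illustrates how a function-first method enables the do-block syntax.

```julia
# Toy stand-ins, NOT the real POMDPs.jl or TabularTDLearning.jl types.
struct ToyQLearningSolver
    n_episodes::Int
    eval_every::Int
end

struct ToyMDP end

# The callback is the first argument, so callers can use do-block syntax.
function toy_solve(f::Function, solver::ToyQLearningSolver, mdp::ToyMDP)
    for episode in 1:solver.n_episodes
        # ... a real solver would update its Q-table here ...
        if episode % solver.eval_every == 0
            average_reward = 1.0 - 1.0 / episode  # placeholder, not a real evaluation
            f(episode, average_reward)            # hand progress to the caller
        end
    end
    return nothing  # a real solver would return a policy
end

# Collect (episode, average_reward) pairs while the solver is running.
history = Tuple{Int,Float64}[]
toy_solve(ToyQLearningSolver(100, 25), ToyMDP()) do episode, average_reward
    push!(history, (episode, average_reward))
end
```

After the call, `history` holds one entry per evaluation point, which could then be passed to any plotting package.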

@NeroBlackstone
Contributor Author

NeroBlackstone commented Aug 9, 2023

Maybe we could return the data directly from solve_info(), but unlike a callback function, that would not let us get data while the solver is running.

The callback function is useful for long-running algorithms, so we can update plots to visually check the algorithm's status.

I still don't know what the best practice is. Since no solver implements this yet, maybe we could implement one to show the right way.
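For comparison, the "return the data afterwards" alternative might look like the sketch below. It assumes a (policy, info) return shape in the style of solve_info; `ToyTDSolver` and `toy_solve_info` are hypothetical names, and the rewards are placeholder values, not real evaluation results.

```julia
# Hypothetical toy solver; `toy_solve_info` only mimics a (policy, info) return style.
struct ToyTDSolver
    n_episodes::Int
    eval_every::Int
end

function toy_solve_info(solver::ToyTDSolver)
    episodes = Int[]
    rewards = Float64[]
    for episode in 1:solver.n_episodes
        # ... a real solver would do its learning update here ...
        if episode % solver.eval_every == 0
            push!(episodes, episode)
            push!(rewards, episode / solver.n_episodes)  # placeholder evaluation result
        end
    end
    policy = nothing  # placeholder for the trained policy
    return policy, (eval_episodes = episodes, eval_rewards = rewards)
end

# `info` only becomes available here, after training has finished.
policy, info = toy_solve_info(ToyTDSolver(40, 10))
```

This keeps the solver signature unchanged, but nothing can be plotted until `toy_solve_info` returns.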

Feel free to close this issue. :)

@zsunberg
Member

"But when I was going to bed last night, I got some inspiration."

@NeroBlackstone, thanks for using your bedtime thoughts to try to improve this package! :)

In general, I like this proposal, but there is one hard question related to the diversity of solvers: What arguments should be passed to the callback?

I also think that a better first step would be to add callbacks to individual solvers as solver options. For instance, it could be used like this:

solver = NativeSARSOP.SARSOPSolver() do tree, alphas
    # print statistics from the tree or something
end
solve(solver, m)
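One way a solver could support that usage is sketched below. This is an assumption about how such an option might be wired up, not NativeSARSOP's actual constructor: `ToyIterativeSolver`, its `callback` field, and `run_toy_solver` are all hypothetical, and the loop body is a placeholder for real solver work.

```julia
# Hypothetical solver type; this is not NativeSARSOP's actual API.
struct ToyIterativeSolver{F}
    callback::F
    max_iterations::Int
end

# Function-first constructor, so `ToyIterativeSolver() do ... end` works.
ToyIterativeSolver(f::Function; max_iterations::Int = 3) =
    ToyIterativeSolver(f, max_iterations)

# Constructor without a callback: defaults to a function that does nothing.
ToyIterativeSolver(; max_iterations::Int = 3) =
    ToyIterativeSolver((args...) -> nothing, max_iterations)

function run_toy_solver(solver::ToyIterativeSolver, problem)
    value = 0.0
    for iteration in 1:solver.max_iterations
        value += 1.0 / 2^iteration         # placeholder for one solver iteration
        solver.callback(iteration, value)  # the solver decides which arguments to pass
    end
    return value
end

messages = String[]
solver = ToyIterativeSolver(max_iterations = 2) do iteration, value
    push!(messages, "iteration $iteration: value = $value")
end
result = run_toy_solver(solver, nothing)
```

Because the callback lives in the solver object, the solve call itself keeps its existing two-argument form, and each solver is free to choose its own callback arguments.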

One more note:

"Unfortunately, it's a breaking change."

I don't think this is actually a breaking change, because we could define solve(f, solver, m) = solve(solver, m) as a fallback.
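The fallback idea can be checked with a small self-contained sketch. The names here (`ToySolver`, `LegacySolver`, `CallbackAwareSolver`, `demo_solve`) are hypothetical stand-ins rather than POMDPs.jl definitions, and the callback arguments are placeholders.

```julia
# Hypothetical stand-ins for a solver interface; not the real POMDPs.jl definitions.
abstract type ToySolver end
struct LegacySolver <: ToySolver end         # written without callbacks in mind
struct CallbackAwareSolver <: ToySolver end  # opts in to callbacks

demo_solve(solver::LegacySolver, m) = :legacy_policy

# Generic fallback: solvers without callback support simply ignore `f`.
demo_solve(f::Function, solver::ToySolver, m) = demo_solve(solver, m)

# A solver that opts in defines a more specific three-argument method.
function demo_solve(f::Function, solver::CallbackAwareSolver, m)
    f(1, 0.5)  # placeholder arguments; each solver would choose its own
    return :callback_policy
end
demo_solve(solver::CallbackAwareSolver, m) = demo_solve((args...) -> nothing, solver, m)

calls = Tuple{Int,Float64}[]
legacy_result = demo_solve(LegacySolver(), nothing) do i, r
    push!(calls, (i, r))  # never runs: the fallback drops the callback
end
n_calls_after_legacy = length(calls)
aware_result = demo_solve(CallbackAwareSolver(), nothing) do i, r
    push!(calls, (i, r))
end
```

Existing two-argument calls keep working for both solver types, and passing a callback to a solver that does not support it is silently ignored instead of raising a MethodError.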

@zsunberg
Member

Closing for now since this seems to be a solver-specific issue.
