Thanks for asking! Most languages these days are very object oriented. Julia allows for object-oriented programming, but it is often easier to do things in a more functional style. The easiest way to accomplish your initial task of making a Monte Carlo estimate is to write a function and pass it to the solver:

    using Statistics: mean   # mean is not available without Statistics

    function mc_estimate(pomdp, s, h, steps)
        sim = RolloutSimulator(max_steps=steps)
        policy = RandomPolicy(pomdp)
        # Average the returns of 10 independent random rollouts.
        # Note: s and h are unused here, so the simulator picks the starting state.
        return mean(simulate(sim, pomdp, policy) for i in 1:10)
    end

    solver = POMCPSolver(
        estimate_value = mc_estimate
        #...
    )

Now you have complete control over anything that happens within mc_estimate.

(All of your reasoning about RolloutEstimator is correct, but I wouldn't recommend emulating that design pattern; it is from our early days.)
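If you want each rollout to start from the leaf state s instead of a freshly sampled initial state, something along these lines might work. Treat it as a sketch under assumptions rather than the package's recommended approach: the names mc_estimate_from_state and estimate_with_50 are made up for illustration, and the six-argument simulate call, updater(policy), and the POMDPTools import reflect my understanding of the current packages, so check them against the version you have installed.

```julia
using Statistics: mean
using POMDPs        # simulate, updater, initialstate
using POMDPTools    # RolloutSimulator, RandomPolicy (assumed; older setups used POMDPSimulators and POMDPPolicies)

# Hypothetical helper: average n random rollouts that all start from state s.
# h (the belief node) is accepted to match the expected signature but is not used.
function mc_estimate_from_state(pomdp, s, h, steps; n=10)
    sim = RolloutSimulator(max_steps=steps)
    policy = RandomPolicy(pomdp)
    up = updater(policy)       # assumed: RandomPolicy carries a do-nothing updater by default
    b0 = initialstate(pomdp)   # placeholder belief; the random policy does not depend on it
    # Assumed signature: simulate(sim, pomdp, policy, updater, initial_belief, initial_state)
    return mean(simulate(sim, pomdp, policy, up, b0, s) for _ in 1:n)
end

# A closure fixes the number of rollouts while keeping the
# f(pomdp, s, h, steps) signature that estimate_value expects.
estimate_with_50 = (pomdp, s, h, steps) -> mc_estimate_from_state(pomdp, s, h, steps; n=50)
```

You would then pass estimate_value = estimate_with_50 to POMCPSolver. Using a closure like this keeps the number of rollouts configurable without introducing a new estimator type.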
Implementing custom rollout strategies for JuliaPOMDP
Hello all,
I've been trying to wrap my head around the JuliaPOMDP packages for a while now and I need some help and/or advice. Please do keep in mind that I've only just started using Julia and this is likely the cause of most, if not all, of my frustrations.
Goal
At the moment I only want to create a rollout strategy for BasicPOMCP in which multiple random rollouts are run and the average reward across all of them is returned.
Findings
I first looked at the main file. Here I found that you can set the estimation value of the POMCPSolver using the estimate_value argument, which is initially set to RolloutEstimator(RandomSolver(rng)), where the RolloutEstimator calls the RandomSolver to evaluate the leaf nodes. This is where things start getting fuzzy to me.
I'm guessing the RolloutEstimator gets translated into a SolvedRolloutEstimator and then estimate_value is called when a rollout needs to be simulated with RolloutSimulator.
This brings me to the simulate functions for RolloutSimulator. My initial thought was to override this simulate function. However, there are a few concerns I have:
simulate function for the rollout and HistoryRecorder?
If anyone has any advice on how to do this or can tell me if I got anything wrong, please let me know. I know this is messy but it's the best I could do with my current understanding.