Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Heuristic function overestimates cost, leading to suboptimal plans. #9

Open
steadmon opened this issue Jun 16, 2017 · 6 comments
Open

Comments

@steadmon
Copy link

A* with a closed set can select suboptimal plans if the heuristic function overestimates the remaining cost of an intermediate state. The current heuristic of counting the conflicting states can be an overestimate in some cases. Consider the following setup:

Six state values: a, b, c, d, e, f
Initial state: all six values are False
Goal state: all six values are True
Four actions:

  • "setA", no preconditions, sets a:=True, cost of 2
  • "setAB", no preconditions, sets a:=True and b:=True, cost of 1
  • "optimal", precondition a==True and b==False, sets all six state values to True, cost of 1
  • "trap", precondition a==True and b==False, sets all six state values to True, cost of 5

The optimal plan is "setA" followed by "optimal", with a total cost of 3. However, with the current heuristic, the code actually selects a plan of "setAB" followed by "trap", with a total cost of 6. This is because the heuristic overestimates the remaining cost after the action "setA" as having a likely remaining cost of 5, when in fact the remaining cost is only 1.

This can be fixed by changing the heuristic function as follows:

After all actions and costs are known, find the action with the minimum (cost / number of output states) ratio. Save this optimal ratio, and pass it as an additional argument to the heuristic function. The heuristic function should then return the number of conflicting states multiplied by the cost/output ratio.

@stolk
Copy link
Owner

stolk commented Jun 16, 2017

Good point.
Thanks for pointing this out.

@Sorrien
Copy link

Sorrien commented Sep 9, 2018

Has this been resolved?

@stolk
Copy link
Owner

stolk commented Sep 9, 2018

This issue is still present in the current code base.

You could follow steadmon's approach, or for an even quicker fix: returning a constant value 1 (or better: the lowest cost of all actions) for heuristic will also get rid of suboptimal paths.

To thoroughly fix, you would have to determine lowest cost of flipping on a

  • per atom basis
  • separately for both flip directions: 0->1 and 1->0.

Then dynamically, add this cost for each differing atom value based on direction.

This will slow down the calculation of the heuristic of course, so if you can live with occasional sub optimal paths, the current code may be better.

Note: If you decide to go steadmon's route, you will need to use floating point values for the heuristic.

@kampeador
Copy link

#include <stdio.h>
#include <assert.h>

#include "goap.h"
#include "astar.h"

void check_goap(bool err)
{
    assert(err == true);
}

int main(int argc, char* argv[]) {
    actionplanner_t ap;
    goap_actionplanner_clear(&ap);

    // intial state
    worldstate_t ws_initial;
    goap_worldstate_clear(&ws_initial);
    check_goap(goap_worldstate_set(&ap, &ws_initial, "a", false));
    check_goap(goap_worldstate_set(&ap, &ws_initial, "b", false));
    check_goap(goap_worldstate_set(&ap, &ws_initial, "c", false));
    check_goap(goap_worldstate_set(&ap, &ws_initial, "d", false));
    check_goap(goap_worldstate_set(&ap, &ws_initial, "e", false));
    check_goap(goap_worldstate_set(&ap, &ws_initial, "f", false));

    // goal state
    worldstate_t ws_goal;
    goap_worldstate_clear(&ws_goal);
    check_goap(goap_worldstate_set(&ap, &ws_goal, "a", true));
    check_goap(goap_worldstate_set(&ap, &ws_goal, "b", true));
    check_goap(goap_worldstate_set(&ap, &ws_goal, "c", true));
    check_goap(goap_worldstate_set(&ap, &ws_goal, "d", true));
    check_goap(goap_worldstate_set(&ap, &ws_goal, "e", true));
    check_goap(goap_worldstate_set(&ap, &ws_goal, "f", true));

    // actions
    // "setA", no preconditions, sets a:=True, cost of 2
    goap_set_pst(&ap, "setA", "a", true);
    goap_set_cost(&ap, "setA", 2);

    // "setAB", no preconditions, sets a:=True and b:=True, cost of 1
    goap_set_pst(&ap, "setAB", "a", true);
    goap_set_pst(&ap, "setAB", "b", true);
    goap_set_cost(&ap, "setAB", 1);

    // "optimal", precondition a==True and b==False, sets all six state values to True, cost of 1
    goap_set_pre(&ap, "optimal", "a", true);
    goap_set_pre(&ap, "optimal", "b", false);
    goap_set_pst(&ap, "optimal", "a", true);
    goap_set_pst(&ap, "optimal", "b", true);
    goap_set_pst(&ap, "optimal", "c", true);
    goap_set_pst(&ap, "optimal", "d", true);
    goap_set_pst(&ap, "optimal", "e", true);
    goap_set_pst(&ap, "optimal", "f", true);
    goap_set_cost(&ap, "optimal", 1);

    // "trap", precondition a==True and b==False, sets all six state values to True, cost of 5
    goap_set_pre(&ap, "trap", "a", true);
    goap_set_pre(&ap, "trap", "b", false);
    goap_set_pst(&ap, "trap", "a", true);
    goap_set_pst(&ap, "trap", "b", true);
    goap_set_pst(&ap, "trap", "c", true);
    goap_set_pst(&ap, "trap", "d", true);
    goap_set_pst(&ap, "trap", "e", true);
    goap_set_pst(&ap, "trap", "f", true);
    goap_set_cost(&ap, "trap", 5);


    char desc[4096];
    worldstate_t states[16];
    const char* plan[16];
    int plansz = 16;
    const int plancost = astar_plan(&ap, ws_initial, ws_goal, plan, states, &plansz);
    LOGI( "plancost = %d", plancost );
    goap_worldstate_description(&ap, &ws_initial, desc, sizeof(desc));
    LOGI( "%-23s%s", "", desc );
    for (int i = 0; i < plansz && i < 16; ++i)
    {
        goap_worldstate_description(&ap, states+i, desc, sizeof(desc));
        LOGI("%d: %-20s%s", i, plan[i], desc);
    }

    puts("Exit success");
    fflush(stdout);

    return 0;
}

Output:

plancost = 3
                       a,b,c,d,e,f,
0: setA                A,b,c,d,e,f,
1: optimal             A,B,C,D,E,F,

Am I missing something?

@stolk
Copy link
Owner

stolk commented Oct 14, 2022

Maybe OP's example was not a good one, but it is a valid point that overestimating the remaining cost can cause suboptimal plans.

Personally, I don't worry much about those cases. In practice, they probably don't occur often, and if they do, you still end up with a working plan.

If you are hell-bent on always getting the best plan, you can make the search un-directed, and always put the estimate at '1'. But then you get a combinatorial explosion.

Playing with the heuristic is left as an exercise for the reader. Use what works for you. What's in the repo now works for me.

@kampeador
Copy link

It would be nice if issues like this one had practical unit test coverage, rather than theoretical.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants