run.finished not set when using AWS Step Functions and there's an error #1710

Open
wrighting opened this issue Feb 1, 2024 · 0 comments

When I run a flow from AWS Step Functions
And a step throws an exception
And the exception doesn't go away after retrying
Then run.finished isn't True

I'm triggering flows via Step Functions and want to detect when they have finished.

The following polling loop works if the flow is run locally or if there aren't any errors:

    retry_increment = 60
    while not run.finished and (timeout < 0 or total_time < timeout):
        # run is not None here so we know run_ref is OK
        run = Run(run_ref)
        if run.finished:
            break
        steps = list(run.steps())
        if len(steps) > 0:
            last_step = steps[0]
            # Can have multiple attempts after an exception so just log a warning
            if last_step.task and last_step.task.exception is not None:
                logger.warning(
                    f"metaflow failed, check logs for {run_ref}, attempt {last_step.task.current_attempt} {last_step.task.exception}"
                )
            logger.debug(f"In {last_step} for {flow_key}, waiting for {retry_increment}...")
        time.sleep(retry_increment)
        total_time += retry_increment

Breaking out of the loop as soon as an exception is found doesn't work: the step may still succeed on a retry, so that would stop polling too soon.
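
A possible workaround, sketched under assumptions: treat an exception as final only once the last allowed attempt has failed. `max_attempts` is a hypothetical value the caller has to supply (the loop above has no way to learn it), and the sketch assumes `current_attempt` counts from 0. `RunSnapshot`, `TaskSnapshot`, and `wait_for_run` are illustrative names, not Metaflow APIs; in real use `fetch` would wrap `Run(run_ref)` and read `finished`, `exception`, and `current_attempt` as in the loop above.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TaskSnapshot:
    """Hypothetical stand-in for the fields read from last_step.task."""
    exception: Optional[str]
    current_attempt: int  # assumed to start at 0


@dataclass
class RunSnapshot:
    """Hypothetical stand-in for a freshly loaded run."""
    finished: bool
    last_task: Optional[TaskSnapshot]


def wait_for_run(
    fetch: Callable[[], RunSnapshot],
    max_attempts: int,
    timeout: float = -1,
    retry_increment: float = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the run finishes, exhausts its attempts, or times out.

    Returns "finished", "failed", or "timeout". A negative timeout
    means wait indefinitely, matching the loop above.
    """
    total_time = 0.0
    while timeout < 0 or total_time < timeout:
        snap = fetch()
        if snap.finished:
            return "finished"
        task = snap.last_task
        # An exception on an earlier attempt may still be retried,
        # so only give up once the final attempt has raised.
        if (
            task is not None
            and task.exception is not None
            and task.current_attempt + 1 >= max_attempts
        ):
            return "failed"
        sleep(retry_increment)
        total_time += retry_increment
    return "timeout"
```

This only works around the symptom from the caller's side; it does not explain why `run.finished` stays False.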

Once the retries are exhausted:

run.finished is False
run.successful is False

The run does show up as completed in the UI, though.
