feature: on demand docker environments #1796

Open
wants to merge 26 commits into
base: master

Collaborator

@saikonen saikonen commented Apr 10, 2024

Adds support for on-demand Docker images. Open for discussion.

Open TODOs

  • Add a fallback to a conda environment when executing locally with --environment docker
  • Add support for image bakery endpoint authentication
  • Cache bakery image tags locally to avoid environment drift across consecutive deployments when using loosely pinned versions
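
The tag-caching item in the list above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the PR's implementation: the cache file name, `spec_key`, `resolve_image`, and the `bake` callable are all hypothetical names introduced here.

```python
import hashlib
import json
from pathlib import Path

# Hypothetical cache location; not taken from the PR.
CACHE_PATH = Path(".bakery_image_tags.json")


def spec_key(python_version, packages):
    """Stable hash of a (possibly loosely pinned) spec, e.g. {"pandas": ">=2.0"}."""
    payload = json.dumps(
        {"python": python_version, "packages": packages}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def resolve_image(python_version, packages, bake, cache_path=CACHE_PATH):
    """Return the cached image tag for this spec, baking only on a cache miss.

    `bake` is a hypothetical callable that asks the bakery for an image tag.
    """
    cache = json.loads(cache_path.read_text()) if cache_path.exists() else {}
    key = spec_key(python_version, packages)
    if key not in cache:
        cache[key] = bake(python_version, packages)
        cache_path.write_text(json.dumps(cache, indent=2))
    return cache[key]
```

With a cache like this, two consecutive deployments of the same loosely pinned spec would reuse the first resolved tag instead of asking the bakery to resolve the versions again.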

@saikonen saikonen marked this pull request as ready for review April 29, 2024 11:19
@savingoyal
Collaborator

Running python for.py --environment=docker run on a flow without @kubernetes immediately fails with an error that Image Bakery is not configured. Perhaps it shouldn't throw an error in that case?
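
One possible way to defer that error, sketched under assumptions: validate the bakery configuration only when at least one step actually targets Kubernetes. `BakeryNotConfigured`, `validate_bakery_config`, and the `step_decorators` mapping are hypothetical stand-ins, not names from the PR or from Metaflow.

```python
class BakeryNotConfigured(Exception):
    """Stand-in for the PR's exception type (hypothetical)."""


def validate_bakery_config(step_decorators, bakery_url):
    """Raise only if some step targets Kubernetes and no bakery URL is set.

    `step_decorators` maps a step name to the list of decorator names on it.
    Returns the names of the steps that would need a baked image.
    """
    remote = [s for s, decos in step_decorators.items() if "kubernetes" in decos]
    if remote and bakery_url is None:
        raise BakeryNotConfigured(
            "Image Bakery is not configured but is needed by steps: %s"
            % ", ".join(sorted(remote))
        )
    return remote
```

A purely local flow would then pass validation with no bakery URL configured, while a flow with a remote step would still fail early with a clear message.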

# On Demand Docker image build configuration
###
# Image builder service url
DOCKER_IMAGE_BAKERY_URL = from_conf("DOCKER_IMAGE_BAKERY_URL", None)
Collaborator

nit: rename this to FAST_BAKERY_URL, with similar renames below?

    def init_environment(self, echo):
        if self._setup_conda_fallback():
            print(
                "Some steps would execute locally. Had to fallback to a conda environment"
Collaborator

Can we skip the print statement for now? Ideally, we would have a debug flag that prints which kind of environment is being created for each step. We can make that a follow-up and leave a TODO in the code.
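
A minimal sketch of what such a debug flag might look like, assuming an environment variable toggles the output. The flag name `EXAMPLE_DEBUG_ENVIRONMENT` and both helper functions are hypothetical and chosen for this illustration only; they are not an existing Metaflow option.

```python
import os
import sys

# Hypothetical flag name, chosen for this sketch only.
DEBUG_FLAG = "EXAMPLE_DEBUG_ENVIRONMENT"


def debug_enabled(environ=None):
    """True when the hypothetical debug flag is set to a truthy value."""
    environ = os.environ if environ is None else environ
    return environ.get(DEBUG_FLAG, "").lower() in ("1", "true", "yes")


def report_environments(step_envs, environ=None, stream=None):
    """Print one 'step: kind environment' line per step when the flag is set.

    `step_envs` maps a step name to an environment kind such as "conda".
    Returns the lines that were written (empty when the flag is off).
    """
    if not debug_enabled(environ):
        return []
    stream = sys.stderr if stream is None else stream
    lines = [
        "%s: %s environment" % (name, kind)
        for name, kind in sorted(step_envs.items())
    ]
    for line in lines:
        print(line, file=stream)
    return lines
```

By default nothing is printed, so regular runs stay quiet; the per-step report only appears when the flag is explicitly enabled.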

        return True

    def init_environment(self, echo):
        if self._setup_conda_fallback():
Collaborator

It seems that we fall back to conda environments for all steps if any step runs locally. Can we make it so that any step marked to run on Kubernetes still benefits from Fast Bakery?
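
The per-step behavior requested here could look roughly like the sketch below, which decides the environment for each step independently rather than for the whole flow. `choose_environments` and the `step_decorators` mapping are hypothetical names for illustration and do not reflect the PR's actual code.

```python
def choose_environments(step_decorators):
    """Pick an environment kind per step instead of one for the whole flow.

    `step_decorators` maps a step name to the list of decorator names on it.
    Steps marked with "kubernetes" get a baked image; all others fall back
    to a conda environment.
    """
    return {
        step: "fast_bakery" if "kubernetes" in decos else "conda"
        for step, decos in step_decorators.items()
    }


# Example: only the remote step uses a baked image.
envs = choose_environments(
    {"start": [], "train": ["kubernetes"], "end": []}
)
```

Under this scheme a single local step no longer forces every other step onto conda.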

        if self._setup_conda_fallback():
            print(
                "Some steps would execute locally. Had to fallback to a conda environment"
            )
Collaborator

A flow with a single step marked with @kubernetes currently fails:

Some steps would execute locally. Had to fallback to a conda environment
Bootstrapping virtual environment(s) ...
Virtual environment(s) bootstrapped!
2024-05-02 08:48:39.662 Workflow starting (run-id 217204):
2024-05-02 08:48:42.651 [217204/start/1333327 (pid 68470)] Task is starting.
2024-05-02 08:48:47.021 [217204/start/1333327 (pid 68470)] [pod t-f10eeb12-c9h8b-bv7gc] Task is starting (Pod is pending, Container is waiting - ContainerCreating)...
2024-05-02 08:48:46.955 [217204/start/1333327 (pid 68470)] [pod t-f10eeb12-c9h8b-bv7gc] Setting up task environment.
2024-05-02 08:48:56.265 [217204/start/1333327 (pid 68470)] [pod t-f10eeb12-c9h8b-bv7gc] Downloading code package...
2024-05-02 08:48:57.143 [217204/start/1333327 (pid 68470)] [pod t-f10eeb12-c9h8b-bv7gc] Code package downloaded.
2024-05-02 08:48:57.175 [217204/start/1333327 (pid 68470)] [pod t-f10eeb12-c9h8b-bv7gc] Task is starting.
2024-05-02 08:48:57.658 [217204/start/1333327 (pid 68470)] [pod t-f10eeb12-c9h8b-bv7gc] bash: line 1: /conda-prefix/bin/python: No such file or directory
2024-05-02 08:49:00.594 [217204/start/1333327 (pid 68470)] Kubernetes error:
2024-05-02 08:49:00.683 [217204/start/1333327 (pid 68470)] Error (exit code 127). This could be a transient error. Use @retry to retry.
2024-05-02 08:49:00.683 [217204/start/1333327 (pid 68470)]
2024-05-02 08:49:00.912 [217204/start/1333327 (pid 68470)] Task failed.



class FastBakeryException(MetaflowException):
    headline = "Docker Image Bakery ran into an exception"
Collaborator

Minor nit: error messages can also reference Fast Bakery.


2 participants