Replace kfp-notebook/elyra-server with a KFP component to run notebooks #305

Open
ckadner opened this issue Feb 22, 2022 · 0 comments
ckadner commented Feb 22, 2022

Current State:

We currently depend on the elyra-server package with the ExecuteFileOp to run notebooks in a Kubeflow Pipeline. While that works fairly well and provides much flexibility, there are some drawbacks.

The Problem:

One of the biggest drawbacks is the set of Python packages required by elyra, most of which are not essential for generating the pipeline code that runs the notebook. Note that the papermill package, with its dependencies on ipython etc., is only required on the Pod running the notebook, not on the MLX API server pod that generates the pipeline. Many of the remaining packages required by elyra overlap with the packages required by kfp/kfp-tekton, sometimes requiring conflicting and occasionally irreconcilable versions.

Proposed Solution:

Instead of using Elyra to generate the pipelines to run notebooks, we could use a straightforward KFP component, as we have done in the past. An example of a notebook runner component with a sample pipeline exists in KFP:

The Elyra team had considered switching to a KFP component, but they have not investigated it any further. Publishing a component that is married to a base Docker image would probably not meet the needs of most Elyra users (@ptitzler).

Being able to specify the Docker image used to run the notebook is one important consideration. Since MLX uses code templates to generate sample pipelines to launch notebooks, we could use kfp.components.create_component_from_func(), which takes a base_image argument:

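from kfp.components import create_component_from_func

def add(a: float, b: float) -> float:
    """Return the sum of two numbers."""
    return a + b

add_op = create_component_from_func(
    func=add,
    base_image='python:3.7',  # Optional
    output_component_file='add.component.yaml',  # Optional
    packages_to_install=['pandas==0.24'],  # Optional
)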

Or we could use kfp.components.load_component_from_file and replace the image attribute in the ComponentSpec > .. > ContainerSpec. Or, even more crudely, we could use kfp.components.load_component_from_text and replace an ${image} placeholder in a component.yaml.template.
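To make the last option concrete, here is a minimal sketch of the placeholder substitution using only the Python standard library. The template content, the component name, the notebook_url input, the papermill command line, and the image name are all hypothetical and only serve as illustration; the real component.yaml.template would need to be designed for MLX.

```python
from string import Template

# Hypothetical component.yaml.template content; only the ${image}
# placeholder is taken from the proposal above, the rest is illustrative.
COMPONENT_YAML_TEMPLATE = """\
name: Run notebook
inputs:
- {name: notebook_url, type: String}
implementation:
  container:
    image: ${image}
    command: [papermill, {inputValue: notebook_url}, /tmp/output.ipynb]
"""


def render_component_yaml(image: str) -> str:
    """Return the component YAML text with ${image} replaced by `image`."""
    # substitute() raises KeyError if the template contains a placeholder
    # that is not supplied, so a typo in the template fails loudly.
    return Template(COMPONENT_YAML_TEMPLATE).substitute(image=image)


# Assumed image name, for illustration only.
component_text = render_component_yaml("quay.io/example/notebook-runner:latest")
print(component_text)
```

The rendered text could then be handed to kfp.components.load_component_from_text on the API server, which would keep papermill and its dependencies confined to the image that actually runs the notebook.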

Related Issues:

Interested Parties:

@ckadner ckadner self-assigned this Feb 22, 2022
@ckadner ckadner added API Swagger API dependencies Pull requests that update a dependency file question Further information is requested labels Feb 22, 2022