This repository has been archived by the owner on Aug 14, 2022. It is now read-only.
jinnovation/metaflow-flyte

Metaflow-Flyte

Experiments in dispatching Metaflow flows to Flyte.

Getting Started

Notes

Extending Metaflow

Metaflow has specific “extension points” built in. Any third-party package, such as ours, that conforms to Metaflow’s expected conventions and protocols is automatically “injected” into `metaflow` top-level imports.

Some more details here: Netflix/metaflow-extensions-template.

This lets a package like ours define a custom decorator FooDecorator with name @foo and have it importable via:

from metaflow import foo
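
As a rough illustration of the idea (a toy model, not Metaflow's actual loader), the sketch below exposes a decorator class on a module under its declared name. The `inject` helper and the `fake_metaflow` module are hypothetical stand-ins invented for this example; only the `FooDecorator` / `foo` naming comes from the text above.

```python
import types


class FooDecorator:
    """Hypothetical decorator class shipped by an extension package."""

    name = "foo"  # the name under which the decorator should be importable

    def __call__(self, func):
        # Tag the function so we can see which decorator was applied.
        func.decorated_by = self.name
        return func


def inject(module, decorator_classes):
    """Expose an instance of each decorator class on `module` under its `name`."""
    for cls in decorator_classes:
        setattr(module, cls.name, cls())
    return module


# Stand-in for the real top-level package; this is NOT metaflow itself.
fake_metaflow = inject(types.ModuleType("fake_metaflow"), [FooDecorator])


@fake_metaflow.foo
def start():
    return "ok"
```

The point of the sketch is only that the extension author declares a name once and the host package takes care of exposing it, so the user never imports from the extension package directly.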

The same goes for CLI sub-commands, allowing us to define custom CLI trees such as my_subcmd and have them accessible from the typical python my/metaflow/flow.py command tree **with no user intervention**.
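
The same pattern can be sketched for sub-commands. The example below is a simplified model built on the standard library's `argparse`, not the CLI machinery Metaflow actually uses; `build_parser` and the handler function are hypothetical names, and `run` is only a placeholder for a built-in command.

```python
import argparse


def build_parser(extra_subcommands):
    """Build a flow-style CLI and attach extension-provided sub-commands."""
    parser = argparse.ArgumentParser(prog="flow.py")
    subparsers = parser.add_subparsers(dest="command", required=True)

    # Placeholder for a command the host tool ships with.
    subparsers.add_parser("run")

    # Each extension sub-command is registered next to the built-in ones.
    for name, handler in extra_subcommands.items():
        sub = subparsers.add_parser(name)
        sub.set_defaults(handler=handler)
    return parser


def my_subcmd(args):
    return "my_subcmd called"


parser = build_parser({"my_subcmd": my_subcmd})
args = parser.parse_args(["my_subcmd"])
result = args.handler(args)
```

Because the extension commands are merged into the same parser, the user keeps invoking the flow file as usual and simply sees one more sub-command.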

In general, the following observation holds particular value:

Metaflow allows for seamless “injection” of extensions without an associated need for wrapper CLIs, companion SDKs, or sub-classing of specialized types. These extensions range from custom types (exceptions, decorators, etc.) to entire conventions, e.g. company- or platform-specific default values.

The user stories provided at Netflix/metaflow-extensions-template are particularly worth reading.

[1/4] Tasks

Create a local Flyte cluster

First, install `flytectl` and start a local sandbox cluster:

brew install flyteorg/homebrew-tap/flytectl
flytectl sandbox start --source .

[0/0] Submit flow to Flyte cluster

To register the workflow (WIP):

python flows/00-skeleton.py flyte register

To execute the workflow (WIP):

python flows/00-skeleton.py flyte compile

This doesn’t do much currently; it:

  • Converts the Metaflow flow to an imperatively defined Flyte workflow (though “convert” currently just means creating an empty Flyte workflow…);
  • Gets the default launch plan for that Flyte workflow.
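
One piece a non-empty conversion would likely need is an order in which to add steps, so that every step is added after the steps feeding into it. The sketch below shows that ordering only, using the standard library; it does not call Metaflow or flytekit, and the `flow_graph` dictionary and `conversion_order` function are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical flow graph: step name -> names of the steps it transitions to.
flow_graph = {
    "start": ["a", "b"],
    "a": ["join"],
    "b": ["join"],
    "join": ["end"],
    "end": [],
}


def conversion_order(graph):
    """Return step names ordered so each step follows all of its upstream steps."""
    # TopologicalSorter expects node -> predecessors, so invert the edges.
    predecessors = {step: set() for step in graph}
    for step, next_steps in graph.items():
        for nxt in next_steps:
            predecessors[nxt].add(step)
    return list(TopologicalSorter(predecessors).static_order())


order = conversion_order(flow_graph)
```

Walking `order` and adding one workflow node per step would keep every node's inputs available at the time it is added.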

We plan to have it do the following:

Submit flow with only start/end w/ no external dependencies

Submit flow with parameters

Submit and execute flow

Submit and set a schedule on a flow

Could we use the Metaflow default @schedule decorator here? Or is that coupled to AWS Step Functions?

Submit single step to Flyte cluster

[0/2] Reuse Metaflow default decorators for Flyte analogues

Out of Scope

Complications

Metaflow branches vs Flyte domains

Domains are, from what I can tell, intended to be defined as a finite set at the control-plane layer. In other words, a user can’t arbitrarily create domains such as test.foo.

This design decision stands somewhat at odds with Metaflow’s approach to namespacing, which is centered around the --branch option (not to mention user-specific defaults such as user.jjin).
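
The mismatch can be made concrete with a small check. The sketch assumes a cluster configured with Flyte's default domains (development, staging, production); `is_registered_domain` is a hypothetical helper, not a Flyte API.

```python
# Assumption: the cluster uses Flyte's default, finite domain set.
FLYTE_DOMAINS = frozenset({"development", "staging", "production"})


def is_registered_domain(name: str) -> bool:
    """True only for names in the fixed, control-plane-defined domain set."""
    return name in FLYTE_DOMAINS


# Metaflow-style namespaces are free-form, so most of them have no matching domain.
candidates = ["development", "test.foo", "user.jjin"]
rejected = [name for name in candidates if not is_registered_domain(name)]
```

Any mapping from Metaflow namespaces onto Flyte would therefore have to encode the branch somewhere other than the domain, or collapse many branches into a few domains.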

Creating Flyte workflow and registering in same file

Related: flyteorg/flyte#1813.

This issue might impede doing all of the following in a single function or command: taking the Metaflow graph, converting it to a Flyte workflow flyte_wf via the imperative API, and registering flyte_wf.