Describe integration with MLflow #3856
base: main
Conversation
Force-pushed from a8256cf to df922b7
Closes #3541. Signed-off-by: Juan Luis Cano Rodríguez <[email protected]>
Force-pushed from df922b7 to 24c25c2
Note to reviewers: The idea of this page is to complement
The idea is for this page to serve as a brief collection of MLOps use cases, and to use this as a template for future integrations. I'm paging @stichbury as well because I was somewhat careless with the prose in certain parts.
> At the moment it's not straightforward to specify the `run_id`
> to use a specific version of a model for inference within a Kedro pipeline,
> see discussion on [this GitHub issue](https://github.com/Galileo-Galilei/kedro-mlflow/issues/549).
I see people doing it in production for a lot of models, so it is definitely possible ;) I really need to see the code you run to get that error, but essentially you cannot save a model in an existing run, because you would alter it, and that is definitely not the purpose of experiment tracking. However, there is absolutely no problem loading a model from an existing MLflow run (say the spaceflights regressor you've just trained) to run a Kedro pipeline which will log in another MLflow run, or can just be used without logging in MLflow.
> essentially you cannot save a model in an existing run, because you would alter it and this is definitely not the purpose of experiment tracking
That makes a lot of sense. Still chewing on your explanation, will rewrite this part.
```python
"mlflow": ("https://www.mlflow.org/docs/2.12.1/", None),
"kedro-mlflow": ("https://kedro-mlflow.readthedocs.io/en/0.12.2/", None),
```
Not sure what this config actually does; why do we prefer mapping to a specific version over "stable" or "latest"?
Because pinning the version means the links will never break ⭐ and it captures which version of MLflow was used to write the docs.
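For context, the pinned entries above belong in Sphinx's `intersphinx_mapping`; a minimal sketch of the relevant `conf.py` fragment, with everything outside the two mappings assumed:

```python
# docs/source/conf.py (fragment, sketch)
extensions = ["sphinx.ext.intersphinx"]

# Pinned versions: cross-reference links never break, and the mapping
# records which MLflow / kedro-mlflow versions the docs were written against.
intersphinx_mapping = {
    "mlflow": ("https://www.mlflow.org/docs/2.12.1/", None),
    "kedro-mlflow": ("https://kedro-mlflow.readthedocs.io/en/0.12.2/", None),
}
```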
docs/source/integrations/mlflow.md (Outdated)
> - A working Kedro project.
> - The MLflow client properly installed in your project: `pip install mlflow`.
> - (Recommended) An MLflow Tracking Server, which will allow you to quickly inspect the runs with a user-friendly web interface.
Should we mention how to start a basic one with `mlflow ui` etc?
Should be short and wouldn't harm I'd say, good suggestion.
I went back and forth several times on how to structure this page. I like how I ended up making 1 section per use case but I agree there might be other ways. Awaiting @stichbury's take on this.
I found the sectioning useful but agree with @noklam that we usually break into simple and advanced usage. Can I suggest we do the same here and have the structure as follows, but will leave you to decide where the simple/advanced sections fall? Maybe something like this? However, I'm not that attached to this and if you want to stick with what you have, I'd say that's fine, but omit the basic second level "Use cases" header and promote all the following (currently 3rd level) to 2nd level.

- Header
- Prerequisites
- Simple use cases
  - Tracking Kedro pipeline runs in MLflow using Hooks
  - Artifact tracking in MLflow using hooks
- Advanced use cases
  - Complete tracking of Kedro runs in MLflow using …
My biggest gripe with this is that it seems wrong to declare custom hooks as "basic" and using `kedro-mlflow` as "advanced". I can totally see how someone starts with the custom hook ("basic"), then they start making it more complex because they need more functionality, and in the end it becomes way more difficult than just using `kedro-mlflow`.
I can't work on your branch so I've forked and made a PR to commit back to it: #3862. Please take a look, merge what you want, and I can review again when you have the entire page in your preferred final state (see comment about sectioning above).
* Some proposed edits
* Fix some Vale warnings

Signed-off-by: Jo Stichbury <[email protected]>
We agreed to do this 👍🏼 Will make the change today
Thank you, @astrojuanlu, for the excellent manual. Everything is working well. It's a great starting point for exploring Kedro+MLflow. I've left a few minor comments.
> We recommend use of an [MLflow tracking server](https://mlflow.org/docs/latest/tracking/server.html), which enables you to inspect runs through a web interface. This is optional, but strongly recommended, even if you're working locally on a solo project. This documentation assumes that you have a tracking server, whether it's a local one run by yourself or a remote one running on another machine or provided by a cloud vendor.
> You will need to configure your client properly,
> either by setting the `MLFLOW_TRACKING_URI` environment variable
> or by using the {external+mlflow:py:func}`mlflow.set_tracking_uri` function.
I don’t quite understand the idea of that section. I simply run `mlflow ui` and everything works well without any configuration. Perhaps we should include this command in the manual for simplicity, and provide links for those interested in more detailed information.
Yes, I think this is very confusing for users. Actually it is not mandatory to have a tracking server to be able to use the UI, and I think the entire documentation examples should work without one. It is obviously very recommended to have one for many reasons (persistence, collaboration, granular access, separate storage of heavy components from metadata...), but it's like saying "don't use local csv files to store your data but use a database instead" : beginners won't see the immediate need and it's not a big deal for the first use I guess.
You're both completely right, this stems from some misunderstanding I had (I confess my first experience with MLflow was kinda bumpy). I will simplify all this, agreed with @Galileo-Galilei that it's not really needed to introduce this complexity early on.
```python
import mlflow
from kedro.framework.hooks import hook_impl
from mlflow.entities import RunStatus
```
I added the following lines to make it work:

```python
import logging

logger = logging.getLogger(__name__)
```

Perhaps it would be sensible to include these in the manual?
Good catch, will add
> One possible way of doing it is using the {py:meth}`~kedro.framework.hooks.specs.PipelineSpecs.before_pipeline_run` Hook
> to log the `run_params` passed to the Hook.
> An implementation would look as follows:
At this point, it would be beneficial for me to receive a brief explanation of what I should do next. Including a short description that instructs me to create a new file and register it according to the guidelines in the Kedro Hooks introduction would be helpful.
Wanted to avoid adding too much prose here, but at least a link to the Kedro Hooks intro would indeed be useful, will add.
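For readers landing here: registering such a Hook follows the usual Kedro pattern; a sketch of `settings.py`, with `my_project` as a placeholder package name:

```python
# src/my_project/settings.py (fragment, sketch; "my_project" is a placeholder)
from my_project.hooks import MLflowHooks

HOOKS = (MLflowHooks(),)
```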
:::{note}
If you implemented your custom `MLflowHooks` as described in the previous section,
consider disabling them to avoid logging the runs twice.
:::
I think this is confusing for beginners and the warning should be at the beginning of the section. If one reads the section linearly from top to bottom (which is what you usually do 😄) I think it is very confusing to understand that this section suggests to replace the custom hook with `kedro-mlflow` and not add it. Eventually as a beginner I would likely wonder "why the hell did I start to do something that is suggested to be removed in the following section?"

I think an (advanced) example that would be much nicer could be a combination of `kedro-mlflow` and a custom hook (which will for example track something custom that `kedro-mlflow` does not, e.g. the commit sha?), but still benefit from `kedro-mlflow` configuration management because the `kedro-mlflow` hook will run first.
This is a very valid point and goes back to something @stichbury had mentioned to me as well about mixing "explanation" with "how to" content in here.
The first hook is supposed to have pedagogical value but you're right it kind of breaks the flow.
I'll see if I can rework this to keep that first part as an explainer, or if I'll have to drop it completely and do as you say @Galileo-Galilei and evolve from `kedro-mlflow` with custom hooks that are used in combination with the plugin.
> Notice that `kedro-mlflow` assumes 1 Kedro project = 1 MLflow experiment.
> If this does not suit your use case, consider pursuing a more customised approach
> like the one described in the previous section.
Actually this is the default but it is fully customisable in the `mlflow.yml`, so I wouldn't call it an assumption (you can even use a custom resolver to create something very custom and dynamic in the `mlflow.yml` if you want).
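For example, the experiment name can be overridden in `mlflow.yml`; the fragment below is a sketch against the kedro-mlflow 0.12 schema, and the name is a placeholder:

```yaml
# conf/base/mlflow.yml (fragment, sketch)
tracking:
  experiment:
    name: my_custom_experiment  # placeholder; defaults to the package name
```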
Very true. I'll reword this section.
> Notice though that storing artifacts as Python pickles
> prevents you from fully using MLflow native preview capabilities.
Is it really necessary to specify it? Just a nitpick, but maybe it would be nicer to write it the other way around, with a CSV or a PNG example which would render with a nice preview, and say that MLflow offers a preview for some data formats?
Thanks for the review @Galileo-Galilei 🙏🏼 will address your comments ASAP.
Description: See #3541.