[FEATURE] Allow user to retrieve metrics history from SDK #1185

Open
likawind opened this issue Apr 7, 2023 · 0 comments
Labels
enhancement New feature or request

Comments


likawind commented Apr 7, 2023

Requirements

Add the ability to fetch the history of a metric from the SDK.

For the following example flow:

from aqueduct import Client, metric
client = Client()

# Fetch the hotel_reviews table.
db = client.integration("aqueduct_demo")
hotel_reviews = db.sql("select * from hotel_reviews")

# Add a metric for the number of rows in the table.
@metric(output="num rows")
def num_rows(df):
    return len(df)

num_row_metric = num_rows(hotel_reviews)
client.publish_flow("Example Flow", artifacts=num_row_metric)

We can currently fetch the metric value for a particular flow with:

flow = client.flow(flow_name="Example Flow")
flow_run = flow.latest() # can also use `flow.fetch(<workflow_dag_id>)` for historical flows.

m = flow_run.artifact("num rows")
m.get() # Should return `len(hotel_reviews)`.

However, we’d like to be able to fetch the history of a metric here with a history() method.

# This should return a list of all computed metric values up until this run.
# Every run will have a corresponding entry in this list, in reverse 
# chronological order (latest first).
m.history()

That is, if “Example Flow” has run three times, then flow.latest().artifact("num rows").history() should return a list of length three, with each entry containing the value of that metric for its corresponding run.
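
As a rough illustration of the expected ordering, here is a minimal sketch that assumes each run's metric value is stored alongside a timestamp. The `runs` data and the `history_from_runs` helper are hypothetical and not part of the SDK; they only model "latest first, up until this run".

```python
from datetime import datetime


def history_from_runs(runs, up_to=None):
    """Return metric values ordered latest run first.

    `runs` is a hypothetical list of dicts with `created_at` and `value` keys.
    If `up_to` is given, runs created after that time are excluded.
    """
    if up_to is not None:
        runs = [r for r in runs if r["created_at"] <= up_to]
    ordered = sorted(runs, key=lambda r: r["created_at"], reverse=True)
    return [r["value"] for r in ordered]


# Three runs of "Example Flow", deliberately listed out of order.
runs = [
    {"created_at": datetime(2023, 4, 5), "value": 100},
    {"created_at": datetime(2023, 4, 7), "value": 120},
    {"created_at": datetime(2023, 4, 6), "value": 110},
]

print(history_from_runs(runs))  # [120, 110, 100]
print(history_from_runs(runs, up_to=datetime(2023, 4, 6)))  # [110, 100]
```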

Implementation Details

SDK

The history() method should be added to the NumericArtifact class here. Note that history() should only work for artifacts fetched from an existing flow, that is, when self._from_flow_run (code) is set to True. You will need to edit and use the APIClient to make requests to the server.
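
One possible shape for this, as a sketch only: the classes below are simplified stand-ins, not the SDK's real NumericArtifact or APIClient. The `list_artifact_results` method name, its arguments, and its return type are assumptions made for illustration; only the `_from_flow_run` guard comes from the description above.

```python
from typing import Dict, List, Tuple


class FakeAPIClient:
    """Stand-in for the SDK's APIClient. The method below is hypothetical."""

    def __init__(self, results: Dict[Tuple[str, str], List[float]]):
        self._results = results

    def list_artifact_results(self, flow_id: str, artifact_id: str) -> List[float]:
        # Assumed to return values in reverse chronological order (latest first).
        return list(self._results[(flow_id, artifact_id)])


class NumericArtifactSketch:
    """Simplified model of NumericArtifact, reduced to what history() needs."""

    def __init__(self, api_client, flow_id: str, artifact_id: str, from_flow_run: bool):
        self._api_client = api_client
        self._flow_id = flow_id
        self._artifact_id = artifact_id
        self._from_flow_run = from_flow_run

    def history(self) -> List[float]:
        # Only artifacts fetched from an existing flow run have a history.
        if not self._from_flow_run:
            raise RuntimeError(
                "history() is only available for artifacts fetched from a flow run."
            )
        return self._api_client.list_artifact_results(self._flow_id, self._artifact_id)


api = FakeAPIClient({("flow-1", "num rows"): [120, 110, 100]})
m = NumericArtifactSketch(api, "flow-1", "num rows", from_flow_run=True)
print(m.history())  # [120, 110, 100]
```

Raising an error for artifacts that were not fetched from a flow run is one design choice; returning an empty list would also be defensible, but it hides misuse.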

Backend

For this task, you shouldn’t need to change the implementation of the REST API endpoint. In case you’d like a reference, it’s located here. Prepare parses the raw HTTP request into a structured listArtifactResultsArgs, and Perform uses the structured args to run the more complex application logic.

If you do want to change the backend implementation, the typical steps are:

  • Update listArtifactResultsArgs with new parameters if necessary.
  • Parse any new parameters from the raw request (a parameter can come from the path, the query string, or the request body).
  • Consume the parsed parameters in Perform with your updated application logic.
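
The steps above can be sketched as follows. This is only a Python model of the Prepare/Perform split, written in Python to match the SDK examples in this issue; the actual handler lives in the server codebase and is not reproduced here. The `limit` parameter, the `workflowId` and `artifactId` keys, and the function signatures are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ListArtifactResultsArgs:
    workflow_id: str
    artifact_id: str
    limit: Optional[int] = None  # Step 1: hypothetical new parameter.


def prepare(path_params: Dict[str, str], query_params: Dict[str, str]) -> ListArtifactResultsArgs:
    """Step 2: parse and validate the raw request into structured args."""
    limit = None
    raw_limit = query_params.get("limit")
    if raw_limit is not None:
        limit = int(raw_limit)  # Raises ValueError on non-numeric input.
        if limit <= 0:
            raise ValueError("limit must be a positive integer")
    return ListArtifactResultsArgs(
        workflow_id=path_params["workflowId"],
        artifact_id=path_params["artifactId"],
        limit=limit,
    )


def perform(args: ListArtifactResultsArgs, stored_results: List[float]) -> List[float]:
    """Step 3: consume the parsed parameter in the application logic."""
    if args.limit is None:
        return list(stored_results)
    return stored_results[: args.limit]


args = prepare({"workflowId": "w1", "artifactId": "a1"}, {"limit": "2"})
print(perform(args, [120, 110, 100]))  # [120, 110]
```

Keeping validation in the parsing step means the application logic can assume well-formed arguments.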

Running and Testing

Please refer to our CONTRIBUTING.md.

References:

  • The code for the client class is here
  • The code for the flow class is here
  • The code for flow_run.artifact() is here
  • The code for the backend endpoint is here
@likawind likawind added the enhancement New feature or request label Apr 7, 2023