Confused how are you displaying accuracy_score for LogisticRegression / DecisionTree by just printing the logs even when in the code you are storing it into a file #1

Open
rituraj17 opened this issue Jun 4, 2021 · 2 comments

Comments

@rituraj17

rituraj17 commented Jun 4, 2021

Hi @FernandoLpz,

First of all, awesome tutorial and blog!

I am new to Kubeflow Pipelines, so I wanted to know how the accuracy_score for LogisticRegression / DecisionTree ends up in the logs, even though the code stores it in a file.

File path : decision_tree/decision_tree.py

# Get accuracy
accuracy = accuracy_score(y_test, y_pred)

# Save output into file
with open(args.accuracy, 'w') as accuracy_file:
    accuracy_file.write(str(accuracy))

File path: pipeline.py

show_results(decision_tree_task.output, logistic_regression_task.output)

I know that you print the output with the show_results() function.
But before that step, how do you get the value of "decision_tree_task.output"? It should be a file, right?
Shouldn't we read the file first and then print its contents?

@rituraj17 rituraj17 changed the title Confused how are you shoing accuracy_score for LogisticRegression /DecisionTree by just printing the logs even when in the code you are storing it into a file Confused how are you displaying accuracy_score for LogisticRegression / DecisionTree by just printing the logs even when in the code you are storing it into a file Jun 4, 2021
@FernandoLpz
Owner

Hi @rituraj17,

When a component has a single output value (here, decision_tree_task only has accuracy as its output), that value is made available as a "string", "float", etc., as the case may be. That is why I do not need to read the file myself; I only reference the "output" attribute, as in decision_tree_task.output.

If a component has multiple outputs, you access them through the "outputs" attribute (plural), a dict where the "key" is the name of the output and the "value" is its value. For example: decision_tree_task.outputs['accuracy'], decision_tree_task.outputs['precision'], etc.

It is important to mention that an output can hold different types of data. The type is specified in the component's YAML manifest. For example, for decision_tree the accuracy is read as a float, not as a file:
outputs:
- {name: Accuracy, type: Float, description: 'Accuracy metric'}

Also, note that the decision_tree.py script does store the accuracy metric in a file. However, the manifest declares that output as a float, so the contents of the file are what the next step receives as a value.
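
To make the hand-off concrete, here is a minimal sketch in plain Python that imitates the idea. It is not Kubeflow Pipelines code and does not use the kfp SDK: run_component is a hypothetical toy runner, decision_tree_component is a stand-in for decision_tree.py, the 0.93 value is made up, and show_results is simplified to one argument. The point is only that the script writes to a file path it is given, and something outside the script reads that file back and passes the contents on as a value.

```python
import tempfile
from pathlib import Path


def decision_tree_component(accuracy_path: str) -> None:
    """Stand-in for decision_tree.py: writes its metric to the path it receives."""
    # Hypothetical value; the real script computes accuracy_score(y_test, y_pred).
    accuracy = 0.93
    Path(accuracy_path).write_text(str(accuracy))


def run_component(component, output_names):
    """Toy runner: hands the component one file path per declared output,
    then reads every file back and returns the contents keyed by output name."""
    with tempfile.TemporaryDirectory() as workdir:
        paths = {name: str(Path(workdir) / name) for name in output_names}
        component(*paths.values())
        return {name: Path(path).read_text() for name, path in paths.items()}


def show_results(decision_tree: float) -> None:
    """Simplified downstream step: receives the value, never the file."""
    print(f"Decision tree (accuracy): {decision_tree}")


outputs = run_component(decision_tree_component, ["accuracy"])
show_results(float(outputs["accuracy"]))
```

The downstream function never opens a file. By the time it runs, the runner has already turned the file contents into a plain value, which is the role the "output" attribute plays in pipeline.py.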

Let me know if you have any other doubt! 🙂
