Undocumented change in tree_.value example for DecisionTreeClassifier between versions 1.3.2 and 1.4.2 #28921
Comments
Indeed, it should be a side effect of the bug fix in #27639, where we started to store proportions instead of absolute counts. We should therefore update the documentation accordingly.
Hi there! I'm new to open source, but I'm eager to dive in and help out. I came across this issue and I think I can tackle it. Could you point me in the right direction to get started?
Can I work on it?
@Kaustbh feel free to open a PR.
Could anyone please recommend starting points or suggest specific files I should focus on? Any insights or advice on navigating the scikit-learn codebase to find the right areas for modification would be of great help, because the files have hundreds of lines of code 😢
Describe the issue linked to the documentation
In the 1.4.2 docs, the "Understanding the decision tree structure" page provides code and output for inspecting `tree_.value`, but the tree diagram and the output from the code snippet are inconsistent. The diagram shows integer values that represent the number of records in each class at each node.
The new output from the code appears to be the fraction of the records at each node that belong to the respective class.
The 1.3.2 docs were consistent on this page between the code output and the diagram further down that describes the values array, so I expect that something changed between the versions but wasn't documented, at least not in this example.
I can't find where this change to `tree_.value` is documented, and it appears to be causing confusion (see, for example, on Stack Overflow).
Suggest a potential alternative/fix
I would suggest updating the visual and documenting more clearly what to expect from `tree_.value` for `DecisionTreeClassifier` in 1.4.2, since it is evidently different compared to 1.3.2.
I am working with some code that inspects the trees and would appreciate insight to make sure that I make the necessary adjustments to get the same values that I did with 1.3.2.
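For code that needs the 1.3.2-style counts, one possible adjustment is sketched below. It assumes that in 1.4.x `tree_.value` has shape `(node_count, n_outputs, max_n_classes)` and holds per-node class proportions, and that `tree_.weighted_n_node_samples` holds the weighted number of samples reaching each node, so multiplying the two should recover the counts. The helper name `class_counts_per_node` is hypothetical and not part of scikit-learn, and the check for "already counts" is a heuristic chosen for this sketch.

```python
import numpy as np


def class_counts_per_node(tree):
    """Return weighted class counts per node (hypothetical helper).

    `tree` is expected to expose `value` and `weighted_n_node_samples`,
    as the `tree_` attribute of a fitted DecisionTreeClassifier does.
    """
    value = np.asarray(tree.value, dtype=float)
    weights = np.asarray(tree.weighted_n_node_samples, dtype=float)

    # Assumption: if every node's class values sum to 1, `value` holds
    # proportions (1.4.x behaviour). If the values were already counts and
    # still summed to 1, every node weight would be 1 and the multiplication
    # below would leave them unchanged, so scaling is safe in that case too.
    per_node_totals = value.sum(axis=2)
    if np.allclose(per_node_totals, 1.0):
        # Broadcast each node's weight over the output and class axes.
        return value * weights[:, np.newaxis, np.newaxis]

    # Otherwise assume `value` already stores counts (1.3.x behaviour).
    return value
```

With a fitted classifier `clf`, this would be called as `class_counts_per_node(clf.tree_)`. When `sample_weight` is used, the results are weighted counts rather than raw record counts.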