Get difference tree result when converting cat_features to numerical values #2632

Open
ccylance opened this issue Apr 10, 2024 · 2 comments

ccylance commented Apr 10, 2024

catboost version: catboost 1.2
Operating System: macos 14.4.1
CPU: M1
GPU: no

My training data is X = ['C', 'C', 'C', 'C', 'C', 'A', 'C', 'B', 'C', 'B', 'C', 'B', 'A', 'C', 'A', 'A', 'B', 'A', 'A', 'B'] and y = [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1]. I fit the model on this data and get:
[image]
(The prior is set to 0.55 here because CatBoostEncoder sets the prior to the mean value of y.)
After that, I used CatBoostEncoder in sklearn to convert the categorical feature into numerical values:
[image]
This gave catf = [8.25, 4.125, 2.75, 5.8125, 4.65, 8.25, 3.875, 8.25, 3.32142857, 11.625, 2.90625, 12.75, 11.625, 4.25, 12.75, 13.3125, 13.3125, 10.65, 8.875, 10.65]. (The encoded values are multiplied by 15 because the scale value in model1 is 15.)
I then fed this data into the model but got a different result. Is there any extra processing applied to categorical features during training?
[image]
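
Since the original code was posted only as screenshots, here is a minimal text sketch of the encoding step described above. It assumes the running target statistic (sum of earlier targets in the same category + prior) / (number of earlier rows in that category + 1), applied in the given row order with no shuffling. The helper name `ordered_target_stat` and the smoothing parameter `a` are illustrative names, not part of any library. Under these assumptions it reproduces the catf values listed above; it is not claimed to match what CatBoost does internally during training.

```python
from collections import defaultdict


def ordered_target_stat(categories, targets, prior, a=1.0):
    """Encode each row using only the targets of EARLIER rows in the same category.

    Hypothetical helper for illustration: value = (sum_prev + prior * a) / (count_prev + a).
    """
    sums = defaultdict(float)   # running sum of targets per category
    counts = defaultdict(int)   # running row count per category
    encoded = []
    for cat, target in zip(categories, targets):
        # Encode first, then update, so a row never sees its own target.
        encoded.append((sums[cat] + prior * a) / (counts[cat] + a))
        sums[cat] += target
        counts[cat] += 1
    return encoded


X = ['C', 'C', 'C', 'C', 'C', 'A', 'C', 'B', 'C', 'B',
     'C', 'B', 'A', 'C', 'A', 'A', 'B', 'A', 'A', 'B']
y = [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1]

prior = sum(y) / len(y)   # mean of y, 0.55 for this data
scale = 15                # the factor the post applies to the encoded values

catf = [scale * v for v in ordered_target_stat(X, y, prior)]
print([round(v, 8) for v in catf])
```

The update happens after the value is appended, which is what keeps each row's own label out of its encoding.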

@andrey-khropov andrey-khropov changed the title Get difference tree result when convet cat_features to numerical value Get difference tree result when converting cat_features to numerical values Apr 10, 2024
Member

andrey-khropov commented Apr 10, 2024

> (multiply 15 here because the scale value in model1 is 15)

Why? What does the model scale have to do with it?

@andrey-khropov
Member

P.S. Please insert code as text so that it is searchable and the code examples can be copy-pasted.
