CatBoost for Apache Spark AUC eval metric not working as expected. #2654
Comments
Some additional information from catboost_training.json:
Can you train on the same dataset (or create another dataset that reproduces the problem) with local training, i.e. the plain Python or R package without Spark, and check whether the result is the same? It may be that CatBoost simply cannot train a good model on this dataset. Do other GBDT packages such as LightGBM or XGBoost produce significantly better results?
With the local version, the AUC on the eval set is about 0.74, so I don't think the dataset is the cause. I'll also evaluate the fitted model on the eval set to check whether the result matches the value reported in the log.
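One way to cross-check the AUC reported in the log is to compute it independently from the eval-set labels and the model's predicted scores. The following is a minimal sketch in plain Python, not CatBoost's own implementation: the function name `roc_auc` is hypothetical, labels are assumed to be 0/1, and in practice the two lists would come from the fitted model's predictions on the eval set.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Binary ROC AUC via the Mann-Whitney U statistic (ties get average ranks)."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined when only one class is present")

    # Sort indices by score, then assign 1-based ranks, averaging over ties.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # U statistic for the positive class, normalised by the number of pairs.
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = pos_rank_sum - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


if __name__ == "__main__":
    # Toy data only; replace with eval-set labels and predicted probabilities.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

If the value computed this way on the Spark-trained model's eval-set predictions is far from the logged one, that would point at the metric reporting rather than at the model itself.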
Problem: The eval metric for catboost_spark.CatBoostClassifier does not work as expected when it is set to "AUC".
catboost version: 1.2.5
Operating System: CentOS Linux release 7.9.2009
CPU: Intel(R) Xeon(R) Gold 6248R CPU @ 3.00GHz
GPU: Not installed.
My code is like:
The training log is like this:
which indicates that the model is pretty much randomly predicting the result.
After removing .setEvalMetric("AUC"), the trace is: