Tensor Search Helpers Should Be Unreachable #2647

Open
8W9aG opened this issue Apr 23, 2024 · 8 comments

Comments

@8W9aG

8W9aG commented Apr 23, 2024

Problem: CatBoostError: /src/catboost/catboost/private/libs/algo/tensor_search_helpers.cpp:99: This should be unreachable
catboost version: 1.26.4
Operating System: Ubuntu Linux
CPU: Intel x86
GPU: NVIDIA

This error appears intermittently when I run my pipeline, and I can't see a pattern in the failures. Is this a known issue with a workaround? At the moment I just catch CatBoostError and continue with a different study.

@andrey-khropov
Member

catboost version: 1.26.4

Please specify a correct version. CatBoost does not have version '1.26.4' yet.

@8W9aG
Author

8W9aG commented Apr 24, 2024

Apologies, it's 1.2.3

@andrey-khropov
Member

At the moment I just catch CatBoostError and continue with a different study.

If you can provide a set of hyperparameters for which this error occurs, that would be helpful. A fully reproducible code example would be even more helpful.
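One way to collect the failing hyperparameters automatically is to wrap the call and append the parameters to a log file whenever it raises. The following is only a minimal sketch: `run_and_record_failure`, its arguments, and the JSON Lines log format are hypothetical names chosen for illustration and are not part of CatBoost.

```python
import json
import time


def run_and_record_failure(fn, params, log_path, exc_types=(Exception,)):
    """Call fn(); on failure, append params and the error text to a JSON Lines file.

    Hypothetical helper (not a CatBoost API). Returns fn()'s result,
    or None if one of exc_types was raised.
    """
    try:
        return fn()
    except exc_types as err:
        record = {
            "timestamp": time.time(),  # seconds since the epoch
            "error": str(err),
            "params": params,
        }
        # default=str keeps the dump from failing on values that are not
        # JSON-serializable (for example NumPy scalars).
        with open(log_path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps(record, default=str) + "\n")
        return None
```

In a pipeline like the one described here, `fn` could be a lambda that wraps the `model.select_features(...)` call, with `exc_types=(CatBoostError,)`, so that each failure leaves one line of hyperparameters that can be attached to the issue.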

@8W9aG
Author

8W9aG commented Apr 25, 2024

The hyperparameters where this has occurred in the last 5 instances are as follows:

'learning_rate': 0.024307262528096778, 'depth': 3, 'l2_leaf_reg': 6.906966305187461, 'boosting_type': 'Plain', 'iterations': 1538, 'task_type': 'GPU', 'devices': '0' + 'num_features_to_select': 211, 'steps': 1

'learning_rate': 0.096895104759761, 'depth': 4, 'l2_leaf_reg': 9.694786305854311, 'boosting_type': 'Plain', 'iterations': 2113, 'task_type': 'GPU', 'devices': '0' + 'num_features_to_select': 250, 'steps': 3

'learning_rate': 0.05149527087891092, 'depth': 4, 'l2_leaf_reg': 8.187545785164538, 'boosting_type': 'Plain', 'iterations': 2495, 'task_type': 'GPU', 'devices': '0' + 'num_features_to_select': 256, 'steps': 10

'learning_rate': 0.096895104759761, 'depth': 4, 'l2_leaf_reg': 9.694786305854311, 'boosting_type': 'Plain', 'iterations': 2113, 'task_type': 'GPU', 'devices': '0' + 'num_features_to_select': 250, 'steps': 3

'learning_rate': 0.05149527087891092, 'depth': 4, 'l2_leaf_reg': 8.187545785164538, 'boosting_type': 'Plain', 'iterations': 2495, 'task_type': 'GPU', 'devices': '0' + 'num_features_to_select': 256, 'steps': 10

I'll see if I can put together a concise case that doesn't leak too much data. The odd thing is that it usually happens within 10-20 iterations, seemingly at random.

@andrey-khropov
Member

'num_features_to_select'

Are you calling select_features?

@8W9aG
Author

8W9aG commented Apr 25, 2024

I am. The call looks like this:

try:
    summary = model.select_features(
        train_pool,
        features_for_select=X_train.columns.values,
        num_features_to_select=catboost_select_features_params[
            NUM_FEATURES_TO_SELECT_KEY
        ],
        steps=catboost_select_features_params[STEPS_KEY],
        algorithm=EFeaturesSelectionAlgorithm.RecursiveByShapValues,
        shap_calc_type=EShapCalcType.Regular,
        train_final_model=True,
        logging_level="Verbose",
    )
except CatBoostError as e:
    print(f"CatBoostError: {e}")
    return None, [], {}

@8W9aG
Author

8W9aG commented May 1, 2024

I have a reproducible test case:

data.zip

With the following code:

from catboost import CatBoostRegressor  # type: ignore
from catboost import EFeaturesSelectionAlgorithm, EShapCalcType, Pool
import pandas as pd

params = {'catboost': {'learning_rate': 0.024517856609649665, 'depth': 4, 'l2_leaf_reg': 3.279624422858039, 'boosting_type': 'Ordered', 'iterations': 720}, 'catboost_select_features': {'num_features_to_select': 0.11037514116430513, 'steps': 7}, 'task_type': 'GPU', 'devices': '0'}
X = pd.read_parquet("X.parquet")
y = pd.read_parquet("y.parquet").squeeze()
W = pd.read_parquet("W.parquet").squeeze()
model = CatBoostRegressor(**params["catboost"])
train_pool = Pool(data=X, label=y, weight=W)
catboost_select_features_params = params["catboost_select_features"]
summary = model.select_features(
    train_pool,
    features_for_select=X.columns.values,
    num_features_to_select=max(
        1,
        int(
            catboost_select_features_params["num_features_to_select"]
            * len(X.columns.values)
        ),
    ),
    steps=catboost_select_features_params["steps"],
    algorithm=EFeaturesSelectionAlgorithm.RecursiveByShapValues,
    shap_calc_type=EShapCalcType.Regular,
    train_final_model=True,
)

Happens on Linux and Mac.

@8W9aG
Author

8W9aG commented May 1, 2024

I played around with this: if I clip the weights as follows so that 0.0 never appears, I don't see this issue:

W = W.clip(lower=0.00000001)
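To check whether a given dataset is exposed to this, it may help to count the zero weights before calling select_features. The sketch below is dependency-free and accepts any iterable of numbers, including a pandas Series. The helper names `summarize_weights` and `clip_weights` are hypothetical. The thread only shows that clipping made the error disappear for this dataset, so treating zero weights as the root cause is an assumption.

```python
import math


def summarize_weights(weights):
    """Count weights that are zero, negative, or non-finite (NaN or infinity)."""
    values = [float(w) for w in weights]
    return {
        "total": len(values),
        "zero": sum(1 for v in values if v == 0.0),
        "negative": sum(1 for v in values if v < 0.0),
        "non_finite": sum(1 for v in values if not math.isfinite(v)),
    }


def clip_weights(weights, lower=1e-8):
    """Return a list with every weight raised to at least `lower`.

    Mirrors W.clip(lower=0.00000001) from the comment above.
    NaN values are not repaired here and pass through unchanged.
    """
    return [max(float(w), lower) for w in weights]
```

If `summarize_weights(W)["zero"]` is greater than zero, the dataset matches the condition under which the error was observed here, and clipping (or dropping those rows) is the workaround to try.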
