Permutation importance calculations of multilevel models #138

BELONOVSKII · 2023-10-23T15:44:32Z

🐛 Bug

Problem

The functions calc_one_feat_imp and calc_feats_permutation_imps in lightautoml/automl/presets/utils.py fail with a KeyError on multilevel models (see the traceback below).
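The failure mode can be approximated without LightAutoML. This is a minimal sketch, assuming that the list of used features contains the name of a first-level prediction column that is not a column of the input DataFrame. The toy columns `age` and `income` are hypothetical; the prediction name is taken from the traceback.

```python
import numpy as np
import pandas as pd

# Toy input data: only the raw features exist as columns (hypothetical names).
data = pd.DataFrame({"age": [21, 35, 48], "income": [30.0, 52.5, 71.0]})

# Assumed shape of the "used features" list for a multilevel model: raw
# features plus a first-level prediction name (copied from the traceback).
used_feats = ["age", "income", "Lvl_0_Pipe_0_Mod_0_LinearL2_prediction_0"]

rng = np.random.default_rng(0)
missing = []
for feat in used_feats:
    try:
        initial_col = data[feat].copy()  # same column access as in calc_one_feat_imp
        data[feat] = rng.permutation(data[feat].values)
        data[feat] = initial_col  # restore the original column
    except KeyError:
        # The prediction column is not part of the input data.
        missing.append(feat)

print(missing)
```

In this sketch only the prediction name ends up in `missing`, which matches the key reported at the end of the traceback.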

To Reproduce

Fit a TabularAutoML with a multiclass Task and call get_feature_scores('accurate', df).

Traceback

KeyError Traceback (most recent call last)
Cell In[63], line 1
----> 1 accurate_fi = automl.get_feature_scores('accurate', test_data, silent=True)
2 accurate_fi.set_index('Feature')['Importance'].plot.bar(figsize = (30, 10), grid = True)

File ~/LightAutoML/lightautoml/automl/presets/tabular_presets.py:837, in TabularAutoML.get_feature_scores(self, calc_method, data, features_names, silent)
835 data, _ = read_data(data, features_names, self.cpu_limit, read_csv_params)
836 used_feats = self.collect_used_feats()
--> 837 fi = calc_feats_permutation_imps(
838 self,
839 used_feats,
840 data,
841 self.reader.target,
842 self.task.get_dataset_metric(),
843 silent=silent,
844 )
845 return fi

File ~/LightAutoML/lightautoml/automl/presets/utils.py:38, in calc_feats_permutation_imps(model, used_feats, data, target, metric, silent)
35 feat_imp = []
36 for it, f in enumerate(used_feats):
37 feat_imp.append(
---> 38 calc_one_feat_imp(
39 (it + 1, n_used_feats),
40 f,
41 model,
42 data,
43 norm_score,
44 target,
45 metric,
46 silent,
47 )
48 )
49 feat_imp = pd.DataFrame(feat_imp, columns=["Feature", "Importance"])
50 feat_imp = feat_imp.sort_values("Importance", ascending=False).reset_index(drop=True)

File ~/LightAutoML/lightautoml/automl/presets/utils.py:14, in calc_one_feat_imp(iters, feat, model, data, norm_score, target, metric, silent)
13 def calc_one_feat_imp(iters, feat, model, data, norm_score, target, metric, silent):
---> 14 initial_col = data[feat].copy()
15 data[feat] = np.random.permutation(data[feat].values)
17 preds = model.predict(data)

File ~/LAMA_venv3_8/lib/python3.8/site-packages/pandas/core/frame.py:3807, in DataFrame.__getitem__(self, key)
3805 if self.columns.nlevels > 1:
3806 return self._getitem_multilevel(key)
-> 3807 indexer = self.columns.get_loc(key)
3808 if is_integer(indexer):
3809 indexer = [indexer]

File ~/LAMA_venv3_8/lib/python3.8/site-packages/pandas/core/indexes/base.py:3804, in Index.get_loc(self, key, method, tolerance)
3802 return self._engine.get_loc(casted_key)
3803 except KeyError as err:
-> 3804 raise KeyError(key) from err
3805 except TypeError:
3806 # If we have a listlike key, _check_indexing_error will raise
3807 # InvalidIndexError. Otherwise we fall through and re-raise
3808 # the TypeError.
3809 self._check_indexing_error(key)

KeyError: 'Lvl_0_Pipe_0_Mod_0_LinearL2_prediction_0'
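One possible workaround is to permute only the features that are actually columns of the input data. The sketch below is a standalone illustration, not the change made in the linked pull request and not LightAutoML's implementation. The function `permutation_importances`, the toy model, and the metric are all hypothetical.

```python
import numpy as np
import pandas as pd


def permutation_importances(predict, data, target, metric, used_feats, seed=0):
    """Return the score drop per feature after shuffling that feature.

    Names in used_feats that are not columns of data are skipped.
    The data frame is modified in place and restored after each feature.
    """
    rng = np.random.default_rng(seed)
    base_score = metric(data[target].values, predict(data))
    # Guard: keep only features present in the input data.
    present = [f for f in used_feats if f in data.columns]
    rows = []
    for feat in present:
        initial_col = data[feat].copy()
        data[feat] = rng.permutation(data[feat].values)
        score = metric(data[target].values, predict(data))
        data[feat] = initial_col  # restore the original column
        rows.append((feat, base_score - score))
    out = pd.DataFrame(rows, columns=["Feature", "Importance"])
    return out.sort_values("Importance", ascending=False).reset_index(drop=True)


# Toy setup: the "model" simply returns column x, and the target equals x.
df = pd.DataFrame({"x": np.arange(20.0), "noise": np.zeros(20), "y": np.arange(20.0)})


def toy_predict(d):
    return d["x"].values


def neg_mse(y_true, y_pred):
    # Higher is better, so a drop in score means the feature mattered.
    return -np.mean((y_true - y_pred) ** 2)


fi = permutation_importances(
    toy_predict,
    df,
    "y",
    neg_mse,
    ["x", "noise", "Lvl_0_Pipe_0_Mod_0_LinearL2_prediction_0"],
)
print(fi)
```

Skipping absent names avoids the KeyError, but it also means first-level predictions get no importance score of their own. Whether that is the desired behaviour for multilevel models is a design decision for the maintainers.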

BELONOVSKII added the bug ("Something isn't working") label Oct 23, 2023

BELONOVSKII linked a pull request Oct 23, 2023 that will close this issue

Permutation importance bug fixed #139 (Open)
