
[python-package] Can't retrieve best_iteration after Optuna optimization #6384

Open
YingJie-Zhao opened this issue Mar 26, 2024 · 18 comments
@YingJie-Zhao

YingJie-Zhao commented Mar 26, 2024

Environment info

LightGBM version or commit hash: 4.1.0
Optuna version: 3.6.0
Optuna_Integration version: 3.6.0

Command(s) you used to install LightGBM

pip install lightgbm

Description

Hi there, I am new to LightGBM and couldn't find any useful solution on Google/Stack Overflow/GitHub issues, so I wondered if posting a new issue would be helpful. Pardon me for the inappropriate behavior, since I'm using an 'issue' to ask a 'question'.

Here's my problem:

I was using Optuna to optimize my LightGBM model, together with the LightGBM callback early_stopping(50) to stop the iterations early. I set the best model inside the optimization loop and retrieved the best model (best booster) from the user_attr. Since the early-stopping callback was set, the training logs showed output like this:

Early stopping, best iteration is:
[30]	train_set's auc: 0.982083	valid_set's auc: 0.874471
Training until validation scores don't improve for 100 rounds

Assuming that the value valid_set's auc: 0.874471 above was indeed the best value across all iterations, best_iteration should be [30], as shown above.
However, invoking best_model.best_iteration returns -1, like this:

In: print(best_model.best_iteration)
Out: -1

My question is: How can I get the correct best_iteration value from the best model retrieved from study object?

Thanks to whoever may solve my problem!
Looking forward to your reply :)

Reproducible example

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
import lightgbm as lgb
import optuna
from lightgbm import early_stopping

dataset = load_breast_cancer()
x_train, x_test, y_train, y_test = train_test_split(dataset.data, dataset.target, test_size=0.2)

def objective(trial, train_set, valid_set, num_iterations): 
    params = {
        'objective':'binary', 
        'metric': ['auc'],
        'verbosity':-1, 
        'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.5)
    }

    pruning_callback = optuna.integration.LightGBMPruningCallback(trial, 'auc', valid_name='valid_set')
    
    model = lgb.train(
        params,
        num_boost_round=num_iterations,
        train_set=train_set, 
        valid_sets=[train_set, valid_set], 
        valid_names=['train_set', 'valid_set'],
        callbacks=[pruning_callback, early_stopping(50)]
    )
    
    trial.set_user_attr(key='best_booster', value=model)

    prob_pred = model.predict(x_test, num_iteration=model.best_iteration)
    return roc_auc_score(y_test, prob_pred, labels=[0,1])

train_set = lgb.Dataset(x_train, label=y_train)
valid_set = lgb.Dataset(x_test, label=y_test)
func = lambda trial: objective(trial=trial, train_set=train_set, valid_set=valid_set, num_iterations=num_iterations)

num_iterations = 100
study = optuna.create_study(
    pruner=optuna.pruners.HyperbandPruner(), 
    direction='maximize'
)

def save_best_booster(study, trial):
    if study.best_trial.number == trial.number:
        study.set_user_attr(key='best_booster', value=trial.user_attrs['best_booster'])

study.optimize(
    func, 
    n_trials=30,
    show_progress_bar=True,
    callbacks=[save_best_booster]
)

trial = study.best_trial
best_model = study.user_attrs['best_booster']

print(best_model.best_iteration)
@jameslamb jameslamb changed the title Can't retrieve best_iteration after Optuna optimization [python-package] Can't retrieve best_iteration after Optuna optimization Mar 26, 2024
@jameslamb
Collaborator

Thanks for using LightGBM.

We'd be happy to help you, but we'd really appreciate it if you could reduce this to a smaller, self-contained example that demonstrates the issue. Consider the strategies in this guide: https://stackoverflow.com/help/minimal-reproducible-example.

For example:

  1. what version of optuna are you using?
  2. does it really require searching values of lambda_l1, lambda_l2, and all those other parameters to reproduce this behavior? If not, remove them from the code sample.
  3. what is the shape and distribution of the data?

We'd really appreciate it if you could, for example, create an example that could be copied and pasted with 0 modification by someone trying to help you. For example, start from this for binary classification:

import lightgbm as lgb
from sklearn.datasets import make_blobs

X, y = make_blobs(n_samples=10_000, centers=[[-4, -4], [-4, 4]])

And then fill in the least additional code necessary to show the problem you're asking for help with.

@YingJie-Zhao
Author

@jameslamb Really appreciate your useful advice! I have updated my issue and removed the unnecessary code; the newly updated code can be run directly without any modification.

@jameslamb
Collaborator

Thank you so much for that! One of us will try to look into this soon and help. If you find anything else while investigating, please post it here.

@YingJie-Zhao
Author

Thanks for the help!

I did find something that might be helpful.

When I added a print(model.best_iteration) inside the objective function, the model's best_iteration was printed correctly, as expected. Seems like the best_model saved in the study object was modified unexpectedly by Optuna?

Code:

def objective():
    # other code
    print('Inner best iteration', model.best_iteration)

# other code
trial = study.best_trial
best_model = study.user_attrs['best_booster']
print('Outer best iteration', best_model.best_iteration)

Output:

Inner best iteration 20
Inner best iteration 33
Inner best iteration 36
Outer best iteration -1

@jameslamb
Collaborator

Interesting! I might be able to provide more information on that later.

Also, I just noticed you've double-posted this on Stack Overflow as well: https://stackoverflow.com/questions/78223783/cant-retrieve-best-iteration-in-lightgbm.

Please don't do that. Maintainers here also monitor the [lightgbm] tag on Stack Overflow. I could have been spending time preparing an answer here while another maintainer was spending time answering your Stack Overflow post, which would have been a waste of maintainers' limited attention that could otherwise have been spent improving this project. Double-posting also makes it less likely that others with a similar question will find the relevant discussion and answer.

@YingJie-Zhao
Author

Oops. Sorry for the inconvenience! I will delete the double-posted Stack Overflow question right away.

@jmoralez
Collaborator

Since this is done at the end of training

if not keep_training_booster:
    booster.model_from_string(booster.model_to_string()).free_dataset()

I believe best_model.current_iteration() should match the best iteration.
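
For example (a minimal sketch of that workaround, reusing the variables from the reproducible example above, and assuming the stored booster was trimmed to the best iteration when train() serialized it):

best_model = study.user_attrs['best_booster']
# The reloaded booster only contains trees up to the old best iteration,
# so current_iteration() recovers the value best_iteration used to hold.
n_trees = best_model.current_iteration()
prob_pred = best_model.predict(x_test, num_iteration=n_trees)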

@YingJie-Zhao
Author

@jmoralez Thanks for your reply. Sorry, I might not fully understand.

As you said, best_model.current_iteration() does indeed match the best iteration, but I still want to know why invoking best_model.best_iteration returns -1 instead of the correct best iteration value.

Did you mean that the model returned from the train() API loses its best_iteration attribute when booster.model_from_string(booster.model_to_string()).free_dataset() is invoked?

Assuming best_iteration is lost after returning from the train() API, why does print('Inner best iteration', model.best_iteration) work fine and print best_iteration as expected?

@jmoralez
Collaborator

I meant that I didn't know why that was removed, but that you could use current_iteration instead. Looking a bit closer at your example, you try to get the attribute from the study, not the trial. Can you try the following instead?

best_model = study.best_trial.user_attrs['best_booster']

@YingJie-Zhao
Author

I tried, but it's still not working.

best_model_1 = study.best_trial.user_attrs['best_booster']
print('===BEST MODEL 1===')
print('Best iteration', best_model_1.best_iteration)
print('Current iteration', best_model_1.current_iteration())
print('Memory ID with best_model_1:', id(best_model_1))

best_model_2 = study.user_attrs['best_booster']
print('===BEST MODEL 2===')
print('Best iteration', best_model_2.best_iteration)
print('Current iteration', best_model_2.current_iteration())
print('Memory ID with best_model_2:', id(best_model_2))

Output

===BEST MODEL 1===
Best iteration -1
Current iteration 46
Memory ID with best_model_1: 140175320209488
===BEST MODEL 2===
Best iteration -1
Current iteration 46
Memory ID with best_model_2: 140175320168576

It shows that the best_model retrieved in two different ways comes up with the same values for best_iteration and current_iteration(), but has a different memory id.

@jmoralez
Collaborator

jmoralez commented Mar 27, 2024

I think this is a question for the Optuna folks; the only place I see where we set best iteration to -1 is in the __init__ method of the booster:

self.best_iteration = -1

I don't know what they do to the user attributes that would result in that line being run.

I'd still suggest using the Booster.current_iteration method for your purposes, since the model is trimmed to contain only up to the best iteration. You could also save the best iteration as a separate attribute inside the objective function, as sketched below.
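
For example, here's a minimal sketch of that last suggestion, reusing the names from the reproducible example above (x_test, y_test, early_stopping, etc.; the pruning callback is omitted for brevity). The plain integer survives Optuna's copying of user attributes even though the Booster attribute does not:

def objective(trial, train_set, valid_set, num_iterations):
    params = {
        'objective': 'binary',
        'metric': ['auc'],
        'verbosity': -1,
        'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.5)
    }
    model = lgb.train(
        params,
        num_boost_round=num_iterations,
        train_set=train_set,
        valid_sets=[train_set, valid_set],
        valid_names=['train_set', 'valid_set'],
        callbacks=[early_stopping(50)]
    )
    trial.set_user_attr(key='best_booster', value=model)
    # Save the int separately: copying the Booster resets its
    # best_iteration to -1, but a plain int is copied unchanged.
    trial.set_user_attr(key='best_iteration', value=model.best_iteration)
    prob_pred = model.predict(x_test, num_iteration=model.best_iteration)
    return roc_auc_score(y_test, prob_pred, labels=[0, 1])

After study.optimize(...) finishes, the value can be read back with study.best_trial.user_attrs['best_iteration'].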

@YingJie-Zhao
Author

Thank you very much for your patience. Saving the best iteration inside the objective function is a great idea!

Actually, retrieving the exact value of the best iteration is no longer a problem for me, since the current_iteration() method works properly. Now I just want to find out the root cause of this situation.

I will raise this problem with Optuna later and keep updating here.

@nzw0301

nzw0301 commented Mar 27, 2024

Hi from the Optuna community. When I call copy.copy(model) (which, by the way, is what Optuna calls when we set a user_attr) and copy.deepcopy(model) inside the objective, both copied models' best_iteration values become -1. I think the copying is the root of the issue, but I'm not sure, so sorry if I'm wrong.

import copy

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
import lightgbm as lgb
import optuna
import optuna_integration
from lightgbm import early_stopping

dataset = load_breast_cancer()
x_train, x_test, y_train, y_test = train_test_split(dataset.data, dataset.target, test_size=0.2)


def objective(trial, train_set, valid_set, num_iterations): 
    params = {
        'objective':'binary', 
        'metric': ['auc'],
        'verbosity':-1, 
        'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.5)
    }

    pruning_callback = optuna_integration.LightGBMPruningCallback(trial, 'auc', valid_name='valid_set')
    
    model = lgb.train(
        params,
        num_boost_round=num_iterations,
        train_set=train_set, 
        valid_sets=[train_set, valid_set], 
        valid_names=['train_set', 'valid_set'],
        callbacks=[pruning_callback, early_stopping(50)]
    )
    
    print(model.best_iteration, copy.copy(model).best_iteration, copy.deepcopy(model).best_iteration)
    
    prob_pred = model.predict(x_test, num_iteration=model.best_iteration)
    return roc_auc_score(y_test, prob_pred, labels=[0,1])

train_set = lgb.Dataset(x_train, label=y_train)
valid_set = lgb.Dataset(x_test, label=y_test)
func = lambda trial: objective(trial=trial, train_set=train_set, valid_set=valid_set, num_iterations=num_iterations)

num_iterations = 100
study = optuna.create_study(
    pruner=optuna.pruners.HyperbandPruner(), 
    direction='maximize'
)

study.optimize(func, n_trials=1)

The output looks like

Training until validation scores don't improve for 50 rounds
Did not meet early stopping. Best iteration is:
[98]	train_set's auc: 1	valid_set's auc: 0.993318
98 -1 -1

@jameslamb
Collaborator

If you suspect this is a LightGBM issue, and if you're familiar with optuna's internals, we would really appreciate a minimal, reproducible example that does not involve optuna.

@YingJie-Zhao
Author

Sorry for interrupting your conversation; I think I might have found a reproducible example that works without Optuna.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
import lightgbm as lgb
from lightgbm import early_stopping
import copy


dataset = load_breast_cancer()
x_train, x_test, y_train, y_test = train_test_split(dataset.data, dataset.target, test_size=0.2)
train_set = lgb.Dataset(x_train, label=y_train)
valid_set = lgb.Dataset(x_test, label=y_test)

params = {
    'objective':'binary', 
    'metric': ['auc'],
    'verbosity':-1, 
    'num_iteration': 100
}

model = lgb.train(
    params,
    train_set=train_set, 
    valid_sets=[train_set, valid_set], 
    valid_names=['train_set', 'valid_set'],
    callbacks=[early_stopping(50)]
)

print(model.best_iteration, copy.copy(model).best_iteration, copy.deepcopy(model).best_iteration)

Output

Training until validation scores don't improve for 50 rounds
Did not meet early stopping. Best iteration is:
[59]	train_set's auc: 1	valid_set's auc: 0.995414
59 -1 -1

@YingJie-Zhao
Author

I just checked the source code of __copy__ and __deepcopy__.

def __copy__(self) -> "Booster":
    return self.__deepcopy__(None)

def __deepcopy__(self, _) -> "Booster":
    model_str = self.model_to_string(num_iteration=-1)
    booster = Booster(model_str=model_str)
    return booster

Seems like the model_to_string function doesn't save best_iteration, and the __init__ method resets the booster's best_iteration to -1 by default.

def model_to_string(
    self,
    num_iteration: Optional[int] = None,
    start_iteration: int = 0,
    importance_type: str = 'split'
) -> str:
    """Save Booster to string.

    Parameters
    ----------
    num_iteration : int or None, optional (default=None)
        Index of the iteration that should be saved.
        If None, if the best iteration exists, it is saved; otherwise, all iterations are saved.
        If <= 0, all iterations are saved.
    start_iteration : int, optional (default=0)
        Start index of the iteration that should be saved.
    importance_type : str, optional (default="split")
        What type of feature importance should be saved.
        If "split", result contains numbers of times the feature is used in a model.
        If "gain", result contains total gains of splits which use the feature.

    Returns
    -------
    str_repr : str
        String representation of Booster.
    """
    if num_iteration is None:
        num_iteration = self.best_iteration
    importance_type_int = FEATURE_IMPORTANCE_TYPE_MAPPER[importance_type]
    buffer_len = 1 << 20
    tmp_out_len = ctypes.c_int64(0)
    string_buffer = ctypes.create_string_buffer(buffer_len)
    ptr_string_buffer = ctypes.c_char_p(*[ctypes.addressof(string_buffer)])
    _safe_call(_LIB.LGBM_BoosterSaveModelToString(
        self.handle,
        ctypes.c_int(start_iteration),
        ctypes.c_int(num_iteration),
        ctypes.c_int(importance_type_int),
        ctypes.c_int64(buffer_len),
        ctypes.byref(tmp_out_len),
        ptr_string_buffer))
    actual_len = tmp_out_len.value
    # if buffer length is not long enough, re-allocate a buffer
    if actual_len > buffer_len:
        string_buffer = ctypes.create_string_buffer(actual_len)
        ptr_string_buffer = ctypes.c_char_p(*[ctypes.addressof(string_buffer)])
        _safe_call(_LIB.LGBM_BoosterSaveModelToString(
            self.handle,
            ctypes.c_int(start_iteration),
            ctypes.c_int(num_iteration),
            ctypes.c_int(importance_type_int),
            ctypes.c_int64(actual_len),
            ctypes.byref(tmp_out_len),
            ptr_string_buffer))
    ret = string_buffer.value.decode('utf-8')
    ret += _dump_pandas_categorical(self.pandas_categorical)
    return ret

class Booster:
    """Booster in LightGBM."""

    def __init__(
        self,
        params: Optional[Dict[str, Any]] = None,
        train_set: Optional[Dataset] = None,
        model_file: Optional[Union[str, Path]] = None,
        model_str: Optional[str] = None
    ):
        """Initialize the Booster.

        Parameters
        ----------
        params : dict or None, optional (default=None)
            Parameters for Booster.
        train_set : Dataset or None, optional (default=None)
            Training dataset.
        model_file : str, pathlib.Path or None, optional (default=None)
            Path to the model file.
        model_str : str or None, optional (default=None)
            Model will be loaded from this string.
        """
        self.handle = None
        self.network = False
        self.__need_reload_eval_info = True
        self._train_data_name = "training"
        self.__set_objective_to_none = False
        self.best_iteration = -1

Pardon me if I am wrong.
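
A direct way to check this, without copy, is to round-trip the booster through a string (a minimal sketch, reusing the model trained in the no-Optuna example above):

model_str = model.model_to_string(num_iteration=-1)
clone = lgb.Booster(model_str=model_str)
print(model.best_iteration)  # e.g. 59, set by the early-stopping callback
print(clone.best_iteration)  # -1: the __init__ default, never restored from the string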

@jmoralez
Collaborator

Oh I forgot about the copy. Linking #5539, which is similar.

@YingJie-Zhao
Author

YingJie-Zhao commented Mar 27, 2024

Thank you for your information.
I checked PR #6101, which relates to #5539, but I don't think it will solve this problem, as it only changed the way the params are copied, not the model's own attributes.

Perhaps the model's attributes (e.g. best_iteration) should be copied along with the params?
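
For illustration, here is a hedged sketch of what that idea could look like in __deepcopy__ (just the idea, not an actual patch):

def __deepcopy__(self, _) -> "Booster":
    model_str = self.model_to_string(num_iteration=-1)
    booster = Booster(model_str=model_str)
    # Hypothetical fix: carry over attributes that model_to_string()
    # does not serialize, so the copy keeps the original best_iteration.
    booster.best_iteration = self.best_iteration
    return booster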
