
Feature/Tutorial Request: Hyperparameter tuning #257

Open
ParadaCarleton opened this issue Oct 3, 2023 · 5 comments

Comments

@ParadaCarleton

Grad student descent is definitely not fun, so it would be very nice to have a way to tune hyperparameters efficiently, and a tutorial on how to do this. (MLJTuning.jl lets you do it in theory, but only provides a handful of black-box optimizers like random or grid search.)
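For concreteness, this is roughly what the MLJTuning route looks like today (a sketch only: it assumes the EvoTrees MLJ interface and its hyper-parameter names like `eta` and `lambda`, and uses a synthetic placeholder dataset):

```julia
using MLJ

EvoTreeRegressor = @load EvoTreeRegressor pkg=EvoTrees
model = EvoTreeRegressor()

# ranges over two continuous hyper-parameters (log scale)
r_eta    = range(model, :eta,    lower=1e-2, upper=0.3, scale=:log)
r_lambda = range(model, :lambda, lower=1e-4, upper=1.0, scale=:log)

tuned = TunedModel(model=model,
                   tuning=RandomSearch(),   # or Grid(): the built-in black-box options
                   resampling=CV(nfolds=5),
                   range=[r_eta, r_lambda],
                   measure=rmse,
                   n=50)

X, y = make_regression(500, 10)             # synthetic placeholder data
mach = machine(tuned, X, y)
fit!(mach)
report(mach).best_model
```

Random/grid search like this works, but it treats the eval metric as a pure black box, which is the part I'd like to see improved on.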

@jeremiedb
Member

Are there specific hyper tuning methods you'd like to see covered?
With regard to a demonstration with the internal EvoTrees API, I'd tend to recommend a simple random search.
And for more specific tuning techniques, I'd tend to favor developing them in a mostly algorithm-agnostic way. MLJ seems like a good target in that regard. Were you seeing reasons to build more elaborate hyper tuning within a specific algo?
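Something like the following is the kind of random search I'd show with the internal API. Here `train_and_score` is a hypothetical placeholder for "fit with the internal EvoTrees API and return an eval metric"; it is faked with a synthetic function so the snippet runs on its own:

```julia
using Random
Random.seed!(42)

# stand-in for: fit a model with these hyper-params and return a validation metric
train_and_score(p) = (p.eta - 0.1)^2 + 0.3 * (p.lambda - 0.5)^2 + 0.01 * randn()

# log-uniform sample for eta, uniform for lambda
sample_params() = (eta = exp(rand() * (log(0.5) - log(0.01)) + log(0.01)),
                   lambda = rand())

function random_search(objective, sampler; n=100)
    best, best_score = nothing, Inf
    for _ in 1:n
        params = sampler()
        score = objective(params)
        if score < best_score
            best, best_score = params, score
        end
    end
    return best, best_score
end

best_params, best_score = random_search(train_and_score, sample_params)
@show best_params best_score
```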

@ParadaCarleton
Author

> Are there specific hyper tuning methods you'd like to see covered?

Mostly just a gradient method for the continuous parameters. Grid search should be fine for the discrete hyperparameters, given there are only one or two.

@jeremiedb
Member

Could you clarify the nature of the hyper search you're envisioning? I'm not clear how a gradient method could be applied here for hyper-search, since an EvoTree loss function isn't differentiable with respect to its hyper-parameters. Perhaps you're referring to applying a gradient method to eval metric outcomes to inform the next hyper candidate to test?
Other than random search, my understanding is that Bayesian search may be the next most useful approach, but I may well have blind spots in my picture of the hyper-tuning landscape.

@ParadaCarleton
Author

ParadaCarleton commented Oct 17, 2023

Whoops, this is supposed to be in EvoLinear.jl 😅

(Although, I thought the loss was differentiable with respect to lambda? But I might be mixing that up with some other decision tree algorithm.)

@jeremiedb
Member

Even in the context of EvoLinear, I'm not understanding the applicability of a gradient method for hyper-param tuning.
Would you have an example (package/paper) of what you're trying to achieve?
Hyper-param tuning is typically about finding a hyper-param that leads to better generalisation on an out-of-sample dataset. In that context, I have difficulty seeing how feedback from the out-of-sample data could be used to infer an update to the hyper-param. Taking a minimal use case, linear regression with L2 regularization, how would L2 be updated from the out-of-sample metric?
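For concreteness, here is that minimal use case written out as a self-contained sketch (made-up data, nothing tied to EvoLinear): the out-of-sample MSE of a ridge fit viewed as a function of λ. The only way I can picture a "gradient" entering is as a finite difference of that curve, which amounts to a local version of the plain scan:

```julia
using LinearAlgebra, Statistics, Random

Random.seed!(123)
n, p = 200, 10
X = randn(n, p)
y = X * randn(p) .+ 0.5 .* randn(n)
Xtr, ytr = X[1:150, :], y[1:150]
Xva, yva = X[151:end, :], y[151:end]

ridge(X, y, λ) = (X'X + λ * I) \ (X'y)                    # closed-form ridge fit
val_mse(λ) = mean(abs2, yva .- Xva * ridge(Xtr, ytr, λ))  # out-of-sample metric as a function of λ

# what I'd do by default: scan λ and keep the best value
λ_grid = exp10.(range(-3, 2, length=30))
best_λ = λ_grid[argmin(val_mse.(λ_grid))]

# a finite-difference "gradient" of the same curve, should one want to step λ locally
h = 1e-4
grad_at(λ) = (val_mse(λ + h) - val_mse(λ - h)) / 2h
@show best_λ grad_at(best_λ)
```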
