Motivation
This technical note was inspired by the following question from a Data Science user:
I’ve trained a GLM and an RF using weights, but when I try to predict on test data I get this error:
> h2o.predict(glm_model, test_df[1:10,])
Error in .h2o.doSafeREST(conn = conn, h2oRestApiVersion = h2oRestApiVersion, :
Test dataset is missing weights vector 'weights' (needed because a response was found and metrics are to be computed).
Why does H2O need weights when it generates predictions?
Do the predictions depend on the weights?
Discussion
If a user goes to the effort of creating weights for training, they most likely want to use weights for validation as well (often on a holdout set). H2O therefore aborts to prevent a user mistake: when the frame being scored contains the response column, metrics are computed, and metrics need to know the row weights. Only a pure test set without a response is accepted without weights, since no metrics are computed in that case.
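To illustrate why metrics need the row weights, here is a minimal base R sketch comparing an unweighted and a weighted mean squared error on the same predictions. The vectors `actual`, `predicted`, and `w` are made-up values, and the weighted formula is the generic textbook form, assumed here for illustration rather than taken from H2O's internals.

```r
# Made-up values for illustration only.
actual    <- c(1, 2, 3, 4)
predicted <- c(1.5, 2, 2, 4)
w         <- c(1, 1, 4, 1)   # hypothetical row weights

# Per-row squared errors do not depend on the weights.
sq_err <- (actual - predicted)^2

# Unweighted MSE treats every row equally.
mse_unweighted <- mean(sq_err)

# Weighted MSE (generic form): rows with larger weights count more.
mse_weighted <- sum(w * sq_err) / sum(w)

print(mse_unweighted)
print(mse_weighted)
```

The per-row errors are identical in both cases; only the aggregate metric changes, which is why a frame that carries a response also needs the weights column.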
It's easy to add a trivial weights column (all ones) to a validation frame. Here is an example of how to do that in R, assuming the weights column was named "weights" when the model was built:
validation_frame$weights <- 1
Alternatively, remove the response column from the data set to be predicted. Metrics will then not be computed, and the error will not occur.
Note that the name of the weights column is specified by the user when the model is built, and can be any name, not just "weights".
Example
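The following is a minimal sketch of both workarounds, assuming a running local H2O cluster and the `h2o` R package. The toy data and the column names `x1`, `x2`, `y`, and `w` are hypothetical placeholders, and whether the error appears at all may depend on the H2O version in use.

```r
library(h2o)
h2o.init()

# Hypothetical toy data; all column names are placeholders.
set.seed(42)
n <- 200
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
df$y <- 2 * df$x1 - df$x2 + rnorm(n)
df$w <- runif(n, min = 0.5, max = 2)

# The training frame carries the weights; the test frame does not.
train_hex <- as.h2o(df[1:150, ])
test_hex  <- as.h2o(df[151:200, c("x1", "x2", "y")])

# Train a GLM with row weights taken from column "w".
glm_model <- h2o.glm(x = c("x1", "x2"), y = "y",
                     training_frame = train_hex,
                     weights_column = "w")

# Workaround 1: drop the response so that no metrics are computed.
test_no_response <- test_hex[, c("x1", "x2")]
pred_no_response <- h2o.predict(glm_model, test_no_response)

# Workaround 2: keep the response and add a trivial weights column,
# using the same name that was passed as weights_column.
test_unit_weights <- test_hex
test_unit_weights$w <- 1
pred_unit_weights <- h2o.predict(glm_model, test_unit_weights)
```

Both calls should return one prediction per test row. The first avoids metric computation entirely, while the second scores every row with equal weight.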
JIRA Issue Migration Info
Jira Issue: TN-1
Assignee: TomK
Reporter: TomK
State: Resolved
Relates to: #14943