Problem with h2o and class names #2406

Open
andrewcparnell opened this issue Aug 2, 2018 · 1 comment

andrewcparnell commented Aug 2, 2018

Hi,

I've found a possible bug similar to #1787 (I was asked to start a new issue): h2o learners fail when the class names
contain spaces (and possibly other non-standard characters).

Here's an example based on #1787 that still breaks for me:

(reprex added by @pat-s)

library(mlr)
#> Loading required package: ParamHelpers

set.seed(123)
df <- data.frame(matrix(runif(100, 0, 1), 100, 9))
classx <- sample(paste(letters[1:4],letters[1:4]), 100, replace = TRUE)
df <- cbind(classx, df)

classif.task = makeClassifTask(id = "example", 
                               data = df,
                               target = "classx")

gb.lrn  = makeLearner("classif.h2o.randomForest", 
                      predict.type = "prob")

rdesc = makeResampleDesc("CV", iters = 3, stratify = TRUE)
rin = makeResampleInstance(rdesc, task = classif.task)

r = resample(gb.lrn, classif.task, rin, 
             measures = list(mmce))
#> Resampling: cross-validation
#> Measures:             mmce
#>  Connection successful!
#> 
#> R is connected to the H2O cluster: 
#>     H2O cluster uptime:         1 minutes 1 seconds 
#>     H2O cluster timezone:       Europe/Berlin 
#>     H2O data parsing timezone:  UTC 
#>     H2O cluster version:        3.26.0.2 
#>     H2O cluster version age:    5 months and 4 days !!! 
#>     H2O cluster name:           H2O_started_from_R_patrickschratz_fmz906 
#>     H2O cluster total nodes:    1 
#>     H2O cluster total memory:   4.00 GB 
#>     H2O cluster total cores:    8 
#>     H2O cluster allowed cores:  8 
#>     H2O cluster healthy:        TRUE 
#>     H2O Connection ip:          localhost 
#>     H2O Connection port:        54321 
#>     H2O Connection proxy:       NA 
#>     H2O Internal Security:      FALSE 
#>     H2O API Extensions:         Amazon S3, XGBoost, Algos, AutoML, Core V3, Core V4 
#>     R Version:                  R version 3.6.2 Patched (2019-12-12 r77564)
#> Warning in h2o.clusterInfo(): 
#> Your H2O cluster version is too old (5 months and 4 days)!
#> Please download and install the latest version from http://h2o.ai/download/
#> 
#> Error in checkPredictLearnerOutput(.learner, .model, p): predictLearner for classif.h2o.randomForest has returned not the class levels as column names: a.a,b.b,c.c,d.d

Created on 2019-12-31 by the reprex package (v0.3.0)

The same code runs without error when classif.h2o.randomForest is replaced with e.g. classif.randomForestSRC. The h2o learner also seems to fail with other non-standard characters in the class names, but I haven't done an exhaustive search.
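
For what it's worth, the column names in the error (a.a, b.b, c.c, d.d) are exactly what base R's make.names() produces from the original labels. Whether h2o or the mlr wrapper actually calls make.names() internally is an assumption here, not something confirmed in this thread. A minimal sketch of the mismatch, plus a possible workaround (untested against h2o) that sanitizes the labels before the task is built:

```r
# Class labels as built in the reprex above; each contains a space.
labels <- paste(letters[1:4], letters[1:4])

# make.names() converts them to syntactically valid R names,
# replacing the space with a dot.
sanitized <- make.names(labels)
print(sanitized)

# The sanitized names no longer match the original factor levels,
# which is the kind of mismatch the error message reports.
print(identical(sanitized, labels))

# Possible workaround (assumption, untested with h2o): sanitize the target
# column up front so the task levels are already valid R names.
set.seed(123)
classx <- factor(make.names(sample(labels, 100, replace = TRUE)))
print(all(levels(classx) %in% sanitized))
```

If the guess is right, a task built from the sanitized classx should have levels that match the column names h2o returns.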

Many thanks for the wonderful package.

Andrew

andrewcparnell changed the title from "Problem with c" to "Problem with h2o and class names" on Aug 2, 2018

krltrl commented Apr 2, 2019

I just wanted to add that this also happens for any kind of multilabel classification with h2o, even when the labels contain no spaces or special characters. This means h2o cannot be used for multilabel classification at all.

Example:

library(mlr)

# Wrap an h2o learner for multilabel classification via binary relevance
lrn.br = makeLearner("classif.h2o.deeplearning", predict.type = "prob")
lrn.br = makeMultilabelBinaryRelevanceWrapper(lrn.br)
mod = train(lrn.br, yeast.task)
pred = predict(mod, yeast.task)

Error in checkPredictLearnerOutput(.learner, .model, p) :
predictLearner for classif.h2o.deeplearning has returned not the class levels as column names: FALSE.,TRUE.
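
The trailing dots in FALSE. and TRUE. are consistent with base R's make.names(), which appends a dot to names that match reserved words. That this is what happens inside the h2o learner is a guess, not something verified here. A small sketch, where looks_mangled() is a hypothetical helper introduced only for illustration:

```r
# Binary relevance trains one TRUE/FALSE classifier per label, so the class
# levels of each sub-problem are the strings "FALSE" and "TRUE".
binary_levels <- c("FALSE", "TRUE")

# make.names() appends a dot to reserved words such as TRUE and FALSE.
mangled <- make.names(binary_levels)
print(mangled)

# Hypothetical helper: checks whether the returned column names equal the
# make.names() version of the expected class levels.
looks_mangled <- function(expected_levels, returned_cols) {
  identical(make.names(expected_levels), returned_cols)
}

# Column names taken from the error message above.
print(looks_mangled(binary_levels, c("FALSE.", "TRUE.")))
```

If that is the cause, it would explain why the multilabel case fails even though the label names themselves contain no special characters.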

pat-s added the type-bug label on Dec 31, 2019