TabTransformer TypeError caused by MixtureDensityHead input_dim is None #424

Open
nejox opened this issue Mar 26, 2024 · 4 comments
nejox commented Mar 26, 2024

Describe the bug
While experimenting with different settings and passing head="MixtureDensityHead" to the TabTransformerConfig, I ran into:

lib/python3.9/site-packages/torch/nn/modules/linear.py", line 96, in __init__
    self.weight = Parameter(torch.empty((out_features, in_features), **factory_kwargs))
TypeError: empty(): argument 'size' failed to unpack the object at pos 2 with error "type must be tuple of ints, but got NoneType"

This happens because self.pi = nn.Linear(self.hparams.input_dim, self.hparams.num_gaussian) in the _build_network() method of MixtureDensityHead is called while self.hparams.input_dim is still None.

To Reproduce
My TabTransformerConfig:

tab_config = TabTransformerConfig(
        task="regression",
        learning_rate=1e-3,
        seed=42,
        head="MixtureDensityHead",
    )

Versions
Python: 3.9
Pytorch-tabular: 1.1.0

nejox commented Mar 26, 2024

I hacked around a bit and got input_dim working, but it seems MixtureDensityHead is not yet fully supported as a head option for TabTransformer? The loss calculation fails with

lib/python3.9/site-packages/pytorch_tabular/models/base_model.py", line 276, in calculate_loss
    _loss = self.loss(y_hat[:, i], y[:, i])
TypeError: tuple indices must be integers or slices, not tuple

since y_hat is a tuple of (pi, sigma, mu) rather than a tensor.
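For reference, a plain generic loss like MSE can never consume that tuple directly; a mixture-density head needs the Gaussian-mixture negative log-likelihood applied to (pi, sigma, mu). A minimal plain-Python sketch of that loss (illustrative only, not pytorch_tabular's actual implementation):

```python
import math

def mdn_nll(pi, sigma, mu, y):
    """Negative log-likelihood of scalar y under a Gaussian mixture.

    pi, sigma, mu are per-component lists: mixture weights, standard
    deviations, and means. This is the loss an MDN-aware calculate_loss
    would have to apply instead of indexing y_hat like a tensor.
    """
    likelihood = sum(
        p * math.exp(-0.5 * ((y - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
        for p, s, m in zip(pi, sigma, mu)
    )
    return -math.log(likelihood)

# A single standard Gaussian evaluated at its mean: NLL = 0.5 * log(2*pi)
print(mdn_nll([1.0], [1.0], [0.0], 0.0))
```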

manujosephv (Owner) commented

TabTransformer does need some special processing to make it work with MixtureDensityNetworks... but I thought I had done that. Can you post reproducible, self-contained code to replicate the issue (including sample data, etc.)? Maybe something weird is going on. It's odd to see calculate_loss from base.py being called; ideally, mixture networks use a separate calculate_loss.

nejox commented Apr 2, 2024

Yeah, I tried it quite naively, so there's a high chance the error is on me, but here you go.
This code raises the exception described above:

from pytorch_tabular import TabularModel
from pytorch_tabular.config import DataConfig, TrainerConfig, OptimizerConfig, ExperimentConfig
from pytorch_tabular.models import TabTransformerConfig

if __name__ == '__main__':
    from pytorch_tabular.utils.data_utils import make_regression, load_covertype_dataset

    data, cat_col_names, num_col_names, target_name = load_covertype_dataset()
    
    from sklearn.model_selection import train_test_split
    train, test = train_test_split(data, random_state=42, test_size=0.2)
    train, val = train_test_split(train, random_state=42, test_size=0.2)
    print(f"Train Shape: {train.shape} | Val Shape: {val.shape} | Test Shape: {test.shape}")

    data_config = DataConfig(
        target=[
            target_name
        ],
        continuous_cols=num_col_names,
        categorical_cols=cat_col_names,
    )
    
    trainer_config = TrainerConfig(
        max_epochs=2,
        batch_size=1024,
        seed=42,
        progress_bar="tqdm",
        load_best=True,
        precision=16,
    )

    optimizer_config = OptimizerConfig(
        optimizer="Adam",
        lr_scheduler="ReduceLROnPlateau",
        lr_scheduler_params={"patience": 2, "factor": 0.1, "min_lr": 1e-6, "verbose": True},
        lr_scheduler_monitor_metric="valid_loss",
    )

    tab_config = TabTransformerConfig(
        task="classification",
        learning_rate=1e-3,
        seed=42,
        head="MixtureDensityHead",
    )

    tabular_model = TabularModel(
        data_config=data_config,
        model_config=tab_config,
        optimizer_config=optimizer_config,
        trainer_config=trainer_config,
        verbose=True,
    )

    tabular_model.fit(train=train, validation=val)

    results = tabular_model.evaluate(test)

    preds = tabular_model.predict(test)

    from sklearn.metrics import accuracy_score
    print("accuracy:", accuracy_score(test[target_name], preds))

manujosephv (Owner) commented

Okay, so I finally got some time to check this out.

MDN is supposed to be used separately with its own config, not just as a head. I'll add some protection against this usage.
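A hypothetical sketch of what such a guard could look like, failing fast at config time instead of later inside nn.Linear with input_dim=None (class and attribute names here are illustrative, not pytorch_tabular's actual API):

```python
# Illustrative stand-in for a model config object; pytorch_tabular's
# real configs are dataclasses with many more fields.
class DummyModelConfig:
    def __init__(self, head):
        self.head = head

def check_head_config(config):
    """Reject MixtureDensityHead as a plain `head` option."""
    if getattr(config, "head", None) == "MixtureDensityHead":
        raise ValueError(
            "MixtureDensityHead cannot be used as a plain `head` option; "
            "use MDNConfig with backbone_config_class instead."
        )
    return config

check_head_config(DummyModelConfig(head="LinearHead"))  # passes through
try:
    check_head_config(DummyModelConfig(head="MixtureDensityHead"))
except ValueError as err:
    print(f"rejected: {err}")
```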

This is how MDN models are used:

from pytorch_tabular import TabularModel
from pytorch_tabular.config import DataConfig, TrainerConfig, OptimizerConfig, ExperimentConfig
from pytorch_tabular.models import TabTransformerConfig, MDNConfig

if __name__ == '__main__':
    from pytorch_tabular.utils.data_utils import make_mixed_dataset, load_covertype_dataset

    data, cat_col_names, num_col_names = make_mixed_dataset(task="regression", n_samples=1000)
    target_name = "target"
    
    from sklearn.model_selection import train_test_split
    train, test = train_test_split(data, random_state=42, test_size=0.2)
    train, val = train_test_split(train, random_state=42, test_size=0.2)
    print(f"Train Shape: {train.shape} | Val Shape: {val.shape} | Test Shape: {test.shape}")

    data_config = DataConfig(
        target=[
            target_name
        ],
        continuous_cols=num_col_names,
        categorical_cols=cat_col_names,
    )
    
    trainer_config = TrainerConfig(
        max_epochs=2,
        batch_size=1024,
        seed=42,
        progress_bar="tqdm",
        load_best=True,
        precision=16,
    )

    optimizer_config = OptimizerConfig(
        optimizer="Adam",
        lr_scheduler="ReduceLROnPlateau",
        lr_scheduler_params={"patience": 2, "factor": 0.1, "min_lr": 1e-6, "verbose": True},
        lr_scheduler_monitor_metric="valid_loss",
    )

    mdn_config = {"num_gaussian": 3}
    model_config_params = dict(
        task="regression",
        learning_rate=1e-3,
        seed=42,
        head_config=mdn_config,
        backbone_config_class="TabTransformerConfig",
        backbone_config_params={"task": "backbone"},
    )
    

    tab_config = MDNConfig(
        **model_config_params
    )

    tabular_model = TabularModel(
        data_config=data_config,
        model_config=tab_config,
        optimizer_config=optimizer_config,
        trainer_config=trainer_config,
        verbose=True,
    )

    tabular_model.fit(train=train, validation=val)

    results = tabular_model.evaluate(test)

    preds = tabular_model.predict(test)

This should work.
