TabTransformer TypeError caused by MixtureDensityHead input_dim is None #424

Open
nejox opened this issue Mar 26, 2024 · 4 comments
nejox commented Mar 26, 2024

Describe the bug
While experimenting with different settings and passing head="MixtureDensityHead" to the TabTransformerConfig, I ran into:

lib/python3.9/site-packages/torch/nn/modules/linear.py", line 96, in __init__
    self.weight = Parameter(torch.empty((out_features, in_features), **factory_kwargs))
TypeError: empty(): argument 'size' failed to unpack the object at pos 2 with error "type must be tuple of ints, but got NoneType"

This happens because self.pi = nn.Linear(self.hparams.input_dim, self.hparams.num_gaussian) in the _build_network() method of MixtureDensityHead is called while self.hparams.input_dim is still None.

To Reproduce
My TabTransformerConfig:

tab_config = TabTransformerConfig(
        task="regression",
        learning_rate=1e-3,
        seed=42,
        head="MixtureDensityHead",
    )

Versions
Python: 3.9
Pytorch-tabular: 1.1.0

nejox commented Mar 26, 2024

I hacked around a bit and got input_dim working, but it seems MixtureDensityHead is not yet fully supported as a head option for TabTransformer? The loss calculation fails with

lib/python3.9/site-packages/pytorch_tabular/models/base_model.py", line 276, in calculate_loss
    _loss = self.loss(y_hat[:, i], y[:, i])
TypeError: tuple indices must be integers or slices, not tuple

since y_hat is a tuple of (pi, sigma, mu) rather than a tensor.
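For reference, a plain generic loss like MSE can never consume that tuple directly; a mixture-density head needs the Gaussian-mixture negative log-likelihood applied to (pi, sigma, mu). A minimal plain-Python sketch of that loss (illustrative only, not pytorch_tabular's actual implementation):

```python
import math

def mdn_nll(pi, sigma, mu, y):
    """Negative log-likelihood of scalar y under a Gaussian mixture.

    pi, sigma, mu are per-component lists: mixture weights, standard
    deviations, and means. This is the loss an MDN-aware calculate_loss
    would have to apply instead of indexing y_hat like a tensor.
    """
    likelihood = sum(
        p * math.exp(-0.5 * ((y - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
        for p, s, m in zip(pi, sigma, mu)
    )
    return -math.log(likelihood)

# A single standard Gaussian evaluated at its mean: NLL = 0.5 * log(2*pi)
print(mdn_nll([1.0], [1.0], [0.0], 0.0))
```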

manujosephv (Owner) commented

TabTransformer does need some special processing to make it work with MixtureDensityNetworks... but I thought I had done that. Can you post reproducible, self-contained code to replicate the issue (including sample data, etc.)? Maybe something weird is going on. It's odd to see calculate_loss from base.py being called; ideally, mixture networks use a separate calculate_loss.

nejox commented Apr 2, 2024

Yeah, I tried it quite naively, so there's a high chance the error is on me, but here you go.
This code raises the exception described above:

from pytorch_tabular import TabularModel
from pytorch_tabular.config import DataConfig, TrainerConfig, OptimizerConfig, ExperimentConfig
from pytorch_tabular.models import TabTransformerConfig

if __name__ == '__main__':
    from pytorch_tabular.utils.data_utils import make_regression, load_covertype_dataset

    data, cat_col_names, num_col_names, target_name = load_covertype_dataset()
    
    from sklearn.model_selection import train_test_split
    train, test = train_test_split(data, random_state=42, test_size=0.2)
    train, val = train_test_split(train, random_state=42, test_size=0.2)
    print(f"Train Shape: {train.shape} | Val Shape: {val.shape} | Test Shape: {test.shape}")

    data_config = DataConfig(
        target=[
            target_name
        ],
        continuous_cols=num_col_names,
        categorical_cols=cat_col_names,
    )
    
    trainer_config = TrainerConfig(
        max_epochs=2,
        batch_size=1024,
        seed=42,
        progress_bar="tqdm",
        load_best=True,
        precision=16,
    )

    optimizer_config = OptimizerConfig(
        optimizer="Adam",
        lr_scheduler="ReduceLROnPlateau",
        lr_scheduler_params={"patience": 2, "factor": 0.1, "min_lr": 1e-6, "verbose": True},
        lr_scheduler_monitor_metric="valid_loss",
    )

    tab_config = TabTransformerConfig(
        task="classification",
        learning_rate=1e-3,
        seed=42,
        head="MixtureDensityHead",
    )

    tabular_model = TabularModel(
        data_config=data_config,
        model_config=tab_config,
        optimizer_config=optimizer_config,
        trainer_config=trainer_config,
        verbose=True,
    )

    tabular_model.fit(train=train, validation=val)

    results = tabular_model.evaluate(test)

    preds = tabular_model.predict(test)

    from sklearn.metrics import accuracy_score
    print("accuracy:", accuracy_score(test[target_name], preds))

manujosephv (Owner) commented

Okay, so I finally got some time to check this out.

MDN is supposed to be used separately with its own config, not just as a head. I'll add some protection against this usage.
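A hypothetical sketch of what such a guard could look like, failing fast at config time instead of later inside nn.Linear with input_dim=None (class and attribute names here are illustrative, not pytorch_tabular's actual API):

```python
# Illustrative stand-in for a model config object; pytorch_tabular's
# real configs are dataclasses with many more fields.
class DummyModelConfig:
    def __init__(self, head):
        self.head = head

def check_head_config(config):
    """Reject MixtureDensityHead as a plain `head` option."""
    if getattr(config, "head", None) == "MixtureDensityHead":
        raise ValueError(
            "MixtureDensityHead cannot be used as a plain `head` option; "
            "use MDNConfig with backbone_config_class instead."
        )
    return config

check_head_config(DummyModelConfig(head="LinearHead"))  # passes through
try:
    check_head_config(DummyModelConfig(head="MixtureDensityHead"))
except ValueError as err:
    print(f"rejected: {err}")
```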

This is how MDN models are used:

from pytorch_tabular import TabularModel
from pytorch_tabular.config import DataConfig, TrainerConfig, OptimizerConfig, ExperimentConfig
from pytorch_tabular.models import TabTransformerConfig, MDNConfig

if __name__ == '__main__':
    from pytorch_tabular.utils.data_utils import make_mixed_dataset, load_covertype_dataset

    data, cat_col_names, num_col_names = make_mixed_dataset(task="regression", n_samples=1000)
    target_name = "target"
    
    from sklearn.model_selection import train_test_split
    train, test = train_test_split(data, random_state=42, test_size=0.2)
    train, val = train_test_split(train, random_state=42, test_size=0.2)
    print(f"Train Shape: {train.shape} | Val Shape: {val.shape} | Test Shape: {test.shape}")

    data_config = DataConfig(
        target=[
            target_name
        ],
        continuous_cols=num_col_names,
        categorical_cols=cat_col_names,
    )
    
    trainer_config = TrainerConfig(
        max_epochs=2,
        batch_size=1024,
        seed=42,
        progress_bar="tqdm",
        load_best=True,
        precision=16,
    )

    optimizer_config = OptimizerConfig(
        optimizer="Adam",
        lr_scheduler="ReduceLROnPlateau",
        lr_scheduler_params={"patience": 2, "factor": 0.1, "min_lr": 1e-6, "verbose": True},
        lr_scheduler_monitor_metric="valid_loss",
    )

    mdn_config = {"num_gaussian": 3}
    model_config_params = dict(
        task="regression",
        learning_rate=1e-3,
        seed=42,
        head_config=mdn_config,
        backbone_config_class="TabTransformerConfig",
        backbone_config_params={"task": "backbone"},
    )
    

    tab_config = MDNConfig(
        **model_config_params
    )

    tabular_model = TabularModel(
        data_config=data_config,
        model_config=tab_config,
        optimizer_config=optimizer_config,
        trainer_config=trainer_config,
        verbose=True,
    )

    tabular_model.fit(train=train, validation=val)

    results = tabular_model.evaluate(test)

    preds = tabular_model.predict(test)

This should work.
