[BUG] TransformerPipeline does not allow fitting #6417

Open
helloplayer1 opened this issue May 13, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@helloplayer1
Contributor

Describe the bug

Fitting a pipeline that consists of a KalmanFilterTransformerFP, a TSInterpolator, and an FCNRegressor raises an error when X is panel data and y is a 1D NumPy array.
To Reproduce

import numpy as np
import pandas as pd
from sktime.pipeline import make_pipeline
from sktime.transformations.series.kalman_filter import KalmanFilterTransformerFP
from sktime.transformations.compose import FitInTransform
from datetime import datetime
from sktime.transformations.panel.interpolate import TSInterpolator
from sklearn.model_selection import train_test_split
from sktime.regression.deep_learning import FCNRegressor

# Define the multi-index
index = pd.MultiIndex.from_tuples([
    (0, datetime.strptime('2024-04-20 18:22:14.877500', '%Y-%m-%d %H:%M:%S.%f')),
    (0, datetime.strptime('2024-04-20 18:22:14.903000', '%Y-%m-%d %H:%M:%S.%f')),
    (1, datetime.strptime('2024-04-20 18:24:42.453400', '%Y-%m-%d %H:%M:%S.%f')),
    (1, datetime.strptime('2024-04-20 18:24:42.478800', '%Y-%m-%d %H:%M:%S.%f'))
], names=['instance', 'Time'])

x_data = pd.DataFrame({
    'LeftControllerVelocity_0': [-0.01, -0.01, 0.06, 0.06]
}, index=index)
y_data = np.array([1, 0.5])

# Split the data into training and testing data by instance
instances = x_data.index.get_level_values('instance').unique()
train_indices, test_indices = train_test_split(instances, test_size=0.3)

x_train = x_data.loc[train_indices]
y_train = y_data[train_indices]
y_test = y_data[test_indices]
x_test = x_data.loc[test_indices]

noise_filter = FitInTransform(KalmanFilterTransformerFP(1, denoising=True))
interpolator = TSInterpolator(4000)
regressor = FCNRegressor(verbose=True, n_epochs=80000)

model = make_pipeline(noise_filter, interpolator, regressor)

model.fit(x_train, y_train)
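
Before fitting, it can help to rule out a simple mismatch between the panel and the targets. The following is a minimal sketch, assuming the intended layout is one y value per entry of the `instance` index level. `check_panel_alignment` is a hypothetical helper written for this illustration and is not part of sktime.

```python
import numpy as np
import pandas as pd


def check_panel_alignment(X, y, instance_level="instance"):
    """Hypothetical helper: return the instance count if X and y line up."""
    # Count distinct instances in the given level of the MultiIndex.
    n_instances = X.index.get_level_values(instance_level).nunique()
    if n_instances != len(y):
        raise ValueError(
            f"X has {n_instances} instances but y has {len(y)} entries"
        )
    return n_instances


# Same data layout as in the reproduction script.
index = pd.MultiIndex.from_tuples(
    [
        (0, pd.Timestamp("2024-04-20 18:22:14.877500")),
        (0, pd.Timestamp("2024-04-20 18:22:14.903000")),
        (1, pd.Timestamp("2024-04-20 18:24:42.453400")),
        (1, pd.Timestamp("2024-04-20 18:24:42.478800")),
    ],
    names=["instance", "Time"],
)
x_data = pd.DataFrame(
    {"LeftControllerVelocity_0": [-0.01, -0.01, 0.06, 0.06]}, index=index
)
y_data = np.array([1, 0.5])

n = check_panel_alignment(x_data, y_data)
```

For the data in this report the check passes with two instances and two targets, so a length mismatch between X and y is not an obvious cause of the error.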

Expected behavior
Model is fitted.

Additional context
If you chain these estimators manually instead, it works, but only if y_data is not passed when fitting the transformers:

x_train = interpolator.fit_transform(noise_filter.fit_transform(x_train))
x_test = interpolator.transform(noise_filter.transform(x_test))
model = regressor
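
The manual chaining pattern can be sketched as follows: each transformer is fitted and applied without y, and only the final estimator receives y. This is an illustration under stated assumptions, not sktime's implementation. `AddOne`, `MeanRegressor`, `fit_chain` and `predict_chain` are hypothetical stand-ins so that the sketch runs without sktime or TensorFlow.

```python
class AddOne:
    """Hypothetical stand-in transformer: adds 1 to every value."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return [x + 1 for x in X]

    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)


class MeanRegressor:
    """Hypothetical stand-in regressor: always predicts the mean of y."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def fit_chain(transformers, estimator, X, y):
    """Fit each transformer without y, then fit the final estimator with y."""
    for t in transformers:
        X = t.fit_transform(X)
    estimator.fit(X, y)
    return estimator


def predict_chain(transformers, estimator, X):
    """Apply the already fitted transformers, then predict."""
    for t in transformers:
        X = t.transform(X)
    return estimator.predict(X)


steps = [AddOne(), AddOne()]
model = fit_chain(steps, MeanRegressor(), [0.0, 1.0], [1.0, 0.5])
preds = predict_chain(steps, model, [5.0])
```

With the real estimators, the equivalent last step would be fitting the regressor on the transformed x_train together with y_train, which the snippet above leaves implicit.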

Versions

System:
python: 3.11.0rc1 (main, Aug 12 2022, 10:02:14) [GCC 11.2.0]
executable: /usr/bin/python
machine: Linux-5.15.146.1-microsoft-standard-WSL2-x86_64-with-glibc2.35

Python dependencies:
pip: 24.0
sktime: 0.29.0
sklearn: 1.4.2
skbase: 0.7.8
numpy: 1.26.4
scipy: 1.13.0
pandas: 2.2.2
matplotlib: 3.8.4
joblib: 1.4.2
numba: 0.59.1
statsmodels: 0.14.2
pmdarima: None
statsforecast: None
tsfresh: 0.20.2
tslearn: None
torch: None
tensorflow: 2.16.1
tensorflow_probability: None

helloplayer1 added the bug label May 13, 2024