How to transform on new unseen test data? #161

Open
tricha-zemoso opened this issue Jul 21, 2023 · 3 comments

Comments

@tricha-zemoso

I am able to fit the MCA model to training data. I would then like to use the fitted model to find the row coordinates of new, unseen data. The transform function does not work for unseen data: I get a KeyError. Is it possible to "transform" new data using a model fitted to training data, like the sklearn transformers do?
Please help me understand.

@MaxHalford
Owner

Hello. Can you provide a reproducible example?

@tricha-zemoso
Author

tricha-zemoso commented Jul 24, 2023

data.csv
Python Code:

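from prince import MCA
import pandas as pd

# Load the attached data.csv, using the first column as the index
data = pd.read_csv("data.csv", index_col=[0])

mca = MCA(n_components=10, n_iter=3, copy=True, check_input=True, engine='sklearn', random_state=42)

# Fit on the first three rows only
mca = mca.fit(data[:3])
print(mca.eigenvalues_summary)
print(mca.row_coordinates(data[:3]))

# Transforming the remaining, unseen rows raises the KeyError shown below
print(mca.transform(data[3:]))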

Error:
KeyError: "['category_Application Access', '[email protected]', '[email protected]', 'location_San Jose, CA, United States', 'applicationname_B', 'applicationname_C', 'browser_Other'] not in index"

@MaxHalford
Owner

Indeed, MCA does not yet work with rows/columns which have not been seen before. It's on my TODO, but I don't know when I'll tackle it.
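
The KeyError above comes from a mismatch between the one-hot columns built from the training rows and those built from the new rows. Until the library handles this itself, one possible workaround is to one-hot encode both sets yourself and force the new data onto the training columns. The sketch below uses only pandas and invented toy data; the helper name `align_to_training` is hypothetical, and whether a given prince version accepts pre-encoded input this way is an assumption that has not been checked here.

```python
import pandas as pd


def align_to_training(new_oh: pd.DataFrame, training_columns: pd.Index) -> pd.DataFrame:
    """Force one-hot encoded new data onto the columns seen during training.

    Hypothetical helper, not part of prince.
    Columns seen only in training are added as all-zero columns;
    columns seen only in the new data are dropped.
    """
    return new_oh.reindex(columns=training_columns, fill_value=0)


# Toy data invented for illustration (not the data.csv from this issue)
train = pd.DataFrame(
    {"browser": ["Chrome", "Other", "Chrome"], "applicationname": ["A", "B", "C"]}
)
new = pd.DataFrame({"browser": ["Firefox", "Chrome"], "applicationname": ["A", "D"]})

train_oh = pd.get_dummies(train, dtype=int)
new_oh = align_to_training(pd.get_dummies(new, dtype=int), train_oh.columns)

print(new_oh)
```

Note that a row containing unseen categories ("Firefox", "D" here) ends up with fewer ones than there are variables, so any coordinates computed from it should be treated with caution.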
