Feature Idea: Weights #160

Open
cdbcheng opened this issue Jul 16, 2023 · 3 comments

Comments

@cdbcheng

Have you considered adding support for sampling weights to the package? This would help when dealing with weighted survey samples, especially with MCA (since most surveys consist of multiple-choice questions).

Thanks!

@MaxHalford
Owner

Hey there. Yes, this is very relevant. It's a reasonably big undertaking though. It took me time to figure out and test the current non-weighted implementations. But I'm sure it's doable. One would have to start with PCA, then CA, then MCA.

@cdbcheng
Author

The following might (or might not) be helpful to start:

Mathematical notation and some examples of how to implement weighted and eigenvalue PCA, relying only on numpy and scikit-learn. There is nothing for CA and MCA, though. I also believe (but don't quote me on this) that WPCA can be done without repeating rows, since a weighted variance can be calculated without repeating rows.
https://github.com/nogilnick/WeightedPCA
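
To illustrate the "no repeated rows" point, here is a minimal sketch using only numpy. It is not taken from the linked repository or from this package: the function name `weighted_pca` and its signature are hypothetical, and it assumes a plain covariance-based PCA (no standardisation) with the population form of the weighted covariance, i.e. dividing by the sum of the weights.

```python
import numpy as np


def weighted_pca(X, w, n_components=2):
    """Hypothetical helper: PCA with per-row weights, no row duplication."""
    X = np.asarray(X, dtype=float)
    w = np.asarray(w, dtype=float)
    w = w / w.sum()                      # normalise weights to sum to 1
    mean = w @ X                         # weighted column means
    Xc = X - mean                        # centre on the weighted mean
    cov = (Xc * w[:, None]).T @ Xc       # weighted covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]  # largest first
    return eigvals[order], eigvecs[:, order], mean


# Toy data with integer weights, so duplication is possible for comparison.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
w = np.array([1, 2, 3, 1, 2, 1])

# Weighted version on the 6 original rows.
vals_w, vecs_w, mean_w = weighted_pca(X, w)

# Unweighted version on the data with each row repeated w[i] times.
X_rep = np.repeat(X, w, axis=0)
vals_r, vecs_r, mean_r = weighted_pca(X_rep, np.ones(len(X_rep)))
```

With integer weights, the two runs should agree on the means, the eigenvalues, and the components (up to sign), which suggests the weights can be folded into the mean and covariance directly. The same idea extends to non-integer survey weights, where duplication is not even an option.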

Let me know if there's anything else that could help!

@MaxHalford
Owner

For sure, I believe we want an implementation that does not require duplicating rows. That would be neither practical nor elegant.
