Weird behaviour in init #221

Open
wdevazelhes opened this issue Jun 14, 2019 · 1 comment
wdevazelhes commented Jun 14, 2019

In the auto init, there is a misleading behaviour that needs to be addressed: if I specify init='lda' with a number of components equal to n_features (and I have n_classes < n_features, for instance), LDA can still be used, and a transformer that does dimensionality reduction will be returned (the number of output components will be equal to n_classes - 1). This happens silently.

Here is an example (on the current master):

from metric_learn import LMNN
from sklearn.datasets import make_classification

# 3 classes in 5 features: LDA can yield at most n_classes - 1 = 2 components
X, y = make_classification(n_samples=100, n_classes=3,
                           n_informative=3, class_sep=4., n_features=5,
                           n_redundant=0)

lmnn = LMNN(init='lda', n_components=5)  # request all 5 components
lmnn.fit(X, y)
print(lmnn.transformer_.shape[0])  # is 2, not 5
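
The silent cap ultimately comes from scikit-learn's LDA, whose transform never returns more than n_classes - 1 components. A minimal sketch on the same data (assuming the init is backed by sklearn.discriminant_analysis):

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# With the default n_components, LDA keeps min(n_classes - 1, n_features)
# discriminant directions, i.e. 2 here, regardless of the 5 input features.
lda = LinearDiscriminantAnalysis()
lda.fit(X, y)
print(lda.transform(X).shape[1])  # 2 == n_classes - 1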

bellet commented Jun 14, 2019

There are different options to fix this:

  • show a warning saying that LDA init enforces n_components=n_classes-1
  • throw an error saying that LDA init is not compatible with the provided value of n_components
  • use LDA to initialize the first n_classes-1 components and set the others to zero

Maybe the latter actually makes more sense? In that case, we may want to change the 'auto' behavior a bit as well.
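
A minimal sketch of what the first two options could look like (the helper name and signature are hypothetical, not metric-learn's actual private API):

import warnings

def _check_lda_n_components(n_components, n_classes, n_features):
    # LDA can produce at most min(n_classes - 1, n_features) components.
    max_components = min(n_classes - 1, n_features)
    if n_components > max_components:
        # option 1: warn and cap the requested dimensionality
        warnings.warn("init='lda' enforces n_components <= {}; the requested "
                      "n_components={} will be reduced."
                      .format(max_components, n_components))
        return max_components
        # option 2: raise instead of warning
        # raise ValueError("init='lda' is not compatible with "
        #                  "n_components={}".format(n_components))
    return n_components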
