Weird behaviour in init #221

Open
wdevazelhes opened this issue Jun 14, 2019 · 1 comment
wdevazelhes commented Jun 14, 2019

In the auto init, there is a misleading behaviour that needs to be addressed: if I specify init='lda' with a number of components equal to n_features (and I have n_classes < n_features, for instance), LDA can still be used, and a transformer that does dimensionality reduction will be returned (the number of output components will be equal to n_classes - 1). This happens silently.

Here is an example (on the current master):

from metric_learn import LMNN
from sklearn.datasets import make_classification

# 3 classes in 5 features: LDA can yield at most n_classes - 1 = 2 components
X, y = make_classification(n_samples=100, n_classes=3,
                           n_informative=3, class_sep=4., n_features=5,
                           n_redundant=0)

lmnn = LMNN(init='lda', n_components=5)  # request all 5 components
lmnn.fit(X, y)
print(lmnn.transformer_.shape[0])  # is 2, not 5
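
The silent cap ultimately comes from scikit-learn's LDA, whose transform never returns more than n_classes - 1 components. A minimal sketch on the same data (assuming the init is backed by sklearn.discriminant_analysis):

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# With the default n_components, LDA keeps min(n_classes - 1, n_features)
# discriminant directions, i.e. 2 here, regardless of the 5 input features.
lda = LinearDiscriminantAnalysis()
lda.fit(X, y)
print(lda.transform(X).shape[1])  # 2 == n_classes - 1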

bellet commented Jun 14, 2019

There are different options to fix this:

  • show a warning saying that LDA init enforces n_components=n_classes-1
  • throw an error saying that LDA init is not compatible with the provided value of n_components
  • use LDA to initialize the first n_classes-1 components and set the others to zero

Maybe the latter actually makes more sense? In that case, we may want to change the 'auto' behavior a bit as well.
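
A minimal sketch of what the first two options could look like (the helper name and signature are hypothetical, not metric-learn's actual private API):

import warnings

def _check_lda_n_components(n_components, n_classes, n_features):
    # LDA can produce at most min(n_classes - 1, n_features) components.
    max_components = min(n_classes - 1, n_features)
    if n_components > max_components:
        # option 1: warn and cap the requested dimensionality
        warnings.warn("init='lda' enforces n_components <= {}; the requested "
                      "n_components={} will be reduced."
                      .format(max_components, n_components))
        return max_components
        # option 2: raise instead of warning
        # raise ValueError("init='lda' is not compatible with "
        #                  "n_components={}".format(n_components))
    return n_components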
