
Multi-label Classification task #2

Open · abdullahshafin opened this issue Mar 21, 2019 · 17 comments
Labels: enhancement (New feature or request), question (Further information is requested)

Comments

@abdullahshafin

I just wanted to know if this can be applied to a multi-label classification problem with a sigmoid output activation. That is, multiple labels can be 1 at the same time, so the sum of the probabilities is not necessarily equal to 1 (as it is with softmax).

I came to your repo from this issue. Please let me know which loss function I can use in this scenario. I looked at the code but wasn't entirely sure that the binary_focal_loss function is suitable for this problem; it looked as if it's only for binary classification, not for a multi-label classification task.

abdullahshafin changed the title from "Multi-label Classification task" to "Multi-label Classification task label:question" on Mar 21, 2019
umbertogriffo added the question (Further information is requested) label on Mar 22, 2019
@umbertogriffo (Owner)

Hi @abdullahshafin, you can't apply this to a multi-label classification problem with a sigmoid output activation. Only the softmax case is supported.
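
For reference, a minimal usage sketch of the supported multi-class setup (the model shape here is hypothetical; the categorical_focal_loss call form is the one used later in this thread, and the import path may differ from the repo's actual layout):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Assumed import; adjust to wherever this repo's losses module lives.
from losses import categorical_focal_loss

# Hypothetical multi-class model: softmax output, exactly one class per example.
model = Sequential([
    Dense(64, activation="relu", input_shape=(20,)),
    Dense(3, activation="softmax"),
])
model.compile(
    optimizer="adam",
    loss=[categorical_focal_loss(alpha=.25, gamma=2)],
    metrics=["accuracy"],
)
```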

abdullahshafin changed the title from "Multi-label Classification task label:question" back to "Multi-label Classification task" on Mar 22, 2019
@abdullahshafin (Author)

Hi @umbertogriffo

Thanks for the reply!

Do you mean using softmax for multi-label classification (like the Facebook paper)? It's still a bit unclear; normally, softmax is not used for multi-label classification. Can you explain what inputs you expect for your two functions, binary_focal_loss and categorical_focal_loss? Do you expect only 2 classes (binary), or do they work for more than 2 classes?

From my understanding, when talking about multiple target classes, Keras uses binary_crossentropy (keras.losses.binary_crossentropy) for multi-label classification tasks, where the output activation should then be sigmoid, and categorical_crossentropy (keras.losses.categorical_crossentropy) for multi-class classification tasks with a softmax output activation.

Just to be sure we are both on the same page, I will explain below what I mean by the multi-label and multi-class classification terminologies.

In multi-label classification with 3 classes and 5 examples, the target matrix (one row per example) would look like:

```
[0 0 1
 1 0 1
 0 0 0
 0 1 1
 0 1 0]
```

The target matrix for multi-class classification with a similar configuration would look like:

```
[0 0 1
 1 0 0
 0 1 0
 0 1 0
 1 0 0]
```
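
To make the distinction concrete, here is a minimal sketch of the two setups (layer sizes and input shapes are made up for illustration):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Multi-label: sigmoid output + binary_crossentropy. Each of the 3 outputs
# is an independent 0/1 decision, so several can be 1 at once.
multi_label = Sequential([
    Dense(32, activation="relu", input_shape=(10,)),
    Dense(3, activation="sigmoid"),
])
multi_label.compile(optimizer="adam", loss="binary_crossentropy")

# Multi-class: softmax output + categorical_crossentropy. The 3 outputs
# sum to 1 and exactly one class is correct per example.
multi_class = Sequential([
    Dense(32, activation="relu", input_shape=(10,)),
    Dense(3, activation="softmax"),
])
multi_class.compile(optimizer="adam", loss="categorical_crossentropy")
```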

@umbertogriffo (Owner)

@abdullahshafin you're absolutely right.
I meant that you can apply this to a multi-class classification problem with a softmax output activation. Multi-label classification isn't supported yet.

@abdullahshafin (Author)

@umbertogriffo thanks a lot for your reply and for the clarification. I tried to use your code with a few modifications for multi-label classification. After looking at the code in detail, I strongly believe it should work. However, since I am not getting good results on my classification task, I cannot verify that yet. My task is already quite difficult to learn, and I haven't had success with a weighted binary CE loss either.

Once I try it on some other task and can verify whether it works for multi-label classification, I will update here.

@umbertogriffo (Owner)

@abdullahshafin thanks for opening this issue. It would be great if you could help me adapt the code for multi-label classification. I'll try to find a task that you can use for your experiments.

umbertogriffo added the enhancement (New feature or request) label on Mar 26, 2019
@abdullahshafin (Author)

abdullahshafin commented Apr 3, 2019

@umbertogriffo Sorry, I've been busy verifying whether my approach was right. As of now, it seems my loss function is not correct. Once I have a correct focal loss implementation for multi-label classification, I will definitely share it.
For now, I am approaching the problem with other methods: 1) a weighted binary CE loss, 2) under-/over-sampling the dataset.
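
For what it's worth, the weighted binary CE I mean is along these lines (a sketch; the pos_weight value is a hypothetical hyperparameter, not something from this repo):

```python
from tensorflow.keras import backend as K

def weighted_binary_crossentropy(pos_weight):
    # Elementwise BCE with the positive term scaled by pos_weight,
    # to counter label imbalance in multi-label targets.
    def loss(y_true, y_pred):
        y_pred = K.clip(y_pred, K.epsilon(), 1. - K.epsilon())
        bce = -(pos_weight * y_true * K.log(y_pred)
                + (1. - y_true) * K.log(1. - y_pred))
        return K.mean(bce, axis=-1)  # average over the labels
    return loss

# e.g. model.compile(optimizer="adam", loss=weighted_binary_crossentropy(5.0))
```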

@umbertogriffo (Owner)

umbertogriffo commented Apr 4, 2019

@abdullahshafin don't worry, let me know if I can help you somehow.

@oleksandrlazariev

@abdullahshafin you could just remove K.sum from the final return statement. That should work for a multi-label classification task.
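
Roughly, the idea is to keep the loss elementwise instead of collapsing it with a final K.sum, so each label contributes its own binary focal term. A sketch of what that could look like (this follows the standard binary focal-loss form, not the repo's code verbatim):

```python
from tensorflow.keras import backend as K

def multilabel_focal_loss(alpha=0.25, gamma=2.0):
    # Elementwise binary focal loss for sigmoid outputs, where several
    # labels can be 1 at the same time.
    def loss(y_true, y_pred):
        y_pred = K.clip(y_pred, K.epsilon(), 1. - K.epsilon())
        pos = -alpha * K.pow(1. - y_pred, gamma) * y_true * K.log(y_pred)
        neg = (-(1. - alpha) * K.pow(y_pred, gamma)
               * (1. - y_true) * K.log(1. - y_pred))
        # No final K.sum over everything: return the per-label loss and
        # let Keras handle the reduction.
        return K.mean(pos + neg, axis=-1)
    return loss
```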

@xingyi-li

@oleksandrlazariev Hi, I'm interested in your statement, but I don't clearly understand what you mean. Would you explain your idea in detail? Thanks a lot!

@bryanmooremd

@umbertogriffo My understanding is that with alpha = 1 and gamma = 0, the focal loss should produce results identical to cross-entropy. However, when I compile with loss=[categorical_focal_loss(alpha=.25, gamma=2)] vs. loss=sparse_categorical_crossentropy, I get very different results. Have you directly compared the two, and can you comment? I have 0/1 labels that are not one-hot encoded.
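
For reference, the gamma = 0, alpha = 1 reduction is easy to check numerically (a quick NumPy sketch with made-up probabilities). Note also that categorical_focal_loss expects one-hot targets while sparse_categorical_crossentropy expects integer class ids, which on its own makes the two compile calls above an apples-to-oranges comparison:

```python
import numpy as np

# With gamma = 0 and alpha = 1, the focal term vanishes:
# FL(p_t) = -alpha * (1 - p_t)**gamma * log(p_t)  ->  -log(p_t) = CE
y_true = np.array([[0., 1., 0.]])     # one-hot target
y_pred = np.array([[0.2, 0.7, 0.1]])  # made-up softmax output

eps = 1e-7
p = np.clip(y_pred, eps, 1. - eps)
ce = -np.sum(y_true * np.log(p), axis=-1)

alpha, gamma = 1.0, 0.0
fl = np.sum(alpha * (1. - p) ** gamma * (-y_true * np.log(p)), axis=-1)

print(ce, fl)  # both ~0.3567: identical when gamma = 0 and alpha = 1
```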

@jizhang02

Hello, for the multi-class loss function, do we need to do one-hot encoding?

@talhaanwarch

I am just checking whether focal loss for multi-label classification has been implemented or not.

@Sandeep418

> I tried to use your code with a few modifications for multi-label classification. [...] Once I try it on some other task and can verify whether it works for multi-label classification, I will update here.

Hi @abdullahshafin, have you succeeded in changing it to multi-label classification?

@gnai

gnai commented Jun 23, 2020

> Hello, for the multi-class loss function, do we need to do one-hot encoding?

As far as I know, yes. Check this link, it might be useful:
https://www.depends-on-the-definition.com/guide-to-multi-label-classification-with-neural-networks/
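
In short, the one-hot encoding can be done with Keras' built-in helper (a minimal sketch matching the 3-class example earlier in this thread):

```python
import numpy as np
from tensorflow.keras.utils import to_categorical

labels = np.array([2, 0, 1, 1, 0])  # integer class ids
one_hot = to_categorical(labels, num_classes=3)
# [[0. 0. 1.]
#  [1. 0. 0.]
#  [0. 1. 0.]
#  [0. 1. 0.]
#  [1. 0. 0.]]
```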

@umbertogriffo (Owner)

> My understanding is that with alpha = 1 and gamma = 0, the focal loss should produce results identical to cross-entropy. However, when I compile with loss=[categorical_focal_loss(alpha=.25, gamma=2)] vs. loss=sparse_categorical_crossentropy, I get very different results.

There was a bug; it has been fixed.

@thusinh1969

Try this: https://www.programmersought.com/article/60001511310/
It covers binary, multi-class, and multi-label cases, and it seems to work for me.

Steve

@longsc2603

Hi, I'm sorry to bump this old thread, but I came across this repo of yours and wonder if I can apply it to my case. I have a BiLSTM + CRF model whose output shape is (None, sequence_length, num_class). Since it is a CRF-extended model (I'm using keras-contrib for the CRF layer, by the way), the output is one-hot encoded, so I can't really use class_weight in model.fit. Is there any way I can use this loss in my case?
