It doesn't work for detecting 1600 objects with the function "torch.clamp(classification, min=1e-4, max=1.0 - 1e-4)" in focal_loss #131

Open
y78h11b09 opened this issue Mar 17, 2020 · 2 comments

y78h11b09 commented Mar 17, 2020

Recently, I found that EfficientDet-d0 did not work for detecting 1600 objects: cls_loss still did not decrease during training.

So I modified the function "torch.clamp(classification, min=1e-4, max=1.0 - 1e-4)"
to "torch.clamp(classification, min=1e-8, max=1.0 - 1e-8)" in focal_loss,
and then EfficientDet-d0 could be trained on 1600 objects.

Can anyone tell me what advantage the torch.clamp() call in focal_loss provides? I think it should be removed completely!
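
For context, a minimal sketch (hypothetical values, not the repo's exact code) of how the clamp interacts with the loss and its gradient. With min=1e-4, the per-anchor BCE term is capped, and torch.clamp additionally back-propagates a zero gradient wherever the input lies outside [min, max], which may explain why cls_loss plateaus with many anchors:

```python
import torch

# Hypothetical post-sigmoid probabilities for three anchors: one very
# confidently wrong, one uncertain, one very confidently right.
p = torch.tensor([1e-9, 0.5, 1.0 - 1e-9], requires_grad=True)

# The clamp under discussion: with min=1e-4, the BCE term -log(p) is
# capped at -log(1e-4) ≈ 9.21 for any anchor, however wrong it is.
clamped = torch.clamp(p, min=1e-4, max=1.0 - 1e-4)
loss = -torch.log(clamped).sum()
loss.backward()

# Anchors clamped at the floor or ceiling receive no learning signal:
print(p.grad)  # ≈ [0.0, -2.0, 0.0]
```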


rmcavoy commented Mar 25, 2020

The clamp function probably improves stability in some cases, but it is unnecessary: you can switch to the "with logits" version of the focal loss, as is used in the TensorFlow version of the official code (quoted below from the official code's comments).

# Below are comments/derivations for computing modulator.
# For brevity, let x = logits,  z = targets, r = gamma, and p_t = sigmoid(x)
# for positive samples and 1 - sigmoid(x) for negative examples.
#
# The modulator, defined as (1 - P_t)^r, is a critical part in focal loss
# computation. For r > 0, it puts more weight on hard examples, and less
# weight on easier ones. However if it is directly computed as (1 - P_t)^r,
# its back-propagation is not stable when r < 1. The implementation here
# resolves the issue.
#
# For positive samples (labels being 1),
#    (1 - p_t)^r
#  = (1 - sigmoid(x))^r
#  = (1 - (1 / (1 + exp(-x))))^r
#  = (exp(-x) / (1 + exp(-x)))^r
#  = exp(log((exp(-x) / (1 + exp(-x)))^r))
#  = exp(r * log(exp(-x)) - r * log(1 + exp(-x)))
#  = exp(- r * x - r * log(1 + exp(-x)))
#
# For negative samples (labels being 0),
#    (1 - p_t)^r
#  = (sigmoid(x))^r
#  = (1 / (1 + exp(-x)))^r
#  = exp(log((1 / (1 + exp(-x)))^r))
#  = exp(-r * log(1 + exp(-x)))
#
# Therefore one unified form for positive (z = 1) and negative (z = 0)
# samples is:
#      (1 - p_t)^r = exp(-r * z * x - r * log(1 + exp(-x))). 
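
Following that derivation, a minimal PyTorch sketch of the "with logits" focal loss (the function name is illustrative; alpha=0.25 and gamma=2.0 are the usual RetinaNet defaults, and log(1 + exp(-x)) is evaluated via softplus(-x) for stability):

```python
import torch
import torch.nn.functional as F

def focal_loss_with_logits(logits, targets, alpha=0.25, gamma=2.0):
    """Numerically stable sigmoid focal loss computed from raw logits.

    Uses the unified modulator from the derivation above:
        (1 - p_t)^gamma = exp(-gamma * z * x - gamma * log(1 + exp(-x)))
    """
    # Standard per-element sigmoid cross-entropy, computed from logits,
    # so no probabilities are materialized and no clamp is needed.
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")

    # Unified modulator for positive (z = 1) and negative (z = 0) samples;
    # softplus(-x) evaluates log(1 + exp(-x)) without overflow.
    modulator = torch.exp(-gamma * targets * logits - gamma * F.softplus(-logits))

    loss = modulator * ce
    # Alpha-balance positives vs. negatives, as in the official code.
    weighted = torch.where(targets == 1.0, alpha * loss, (1.0 - alpha) * loss)
    return weighted.sum()
```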

y78h11b09 (Author) commented Mar 25, 2020 via email
