Two questions about the denoising design, which might affect performance #343

Open
DeclK opened this issue Mar 7, 2024 · 0 comments

DeclK commented Mar 7, 2024

Hi, I love the detrex project, great work! I am reading the DINO code and have two questions about the denoising design:

  1. When the denoising queries are built, some zeros are padded so that the queries form a batch. In the computation, the padded zeros still seem to take part in attention, because the attention mask does not account for the zero padding. DINO and DN-DETR both perform well, so I wonder how much this design actually affects the network. Have you tried masking out the padding?
  2. In contrastive denoising, the negative noise can be large. According to the code below, there is a chance of producing invalid negative boxes, because w and h can end up less than 0. Is this intended?
    rand_sign = (
    torch.randint_like(known_bboxs, low=0, high=2, dtype=torch.float32) * 2.0 - 1.0
    )
    rand_part = torch.rand_like(known_bboxs)
    rand_part[negative_idx] += 1.0
    rand_part *= rand_sign
    known_bbox_ = known_bbox_ + torch.mul(rand_part, diff).cuda() * box_noise_scale
    known_bbox_ = known_bbox_.clamp(min=0.0, max=1.0)
    known_bbox_expand[:, :2] = (known_bbox_[:, :2] + known_bbox_[:, 2:]) / 2
    known_bbox_expand[:, 2:] = known_bbox_[:, 2:] - known_bbox_[:, :2]
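
To make the second question concrete, here is a small torch-free sketch of the same arithmetic for a single box. It is only an illustration under assumptions that are not shown in the snippet above: I assume `known_bbox_` is the xyxy form of a cxcywh box, that `diff` holds half the width and height for all four coordinates, and that `box_noise_scale` is 1.0. The helper name `noised_box_cxcywh` and the fixed noise values are hypothetical, chosen by hand instead of sampled.

```python
def noised_box_cxcywh(box_cxcywh, rand_part, rand_sign, box_noise_scale=1.0):
    """Apply the snippet's noise arithmetic to one box, using fixed noise values."""
    cx, cy, w, h = box_cxcywh
    # Assumption: known_bbox_ is the xyxy form of the cxcywh ground-truth box.
    xyxy = [cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2]
    # Assumption: diff holds half the width/height for all four coordinates.
    diff = [w / 2, h / 2, w / 2, h / 2]
    # Add signed noise, then clamp each coordinate to [0, 1] as the snippet does.
    noised = [
        min(max(c + r * s * d * box_noise_scale, 0.0), 1.0)
        for c, r, s, d in zip(xyxy, rand_part, rand_sign, diff)
    ]
    x1, y1, x2, y2 = noised
    # Convert back to cxcywh without checking that x2 >= x1 or y2 >= y1.
    return [(x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1]


box = [0.5, 0.5, 0.2, 0.2]
inward = [1.0, 1.0, -1.0, -1.0]  # both corners pushed toward the box center

# Positive query: rand_part lies in [0, 1), so the corners cannot cross.
positive = noised_box_cxcywh(box, [0.9] * 4, inward)
# Negative query: rand_part lies in [1, 2), so the corners can cross.
negative = noised_box_cxcywh(box, [1.5] * 4, inward)

print("positive w, h:", positive[2], positive[3])
print("negative w, h:", negative[2], negative[3])
```

With these hand-picked values the positive box keeps a positive width and height, while the negative box ends up with both below zero, and the clamp to [0, 1] does not prevent that because it acts on each corner separately.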