For drmm model, dense_output should be flipped at dim=-1? #144
Labels: bug

Comments
My understanding is that the query is padded on the left while match_hist is padded on the right. When calculating the einsum between them, shouldn't the padding sides be consistent?
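The two padding directions can be sketched with hypothetical helpers (illustrative only, not MatchZoo-py's actual padding callback):

```python
def pad_left(seq, length, value="<pad>"):
    # Prepend padding: valid tokens end up at the right end.
    return [value] * (length - len(seq)) + list(seq)

def pad_right(seq, length, value="<pad>"):
    # Append padding: valid tokens stay at the left end.
    return list(seq) + [value] * (length - len(seq))

query = ["q1", "q2", "q3"]
left = pad_left(query, 5)    # ['<pad>', '<pad>', 'q1', 'q2', 'q3']
right = pad_right(query, 5)  # ['q1', 'q2', 'q3', '<pad>', '<pad>']
# Index i of a left-padded sequence does not line up with index i of a
# right-padded one, which is why an elementwise einsum over the two is suspect.
```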
Thanks for your advice, we will check it right now!
Hi @wangcongcong123, see MatchZoo-py/matchzoo/dataloader/callbacks/padding.py, lines 226 to 239 at commit 49548ad.
Hi,

In the DRMM model, should it be

x = torch.einsum('bl,bl->b', torch.flip(dense_output, (-1,)), attention_probs)

instead of

x = torch.einsum('bl,bl->b', dense_output, attention_probs)

After making this change, the training loss decreased much faster than in
https://github.com/NTMC-Community/MatchZoo-py/blob/master/tutorials/ranking/drmm.ipynb
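The effect of the flip can be sketched with made-up numbers (NumPy stands in for the torch ops here; the shapes and values are illustrative only):

```python
import numpy as np

# Batch of 1, length 5. dense_output is left-padded (valid entries at
# positions 2..4); attention_probs is right-padded (valid at 0..2).
dense_output = np.array([[0.0, 0.0, 0.5, 0.3, 0.2]])
attention_probs = np.array([[0.6, 0.3, 0.1, 0.0, 0.0]])

# Without the flip, only one valid entry of each tensor overlaps;
# the rest of the products involve padding zeros.
x_orig = np.einsum('bl,bl->b', dense_output, attention_probs)  # ~0.05

# With the flip along the last dim, every valid entry of dense_output
# lands on a valid position of attention_probs (in reversed term order).
x_flip = np.einsum('bl,bl->b', np.flip(dense_output, axis=-1), attention_probs)  # ~0.26
```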
Below are my training logs:
Epoch 1/10: 100%|██████████| 319/319 [02:36<00:00, 6.02it/s, loss=2.167][Iter-319 Loss-2.134]:
Validation: normalized_discounted_cumulative_gain@3(0.0): 0.5825 - normalized_discounted_cumulative_gain@5(0.0): 0.6421 - mean_average_precision(0.0): 0.6019
Epoch 1/10: 100%|██████████| 319/319 [02:44<00:00, 1.93it/s, loss=2.167]
Epoch 2/10: 100%|█████████▉| 318/319 [01:11<00:00, 6.27it/s, loss=1.729][Iter-638 Loss-1.776]:
Epoch 2/10: 100%|██████████| 319/319 [01:19<00:00, 4.02it/s, loss=0.877]
Validation: normalized_discounted_cumulative_gain@3(0.0): 0.5726 - normalized_discounted_cumulative_gain@5(0.0): 0.6363 - mean_average_precision(0.0): 0.589