For drmm model, dense_output should be flipped at dim=-1? #144
Labels: bug

Comments
My understanding is that the query is padded on the left while match_hist is padded on the right. When calculating the einsum between them, shouldn't the padding sides be consistent?
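The two padding directions can be sketched with hypothetical helpers (illustrative only, not MatchZoo-py's actual padding callback):

```python
def pad_left(seq, length, value="<pad>"):
    # Prepend padding: valid tokens end up at the right end.
    return [value] * (length - len(seq)) + list(seq)

def pad_right(seq, length, value="<pad>"):
    # Append padding: valid tokens stay at the left end.
    return list(seq) + [value] * (length - len(seq))

query = ["q1", "q2", "q3"]
left = pad_left(query, 5)    # ['<pad>', '<pad>', 'q1', 'q2', 'q3']
right = pad_right(query, 5)  # ['q1', 'q2', 'q3', '<pad>', '<pad>']
# Index i of a left-padded sequence does not line up with index i of a
# right-padded one, which is why an elementwise einsum over the two is suspect.
```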
Thanks for your advice, we will check it right now!
Hi @wangcongcong123, see MatchZoo-py/matchzoo/dataloader/callbacks/padding.py, lines 226 to 239 at commit 49548ad.
Hi,

In the DRMM model, should it be

x = torch.einsum('bl,bl->b', torch.flip(dense_output, (-1,)), attention_probs)

instead of

x = torch.einsum('bl,bl->b', dense_output, attention_probs)

After making this change, the training loss decreased much faster than in
https://github.com/NTMC-Community/MatchZoo-py/blob/master/tutorials/ranking/drmm.ipynb
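The effect of the flip can be sketched with made-up numbers (NumPy stands in for the torch ops here; the shapes and values are illustrative only):

```python
import numpy as np

# Batch of 1, length 5. dense_output is left-padded (valid entries at
# positions 2..4); attention_probs is right-padded (valid at 0..2).
dense_output = np.array([[0.0, 0.0, 0.5, 0.3, 0.2]])
attention_probs = np.array([[0.6, 0.3, 0.1, 0.0, 0.0]])

# Without the flip, only one valid entry of each tensor overlaps;
# the rest of the products involve padding zeros.
x_orig = np.einsum('bl,bl->b', dense_output, attention_probs)  # ~0.05

# With the flip along the last dim, every valid entry of dense_output
# lands on a valid position of attention_probs (in reversed term order).
x_flip = np.einsum('bl,bl->b', np.flip(dense_output, axis=-1), attention_probs)  # ~0.26
```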
Below are my training logs:
Epoch 1/10: 100%|██████████| 319/319 [02:36<00:00, 6.02it/s, loss=2.167][Iter-319 Loss-2.134]:
Validation: normalized_discounted_cumulative_gain@3(0.0): 0.5825 - normalized_discounted_cumulative_gain@5(0.0): 0.6421 - mean_average_precision(0.0): 0.6019
Epoch 1/10: 100%|██████████| 319/319 [02:44<00:00, 1.93it/s, loss=2.167]
Epoch 2/10: 100%|█████████▉| 318/319 [01:11<00:00, 6.27it/s, loss=1.729][Iter-638 Loss-1.776]:
Epoch 2/10: 100%|██████████| 319/319 [01:19<00:00, 4.02it/s, loss=0.877]
Validation: normalized_discounted_cumulative_gain@3(0.0): 0.5726 - normalized_discounted_cumulative_gain@5(0.0): 0.6363 - mean_average_precision(0.0): 0.589