Add readme for the mlm study #3

Open
amarasovic opened this issue Apr 21, 2020 · 0 comments

@amarasovic
Contributor

We want to report two issues that could affect the reproducibility of the masked LM loss calculation at test time.

First, we do not get exactly the same results as reported in Table 3 of the paper when we use the fairseq library instead of the transformers library, after converting the transformers checkpoint to a fairseq checkpoint. A related pull request was opened and closed, but it did not fix our problem.
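
As a rough sanity check (not a script from this repository), the two loadings of what should be the same weights can be compared side by side on a few fill-in-the-`<mask>` examples; the checkpoint paths and the example sentence below are placeholders.

```python
# Hypothetical comparison of the converted fairseq checkpoint against the
# original transformers checkpoint; all paths below are placeholders.
from fairseq.models.roberta import RobertaModel
from transformers import pipeline

fairseq_model = RobertaModel.from_pretrained(
    "/path/to/converted_fairseq_checkpoint",  # placeholder dir with model.pt and dict.txt
    checkpoint_file="model.pt",
)
fairseq_model.eval()

hf_fill = pipeline("fill-mask", model="/path/to/transformers_checkpoint")  # placeholder

sentence = "The movie was absolutely <mask>."
# If the conversion were lossless, the top predictions should agree.
print(fairseq_model.fill_mask(sentence, topk=5))
print([p["token_str"] for p in hf_fill(sentence)])
```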

Second, the results in Table 3 are calculated with a batch size of 1. With batch sizes larger than 1, we do not get the same results; in particular, the numbers change for a sample of reviews. As we have already mentioned, reviews are much shorter than documents from the other domains, so unlike those documents, which usually fill the maximum sequence length, reviews have to be padded up to it. We therefore suspect that padding somehow influences the masked LM loss calculation. With a batch size of 1 no padding is needed, which is why we consider the results in Table 3 reliable. A sketch of where padding enters the loss is given below.
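Below is a minimal sketch of where padding can enter that calculation, assuming the transformers evaluation path; it uses the public roberta-base checkpoint as a stand-in for the released models, toy review texts, and scores every non-pad token rather than reproducing the paper's 15% masking.

```python
import torch
from transformers import RobertaForMaskedLM, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaForMaskedLM.from_pretrained("roberta-base")
model.eval()

# Toy stand-ins for Amazon reviews of very different lengths.
reviews = [
    "Great phone case.",
    "Arrived late but works exactly as described in the listing.",
]

def lm_loss(batch, pad):
    enc = tokenizer(batch, return_tensors="pt", padding=pad)
    labels = enc["input_ids"].clone()
    # -100 is the ignore_index of the cross-entropy loss, so pad positions
    # must be excluded here or they contribute to the reported loss.
    labels[enc["attention_mask"] == 0] = -100
    with torch.no_grad():
        return model(**enc, labels=labels).loss.item()

# Batch size 1 (the Table 3 setting): no padding is ever added.
print([lm_loss([r], pad=False) for r in reviews])

# Batch size 2: the short review is padded to the length of the long one.
print(lm_loss(reviews, pad=True))
```

Note that even when pad positions are excluded this way, the batched loss is averaged over tokens pooled across all reviews in the batch rather than per review, which is another way a batch size larger than 1 can change the reported number.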

@kernelmachine added the "help wanted" label on Apr 26, 2020