
bad results when using models downloaded from huggingface #5

Open
beanandrew opened this issue Nov 4, 2021 · 1 comment

Comments

@beanandrew

Hi,
I tried to reproduce your work with the PyTorch BERT model downloaded from HuggingFace, only to get a very bad result: the training loss stays around 1.0 for the first 10 epochs. But when I follow your instructions, download the Google BERT model, and convert it with the helper script, the training process seems to go well.
I wonder why this is happening? Is it because these two models are very different?
Huggingface download link here: https://huggingface.co/bert-base-uncased/tree/main

@frankaging
Owner

frankaging commented Nov 4, 2021

Hi,
Thanks for your comment. I believe the reason is the variable naming.

If you look at this line of code in the training set-up https://github.com/frankaging/Quasi-Attention-ABSA/blob/main/code/util/train_helper.py#L300,
model.bert.load_state_dict(torch.load(init_checkpoint, map_location='cpu'), strict=False)
this load_state_dict call loads parameters by name. With the current code, the HuggingFace model uses different names for all the variables in BERT; as a result, you are not loading any weights from the pre-trained BERT at all. You can verify this by simply printing the weights before and after this line, and you will see what I mean.
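To illustrate the failure mode being described, here is a minimal sketch (not the repo's actual model) showing how `strict=False` silently skips every mismatched key, so nothing is loaded. The `TinyModel` class and the `bert.`-prefixed key are illustrative assumptions; the return value of `load_state_dict` reports what was actually matched:

```python
import torch
import torch.nn as nn

# A toy model standing in for the repo's BERT module (illustrative only).
class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(4, 4)

model = TinyModel()
before = model.encoder.weight.clone()

# A checkpoint whose keys use a different naming scheme,
# e.g. "bert.encoder.weight" instead of "encoder.weight".
checkpoint = {"bert.encoder.weight": torch.randn(4, 4)}

# With strict=False, the name mismatch raises no error.
result = model.load_state_dict(checkpoint, strict=False)
print(result.missing_keys)     # keys the model expected but did not receive
print(result.unexpected_keys)  # checkpoint keys that matched nothing

# The model's weights are untouched: nothing was actually loaded.
assert torch.equal(before, model.encoder.weight)
```

Printing (or asserting on) `missing_keys` and `unexpected_keys` after the load is a quick way to confirm whether the HuggingFace checkpoint's names line up with the model's.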

There are two solutions: (1) use the Google checkpoint for importing the pre-trained weights, as you are doing now; or (2) change the code to work with both models. The second approach requires you to modify the variable names of the model.
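For option (2), one common approach is to rename the checkpoint's keys before calling `load_state_dict`. The sketch below is a generic key-remapping helper; the specific `old`/`new` pairs shown are hypothetical examples, and the real mapping would depend on how this repo's BERT implementation names its variables:

```python
import torch

def remap_keys(state_dict, rename_pairs):
    """Return a new state dict with each (old, new) substring renamed in the keys."""
    remapped = {}
    for key, value in state_dict.items():
        for old, new in rename_pairs:
            key = key.replace(old, new)
        remapped[key] = value
    return remapped

# Hypothetical renames: strip a leading "bert." prefix and use the
# gamma/beta LayerNorm names that Google-style checkpoints use.
pairs = [
    ("bert.", ""),
    ("LayerNorm.weight", "LayerNorm.gamma"),
    ("LayerNorm.bias", "LayerNorm.beta"),
]
sd = {"bert.embeddings.LayerNorm.weight": torch.zeros(2)}
print(list(remap_keys(sd, pairs)))  # ['embeddings.LayerNorm.gamma']
```

After remapping, you would pass the result to `model.bert.load_state_dict(...)` and check that `missing_keys` comes back (near-)empty before trusting the training run.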

Does this make sense?
