Is there an existing issue for this?

Current Behavior

I added the following code starting at line 850 of p-tuning/modeling_chatglm.py:
```python
if self.pre_seq_len is not None:
    for param in self.parameters():
        param.requires_grad = False
    self.prefix_tokens = torch.arange(self.pre_seq_len).long()
    self.prefix_encoder = PrefixEncoder(config)
    self.dropout = torch.nn.Dropout(0.1)
    for k, v in self.prefix_encoder.named_parameters():
        v.requires_grad = False
    for k, v in self.layers[0].named_parameters():
        v.requires_grad = True
```
Continuing fine-tuning with this change causes the gradients to explode and the loss becomes NaN. (The learning rate was set to 1e-4, the value used for full fine-tuning.)
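For reference, here is a minimal self-contained sketch of the same freezing pattern on a toy module (ToyModel and its shapes are made up for illustration, not the actual ChatGLM code), followed by a check of which parameters remain trainable and their dtypes. Trainable half-precision weights combined with a full-finetuning learning rate are a common source of NaN losses:

```python
import torch

# Hypothetical stand-in for the model: a tiny module with a prefix_encoder
# and a list of transformer-style layers, mirroring the structure in the issue.
class ToyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.prefix_encoder = torch.nn.Embedding(8, 4)
        self.layers = torch.nn.ModuleList([torch.nn.Linear(4, 4) for _ in range(2)])

model = ToyModel()

# Same freezing pattern as in the issue: freeze everything, then unfreeze layer 0.
for param in model.parameters():
    param.requires_grad = False
for k, v in model.layers[0].named_parameters():
    v.requires_grad = True

# Inspect what actually trains, and in which dtype: if the unfrozen layer is
# stored in fp16, optimizing it directly can overflow and yield NaN loss.
trainable = [(n, p.dtype) for n, p in model.named_parameters() if p.requires_grad]
print(trainable)
```

Printing the trainable list before training makes it easy to confirm that only the intended parameters are unfrozen and to spot half-precision weights that may need to be cast to fp32.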
Expected Behavior
No response
Steps To Reproduce
Starting at line 850 of p-tuning/modeling_chatglm.py, add the following code:

```python
for k, v in self.prefix_encoder.named_parameters():
    v.requires_grad = False
for k, v in self.layers[0].named_parameters():
    v.requires_grad = True
```
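When the loss turns NaN right after unfreezing extra layers, one hedged first diagnostic step (generic PyTorch, not ChatGLM-specific; the tiny model and data below are made up) is to clip gradients and log their pre-clipping norm each step, which makes an impending explosion visible before the loss becomes NaN:

```python
import torch

# Tiny stand-in model and synthetic data, for illustration only.
model = torch.nn.Linear(4, 1)
opt = torch.optim.SGD(model.parameters(), lr=1e-4)

x = torch.randn(16, 4)
y = torch.randn(16, 1)

loss = torch.nn.functional.mse_loss(model(x), y)
loss.backward()

# Clip gradients before the optimizer step; the returned value is the
# total gradient norm *before* clipping, useful for monitoring.
total_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
opt.step()
print(float(total_norm))
```

If the logged norm grows without bound over the first few steps, lowering the learning rate or casting the newly unfrozen layer to fp32 are the usual next things to try.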
Environment
- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :
Anything else?
No response