
OOM during finetuning #77

Open
Shiro-LK opened this issue Jun 17, 2020 · 0 comments

Shiro-LK commented Jun 17, 2020

Hi,

Thank you for sharing your repo.

I am trying to fine-tune an LM with MultiFiT on a custom dataset and then fine-tune the classifier for prediction. Unfortunately, I get an OOM after a few steps with MultiFiT during the training of the classifier.
I tried to first train the LM, then close the session to free the GPU memory, and then train the classifier (loading the encoder weights, if my code is correct), but it does not help. I cannot use the same batch size. Is this normal, or am I doing something wrong?
PS: bs = 256
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-...> in <module>()
      3 learn_cls_fwd.load_encoder("encoder_lm_fr_fwd")
      4 learn_cls_fwd.freeze()
----> 5 learn_cls_fwd.fit_one_cycle(3)
      6 learn_cls_fwd.save("multifit_cls_pretrained_fr")

9 frames
/usr/local/lib/python3.6/dist-packages/fastai/text/learner.py in <listcomp>(.0)
    253     def concat(self, arrs:Sequence[Sequence[Tensor]])->List[Tensor]:
    254         "Concatenate the arrs along the batch dimension."
--> 255         return [torch.cat([l[si] for l in arrs], dim=1) for si in range_of(arrs[0])]
    256
    257     def reset(self):

RuntimeError: CUDA out of memory. Tried to allocate 1.02 GiB (GPU 0; 15.90 GiB total capacity; 12.72 GiB already allocated; 599.88 MiB free; 14.61 GiB reserved in total by PyTorch)
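
If it helps, the failing call seems to be the concat in fastai's text learner that joins the encoder outputs of each chunk of the documents back into one tensor, so the whole batch of documents has to fit in memory at that point. This is roughly how I check where the memory goes right before the classifier fit (plain PyTorch calls, nothing MultiFiT-specific; the 0 is my single GPU):

import torch

# snapshot of GPU memory just before learn_cls_fwd.fit_one_cycle(3)
print(f"allocated: {torch.cuda.memory_allocated(0) / 1024**3:.2f} GiB")
print(f"reserved : {torch.cuda.memory_reserved(0) / 1024**3:.2f} GiB")
print(f"peak     : {torch.cuda.max_memory_allocated(0) / 1024**3:.2f} GiB")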

My code:

from fastai.text import *   # fastai v1 imports (TextList, fit_one_cycle, ...)

# lm_tr, tr1, path, fa_config, bs, exp, pretrained_lm and pretrained_cls are defined earlier in the notebook

# pretrained LM
if pretrained_lm:
  data_lm_fwd = (TextList.from_df(lm_tr.iloc[:10000], path, cols='comment_text', **fa_config)
                  .split_by_rand_pct(0.05, seed=42)
                  .label_for_lm()
                  .databunch(bs=bs, num_workers=4))
  data_lm_fwd.save("fr_data_lm_forward")

if pretrained_lm:
  learn_fwd = exp.finetune_lm.get_learner(data_lm_fwd)
  learn_fwd.model.cuda()

  learn_fwd.lr_find()
  learn_fwd.recorder.plot()

# learn_fwd is a preconfigured fastai learner with a pretrained model loaded
if pretrained_lm:
  learn_fwd.fit_one_cycle(2)        # frozen fine-tuning first
  learn_fwd.unfreeze()
  for i in range(5):
    learn_fwd.fit_one_cycle(2)
    learn_fwd.save_encoder("encoder_lm_fr_fwd")   # keep the encoder after each cycle

# cls
if pretrained_cls:
  data_cls = (TextList.from_df(tr1, path, cols="comment_text", **fa_config)
      .split_from_df(col="val")
      .label_from_df(cols="toxic")
      .databunch(bs=64, num_workers=2))

if pretrained_cls:
  learn_cls_fwd = exp.classifier.get_learner(data_cls)  # , metrics=[AUROC]
  learn_cls_fwd.load_encoder("encoder_lm_fr_fwd")       # load the fine-tuned LM encoder
  learn_cls_fwd.freeze()
  learn_cls_fwd.fit_one_cycle(3)                        # <- OOM happens here
  learn_cls_fwd.save("multifit_cls_pretrained_fr")
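
For reference, this is roughly what I meant by cleaning the GPU memory without restarting the session: after saving the encoder, I drop the LM learner before building the classifier (plain PyTorch/gc calls, nothing MultiFiT-specific; learn_fwd and data_lm_fwd are the objects from the code above):

import gc
import torch

# free the LM learner and its data once the encoder is on disk
del learn_fwd, data_lm_fwd
gc.collect()
torch.cuda.empty_cache()   # release unused cached memory held by PyTorch

print(f"still allocated: {torch.cuda.memory_allocated(0) / 1024**3:.2f} GiB")

Note that empty_cache() only returns blocks that are no longer referenced, which is why deleting the learner and databunch first matters.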