
Extensions #15

Open
pythonometrist opened this issue Sep 24, 2019 · 13 comments

@pythonometrist

Thanks to your help, I have added custom losses, special initialization, and a bunch of other things as extensions.

I am now trying to modify the sentence classification model itself. It is a linear layer on top of the BERT model. What I would like to do is a) freeze all of BERT, and b) add a CNN on top, along the lines of https://github.com/Shawn1993/cnn-text-classification-pytorch/blob/master/model.py

I want to compare results with a frozen and an unfrozen BERT. Any pointers would be most appreciated.

@ThilinaRajapakse
Owner

Should be pretty similar to adding custom losses. You can freeze all the layers by setting requires_grad = False for all of them in your subclassed model. You can add your convolutional layers to it as well, and define how you want them to be used in the forward method.
Hopefully, it won't mess with loading the weights from the pretrained model. I don't think it will.
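
A rough sketch of what I mean (untested; it assumes the BertModel class from pytorch_transformers and made-up hyperparameters, so adapt the names to your setup):

```python
import torch
import torch.nn as nn
from pytorch_transformers import BertModel  # or transformers, depending on your install

class BertCnnClassifier(nn.Module):
    """Illustrative only: frozen BERT encoder with a small CNN head on top."""

    def __init__(self, pretrained_name, num_labels, num_filters=100, kernel_sizes=(3, 4, 5)):
        super().__init__()
        self.bert = BertModel.from_pretrained(pretrained_name)
        # a) freeze every BERT parameter
        for param in self.bert.parameters():
            param.requires_grad = False
        hidden = self.bert.config.hidden_size  # 768 for bert-base
        # b) convolutions over the token dimension of the last hidden states
        self.convs = nn.ModuleList(
            [nn.Conv1d(hidden, num_filters, k) for k in kernel_sizes]
        )
        self.dropout = nn.Dropout(0.1)
        self.classifier = nn.Linear(num_filters * len(kernel_sizes), num_labels)

    def forward(self, input_ids, attention_mask=None, token_type_ids=None):
        # sequence_output: (batch, seq_len, hidden)
        sequence_output = self.bert(input_ids,
                                    attention_mask=attention_mask,
                                    token_type_ids=token_type_ids)[0]
        x = sequence_output.transpose(1, 2)          # (batch, hidden, seq_len)
        pooled = [torch.relu(conv(x)).max(dim=2)[0]  # max-pool over time per kernel size
                  for conv in self.convs]
        features = self.dropout(torch.cat(pooled, dim=1))
        return self.classifier(features)             # logits: (batch, num_labels)
```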

@pythonometrist
Author

Cool - let me try it out. config.hidden_size is the size of the last layer from BERT (and, in some sense, the size of my embedding), but I am struggling to figure out the vocabulary size. It's probably the BERT vocabulary size hiding somewhere in the config. max_seq_length is user-specified, so we can already assume padded sequences. Agreed, the rest is carefully initializing the model and writing the forward method correctly (which might be non-trivial for me!). Let me get back to you. Thanks.
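
For reference (and someone correct me if this is wrong), the vocabulary size should be config.vocab_size, but if BERT's hidden states replace the embedding lookup from the CNN reference model, the CNN head may never need it. Roughly, assuming bert-base-uncased:

```python
from pytorch_transformers import BertConfig

config = BertConfig.from_pretrained("bert-base-uncased")
print(config.vocab_size)   # ~30522: the BERT vocabulary size; only the tokenizer/embeddings need it
print(config.hidden_size)  # 768: plays the role of the embedding dimension for a CNN head
# max_seq_length is user-specified and sequences are already padded/truncated to it,
# so a CNN on top would see tensors of shape (batch_size, max_seq_length, hidden_size).
```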

@ThilinaRajapakse
Owner

If it doesn't work, you can always decouple BERT and the CNN and just feed the BERT outputs to the CNN.

I'm no expert myself, but you seem to be doing fine to me!
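
If you do go the decoupled route, something like this should work (just a sketch; bert, cnn_head, and dataloader are placeholders for your own objects): cache the BERT outputs once, then train the CNN on them separately.

```python
import torch
from torch.utils.data import TensorDataset

bert.eval()
features, labels = [], []
with torch.no_grad():  # the frozen encoder never needs gradients
    for input_ids, attention_mask, batch_labels in dataloader:
        sequence_output = bert(input_ids, attention_mask=attention_mask)[0]
        features.append(sequence_output.cpu())
        labels.append(batch_labels)

# Cache the features and train the CNN head on them like any ordinary dataset
cached = TensorDataset(torch.cat(features), torch.cat(labels))
```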

@pythonometrist
Author

pythonometrist commented Sep 24, 2019

Well - I got a model to work with some simple linear layers, so that is progress. I need to work out tensor sizes - BERT is sending out tensors of shape (64 x 768), where 64 is the batch size. I assume that for each sentence I am receiving one embedding of size 768. I've got to figure out how to go from there to a vocabulary x document matrix - I think it means that somewhere BERT is averaging over the words. Or I simply need to forget about word embeddings and do a 1D convolution at the document level... will think some more and update.
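
(A hedged guess at the shapes, with bert and the input tensors assumed to be prepared elsewhere: the (64 x 768) tensor looks like the pooled [CLS] output, while the per-token embeddings live in the first element of the output tuple, so no vocabulary x document matrix should be needed.)

```python
outputs = bert(input_ids, attention_mask=attention_mask)
sequence_output = outputs[0]  # (64, max_seq_length, 768): one 768-dim vector per token
pooled_output = outputs[1]    # (64, 768): a single summary vector per sentence ([CLS] pooler)

# A word-level CNN would convolve over the token axis of sequence_output;
# the pooled output only supports sentence/document-level layers.
cnn_input = sequence_output.transpose(1, 2)  # (64, 768, max_seq_length) for nn.Conv1d
```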

@pythonometrist
Author

You da boss. Yep, you can do all sorts of models once you realize they offer access to all the layers to convolve / LSTM over. I am curious whether you know about the apex installation - one version seems to be pure Python while the other uses a C++ compiler - which one do you use?
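
For reference, this is roughly how I am grabbing the intermediate layers (a sketch; it assumes pytorch_transformers and bert-base-uncased, with input_ids prepared elsewhere):

```python
from pytorch_transformers import BertConfig, BertModel

config = BertConfig.from_pretrained("bert-base-uncased")
config.output_hidden_states = True  # ask for every encoder layer, not just the last one
bert = BertModel.from_pretrained("bert-base-uncased", config=config)

outputs = bert(input_ids)          # input_ids assumed prepared elsewhere
all_hidden_states = outputs[-1]    # tuple of 13 tensors: embeddings + 12 encoder layers
# Each entry is (batch, seq_len, hidden_size) and can feed a CNN or LSTM head.
```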

@ThilinaRajapakse
Owner

Great!

I use the Apex version with C++ extensions. The pure python version is lacking a few features. I don't see any reason not to use the C++ version.

@pythonometrist
Author

I am having some issues with apex on a Debian server... well, fingers crossed. Thanks for all the input! I had been wanting to get into PyTorch for a while and now I am in!

@ThilinaRajapakse
Owner

Odd. I never had issues with any Ubuntu-based distros.

Welcome to PyTorch!

@pythonometrist
Author

Thanks - it's a server which is stuck on pip 8.1. But it looks like I could get it to work with conda. Fingers crossed.

@pythonometrist
Author

OK, it works with conda! Should apex keep_batchnorm_fp32 be True? And O1 vs O2 - which one worked for you?

@ThilinaRajapakse
Owner

I don't think I changed batchnorm. Doesn't it get set when you change the opt level? I used opt level O1. O2 was giving me NaN losses.
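
For context, the amp setup I use is essentially the documented pattern (a sketch; model, optimizer, and loss come from your own training loop):

```python
from apex import amp

# Wrap the model and optimizer once, before the training loop
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

# Inside the training loop, scale the loss so fp16 gradients do not underflow
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()
optimizer.step()
```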

@pythonometrist
Author

Defaults for this optimization level are:
enabled : True
opt_level : O1
cast_model_type : None
patch_torch_functions : True
keep_batchnorm_fp32 : None
master_weights : None
loss_scale : dynamic

That is the default when I run the models - not sure if keep_batchnorm_fp32 : None should be something else. I'll dig around and report.
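
(From what I can tell from the apex docs - treat this as an assumption - keep_batchnorm_fp32 only matters when the whole model is cast to fp16, i.e. O2/O3; with O1 the patched torch functions handle it, so None should be fine. If you do try O2, the override looks like this:)

```python
# Hypothetical override for O2, per the apex amp documentation
model, optimizer = amp.initialize(model, optimizer, opt_level="O2", keep_batchnorm_fp32=True)
```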

@ThilinaRajapakse
Owner

Yeah, I just kept the defaults there.
