Hi there, thanks for this repo and the pretrained models.
I have a question about batching sequences of varying length. The padding token and tokenizer work as expected, but I don't see an attention mask being passed into the model's forward pass.
I've tried running the same sequence both padded (e.g. padded with 4s, as output by the tokenizer) and unpadded, and the resulting embeddings of at least the last few tokens differ substantially between the two.
The common pattern elsewhere is to also provide an attention mask. I tried passing one as model(input_ids, attn_mask=attn_mask), but that isn't how the model is set up, and looking through the source code I can't find any mechanism by which an attention mask would be applied.
Is there a supported way to batch sequences of varying length, and if so, how should I do it?
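For reference, here is roughly what I'm attempting. The `attn_mask` keyword is my assumption carried over from HuggingFace-style APIs, not something this model's forward actually accepts, and the token ids are just toy values:

```python
import torch

pad_id = 4  # padding token id emitted by the tokenizer in my case

# Two toy sequences of different length (ids are illustrative only)
sequences = [
    [0, 5, 6, 7, 8, 2],          # shorter sequence
    [0, 5, 6, 7, 8, 9, 10, 2],   # longer sequence
]

# Right-pad to the batch max length and build a 1/0 attention mask
max_len = max(len(s) for s in sequences)
input_ids = torch.full((len(sequences), max_len), pad_id, dtype=torch.long)
attn_mask = torch.zeros((len(sequences), max_len), dtype=torch.long)
for i, seq in enumerate(sequences):
    input_ids[i, : len(seq)] = torch.tensor(seq)
    attn_mask[i, : len(seq)] = 1

# What I would like to do (model = the pretrained model from this repo):
# out = model(input_ids, attn_mask=attn_mask)   # <- not supported
# out = model(input_ids)                        # padded positions seem to affect real tokens
```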