
How to correctly provide padding tokens to forward pass of pretrained model? #36

Open
josiahbjorgaard opened this issue Dec 19, 2023 · 1 comment

Comments

@josiahbjorgaard

Hi there, thanks for this repo and the pretrained models.

I have a question about batching sequences of varying length. The padding token and tokenizer work well, but the model's forward pass does not appear to accept an attention mask.

I've tried passing the same sequence both padded (with 4s, as output by the tokenizer) and unpadded. The resulting embeddings differ noticeably between the two cases, at least for the last few tokens.

The common pattern is to also provide an attention mask. I tried calling model(input_ids, attn_mask=attn_mask), but that isn't how the forward pass is set up, and looking through the source code I can't find any mechanism that would apply an attention mask.

Is there a way to batch sequences of varying length, and if so, how should I do it?
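For reference, here's a minimal sketch of the comparison I'm describing. The checkpoint name, the `AutoModel`/`AutoTokenizer` loading path, and the `.last_hidden_state` attribute are assumptions on my part (adapt them to however you load the model from this repo); the padding id 4 is the one the tokenizer emits.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Assumed HuggingFace-hosted checkpoint; any HyenaDNA checkpoint loaded this way should behave alike.
ckpt = "LongSafari/hyenadna-tiny-1k-seqlen-hf"
tokenizer = AutoTokenizer.from_pretrained(ckpt, trust_remote_code=True)
model = AutoModel.from_pretrained(ckpt, trust_remote_code=True).eval()

seq = "ACGTACGTACGTACGT"
pad_id = 4  # padding id produced by the tokenizer

# Same sequence, unpadded and with 8 pad tokens appended on the right.
plain = tokenizer(seq, return_tensors="pt")["input_ids"]        # (1, L)
padded = torch.cat([plain, torch.full((1, 8), pad_id)], dim=1)  # (1, L + 8)

with torch.no_grad():
    h_plain = model(plain).last_hidden_state    # (1, L, d); drop .last_hidden_state if your forward returns a raw tensor
    h_padded = model(padded).last_hidden_state  # (1, L + 8, d)

# Compare only the positions that correspond to real tokens.
L = plain.shape[1]
print(torch.allclose(h_plain, h_padded[:, :L], atol=1e-4))
```

This is essentially the check behind the mismatch described above.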

@exnx
Collaborator

exnx commented Feb 12, 2024

Hyena doesn't support attention masks.
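One possible workaround, sketched below under the assumption that the model is causal and returns per-token embeddings: since pad tokens still pass through the convolution, pad on the right and exclude the pad positions from any downstream readout or pooling. The pooling helper here is hypothetical, not part of the repo.

```python
import torch

def masked_mean_pool(hidden, input_ids, pad_id=4):
    """Mean-pool per-token embeddings while ignoring padded positions.

    hidden:    (batch, length, dim) embeddings returned by the model
    input_ids: (batch, length) token ids used for the forward pass
    pad_id:    padding token id (4 for the tokenizer discussed above)
    """
    mask = (input_ids != pad_id).unsqueeze(-1).to(hidden.dtype)  # (B, L, 1), 1 for real tokens
    summed = (hidden * mask).sum(dim=1)                          # (B, D), sum over real tokens only
    counts = mask.sum(dim=1).clamp(min=1)                        # (B, 1), number of real tokens
    return summed / counts
```

If the model is strictly causal, right-padding should not change the context seen by earlier positions, so any remaining differences there would likely come from numerical effects rather than from the pads leaking information.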
