I have a question regarding the LLM section of the documentation. In the case of encoder-decoder models, is one supposed to treat the encoder and decoder as separate models, as shown with Whisper? Should there be one set of keys prefixed with `[llm].encoder` and another with `[llm].decoder`?
In the case of BART, for example, should `[llm].attention.head_count` be implemented as these two keys: `bart.encoder.attention.head_count` and `bart.decoder.attention.head_count`?
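To make the two readings concrete, here is a hypothetical sketch of what each key layout might look like for BART. The key names simply mirror the naming pattern from the question, and the value 16 is purely illustrative; neither layout is confirmed by the spec:

```
# Reading A: separate keys per component, as with Whisper
bart.encoder.attention.head_count = 16
bart.decoder.attention.head_count = 16

# Reading B: a single key shared by encoder and decoder
bart.attention.head_count = 16
```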
Whatever the case, I think a section in the documentation clarifying this would be beneficial.