I have a question regarding the LLM section of the documentation. In the case of encoder-decoder models, is one supposed to treat the encoder and decoder as separate models, as shown with Whisper? Should there be one set of keys prefixed with `[llm].encoder` and another with `[llm].decoder`?
In the case of BART, for example, should `[llm].attention.head_count` be implemented as these two keys: `bart.encoder.attention.head_count` and `bart.decoder.attention.head_count`?
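To make the two readings concrete, here is a hypothetical sketch of what each key layout might look like for BART. The key names simply mirror the naming pattern from the question, and the value 16 is purely illustrative; neither layout is confirmed by the spec:

```
# Reading A: separate keys per component, as with Whisper
bart.encoder.attention.head_count = 16
bart.decoder.attention.head_count = 16

# Reading B: a single key shared by encoder and decoder
bart.attention.head_count = 16
```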
Whatever the case, I think a section in the documentation clarifying this would be beneficial.