Removing model layers throws an index error. #30508
Comments
@candemircan I'm pretty sure I ran into something similar when trying to chop/remove the majority of a model for local dev, and it makes what you are trying to do somewhat impossible. There seem to be two possibilities. One is to figure out which arguments you need to pass so that you can use the model without much modification; for instance, building on what you posted, the following should work:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"
torch_dtype = torch.bfloat16

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch_dtype)

# Remove the 17th decoder block (index 16).
model.model.layers.pop(16)

prompt = "hello"
tokenized = tokenizer(prompt, return_tensors="pt").to(model.device)["input_ids"]
# With the cache disabled, the stale layer_idx values are never used.
output = model(tokenized, use_cache=False, return_dict=True, output_hidden_states=True)
```

The other approach, which is less adaptable per layer but seems less prone to breaking across the various forward paths (and might not be helpful for what you are trying to do), is to modify the config you pass into model creation and change the number of layers (e.g. `config.num_hidden_layers`) before it is passed to `from_pretrained`; see the sketch below. Kind of annoying, and I agree that model layers kept in something like a `ModuleList` should be decoupled from the model to allow easier debugging/development locally without having to wrap/subclass the Model/Config.
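For that second approach, here is a minimal sketch (my own, not from the thread): it assumes a Llama-style config where `num_hidden_layers` controls how many decoder blocks get instantiated, and the count of 16 is purely illustrative. Note that this only lets you keep the first N blocks; it cannot drop an arbitrary middle layer.

```python
import torch
from transformers import AutoConfig, AutoTokenizer, AutoModelForCausalLM

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"

# Load the config first and shrink the number of decoder blocks.
config = AutoConfig.from_pretrained(model_name)
config.num_hidden_layers = 16  # keep only the first 16 blocks (illustrative)

# Weights for the dropped blocks are simply not loaded; expect a warning
# about unused checkpoint weights.
model = AutoModelForCausalLM.from_pretrained(
    model_name, config=config, torch_dtype=torch.bfloat16
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
inputs = tokenizer("hello", return_tensors="pt").to(model.device)
output = model(**inputs, return_dict=True, output_hidden_states=True)
```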
The layer index is mostly (and only) used for the cache.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "meta-llama/Meta-Llama-3-8B"
torch_dtype = torch.bfloat16

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch_dtype,
)

# Shift every block after index 16 down by one slot, tracking the layer
# index it was built with; blocks before index 16 are left untouched.
for i, layer in enumerate(model.model.layers[:-1]):
    if i < 16:
        model.model.layers[i]
    else:
        model.model.layers[i] = model.model.layers[i + 1]
        model.model.layers[i].layer_idx = i + 1

prompt = "hello"
tokenized = tokenizer(prompt, return_tensors="pt").to(model.device)["input_ids"]
output = model(tokenized, return_dict=True, output_hidden_states=True)
```

It's not really part of the API, so I'm not sure we want to add some kind of trick to automatically update the layer idx.
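If you also want cached generation to keep working after dropping a block, one option is a minimal sketch along these lines (an assumption on my part, not an official API: it relies on each decoder layer exposing `self_attn.layer_idx`, which is what the cache uses to index its entries, and on renumbering those indices so they stay contiguous):

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# Drop one decoder block, then renumber the rest so cache indices are 0..n-1.
model.model.layers.pop(16)
for new_idx, layer in enumerate(model.model.layers):
    layer.self_attn.layer_idx = new_idx  # assumption: attention stores its cache index here
model.config.num_hidden_layers = len(model.model.layers)

inputs = tokenizer("hello", return_tensors="pt").to(model.device)
generated = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```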
Hi Arthur, thanks for the response. I think the workarounds you and @grahamannett suggested suit what I need.
Glad we could help! 🤗
Feature request
Hello,
When I try to remove a layer from the Llama models using the code snippet below, I get an index error (pasted below the snippet). From what I could tell, the `layer_idx` attribute of `self.attn` is used during generation, and the `layer_idx` values are not updated automatically when layers are removed. I believe the same behaviour holds for other models (e.g. gemma-2b). Apologies if there is an existing way to remove layers; I'm posting this after an extensive search.
Motivation
I think it'd be fantastic to decouple the `layer_idx` variable somehow to allow easy removal of entire blocks. I imagine this would be useful for the general research community when experimenting with these models.
Your contribution
I'm not very familiar with the inner workings of the library; however, I'd be happy to make a PR if you can give me some high-level suggestions on how to make this change. Thanks!