Running with standard Huggingface config and trainer files does not give optimal results #60

leannmlindsey opened this issue Mar 29, 2024 · 0 comments

Hello, I have been running your model since last summer using a standard Hugging Face model framework (see code below), and it has not been giving us the same results on the benchmark tests that you report in the paper. For example:

GenomicBenchmarks

Mouse Enhancers: you report 85.1, our result 63.6
Human Enhancers Cohn: you report 74.2, our result 66.3

I think it is possible that this is because we are not using the parameter at the bottom of your config file,
`freeze_backbone: false`,
but I am not sure how to incorporate this into a standard Hugging Face Trainer.
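For what it's worth, one way to replicate the config's `freeze_backbone` flag with a plain Hugging Face Trainer would be to toggle `requires_grad` on the backbone parameters before training. This is only a sketch; the head-module prefixes (`score`, `classifier`) are an assumption about how the checkpoint names its classification head, not something confirmed from the repo:

```python
import torch.nn as nn

def freeze_backbone(model: nn.Module, head_prefixes=("score", "classifier")) -> None:
    """Freeze every parameter except the classification head.

    Intended to mimic a `freeze_backbone: true` config flag when training
    with a plain Hugging Face Trainer. ASSUMPTION: the classification head's
    parameter names start with one of `head_prefixes`; adjust after
    inspecting `model.named_parameters()` for the actual checkpoint.
    """
    for name, param in model.named_parameters():
        # Leave only head parameters trainable; freeze everything else.
        param.requires_grad = any(name.startswith(p) for p in head_prefixes)
```

Calling `freeze_backbone(model)` before constructing the Trainer would then train only the head, matching `freeze_backbone: true`; leaving the model untouched corresponds to `freeze_backbone: false`.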

Do you support the Hugging Face Trainer, or only the Hydra/Lightning trainer?

My concern is that, since we are not able to match your reported results, your model may not be performing optimally on our specific classification task. I had expected it to be comparable in performance to DNABERT2, but it was not, and I suspect this is because we have not set up our run correctly. Any direction would be appreciated. Thank you.

Sample Code

```python
import torch
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

checkpoint = 'LongSafari/hyenadna-tiny-16k-seqlen-d128-hf'
max_length = 4010

# Load the tokenizer and model before constructing the Trainer.
tokenizer = AutoTokenizer.from_pretrained(checkpoint, trust_remote_code=True)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    pad_token_id=tokenizer.pad_token_id,
    trust_remote_code=True,
)

args = {
    "output_dir": "test_output",
    "num_train_epochs": 25,
    "per_device_train_batch_size": 512,
    "per_device_eval_batch_size": 512,
    "gradient_accumulation_steps": 4,
    "gradient_checkpointing": False,
    "learning_rate": 2e-5,
    "evaluation_strategy": "steps",
    "eval_steps": 1,
    "report_to": "none",  # "wandb": "null" is not a TrainingArguments field
}
training_args = TrainingArguments(**args)

# ds_tok_train / ds_tok_val are our tokenized train and validation datasets.
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=ds_tok_train,
    eval_dataset=ds_tok_val,
    compute_metrics=compute_metrics,
)
trainer.train()
```
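The snippet above assumes a `compute_metrics` callable is defined elsewhere; a minimal accuracy-only version (plain NumPy, no extra dependencies) might look like this:

```python
import numpy as np

def compute_metrics(eval_pred):
    """Minimal accuracy metric for the Hugging Face Trainer.

    `eval_pred.predictions` holds per-class logits of shape (n, num_labels);
    `eval_pred.label_ids` holds the gold labels of shape (n,).
    """
    logits, labels = eval_pred.predictions, eval_pred.label_ids
    preds = np.argmax(logits, axis=-1)  # predicted class per example
    return {"accuracy": float((preds == labels).mean())}
```

Our actual runs report the same metrics as the paper (accuracy on GenomicBenchmarks), but this is the shape of the helper the Trainer call expects.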
