[WIP] Add support for LLM models with custom code like microsoft/phi-1_5
and tiiuae/falcon-7b
#3632
base: master
AutoConfig.from_pretrained(model_name)
# TODO(Arnav): Why is this called twice, and why is trust_remote_code false the second time?
# breakpoint()
AutoConfig.from_pretrained(pretrained_model_name_or_path=model_name, trust_remote_code=True)
Check whether there is a sensible, clear call to action (CTA) we can raise to users who try to use models that require trust_remote_code to be set, including how to set it as part of their Ludwig config.
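One way such a CTA could look, as a rough sketch rather than what this PR implements: wrap the config load and turn the underlying error into a message that tells the user what to change. The helper name `load_config_with_hint`, the `RemoteCodeRequiredError` class, and the wording of the hint (including where the key lives in a Ludwig config) are all hypothetical. The sketch also assumes the loader raises a `ValueError` whose text mentions `trust_remote_code`, which may vary across transformers versions. The loader is passed in as a callable so the logic can be exercised without downloading a model.

```python
from typing import Any, Callable


class RemoteCodeRequiredError(ValueError):
    """Hypothetical error: the model needs trust_remote_code=True but it was not set."""


def load_config_with_hint(
    load_fn: Callable[..., Any],
    model_name: str,
    trust_remote_code: bool = False,
    # Placeholder wording; the real key location in a Ludwig config is an assumption.
    config_hint: str = "set `trust_remote_code: true` in your Ludwig config",
) -> Any:
    """Call load_fn and re-raise trust_remote_code failures with an actionable message."""
    try:
        return load_fn(model_name, trust_remote_code=trust_remote_code)
    except ValueError as err:
        # Only rewrite the error when the flag was off and the loader complained about it.
        if trust_remote_code or "trust_remote_code" not in str(err):
            raise
        raise RemoteCodeRequiredError(
            f"Model '{model_name}' ships custom code that must be executed to load it. "
            f"After reviewing that code, {config_hint}."
        ) from err
```

With transformers installed, the call would look like `load_config_with_hint(AutoConfig.from_pretrained, "microsoft/phi-1_5")`.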
TODO: Somewhere in here, figure out how to also cleanly support loading the model when one of these models is used as a text encoder with ECD. That will also require this flag to be piped through and used at initialization time (via the AutoTransformer encoder).
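A minimal sketch of what "piping the flag through" might mean, under the assumption that the encoder reads the flag from its own settings and forwards the same value to every Hugging Face load call. `TextEncoderSettings` and `build_text_encoder` are hypothetical names, not Ludwig's actual encoder API; the loaders are injected so the example runs without transformers.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class TextEncoderSettings:
    """Hypothetical stand-in for an encoder config section."""

    pretrained_model_name_or_path: str
    trust_remote_code: bool = False


def build_text_encoder(
    settings: TextEncoderSettings,
    load_config: Callable[..., Any],
    load_model: Callable[..., Any],
) -> Any:
    """Forward the same trust_remote_code value to both the config and model loaders."""
    shared = {"trust_remote_code": settings.trust_remote_code}
    name = settings.pretrained_model_name_or_path
    config = load_config(name, **shared)
    # Passing the flag to only one of the two calls is what would cause a mismatch.
    return load_model(name, config=config, **shared)
```

With transformers installed, `load_config` and `load_model` could be `AutoConfig.from_pretrained` and `AutoModel.from_pretrained`, both of which accept `trust_remote_code` as a keyword argument.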
Unit Test Results
4 files ±0, 4 suites ±0, 1h 13m 55s ⏱️ (+43m 57s)
For more details on these failures, see this check.
Results for commit d6b26e2. ± Comparison against base commit 42723e3.
This pull request removes 12 and adds 2776 tests. Note that renamed tests count towards both.
This pull request removes 3 skipped tests and adds 7 skipped tests. Note that renamed tests count towards both.
Very messy WIP code: I'm just trying to understand how the code flows and some functionality-related details. I will clean this up significantly before actually creating a PR.