
[WIP] Add support for LLM models with custom code like microsoft/phi-1_5 and tiiuae/falcon-7b #3632

Draft · wants to merge 1 commit into base: master
Conversation

arnavgarg1 (Contributor)

Very messy code, still a WIP; I'm just trying to understand how the code flows and some functionality-related things. I will clean this up significantly before actually creating a PR.

@arnavgarg1 arnavgarg1 mentioned this pull request Sep 18, 2023
Comment on lines -57 to +59
AutoConfig.from_pretrained(model_name)
# TODO(Arnav): Why is this called twice, and why is trust_remote_code false the second time?
# breakpoint()
AutoConfig.from_pretrained(pretrained_model_name_or_path=model_name, trust_remote_code=True)
Contributor Author

See if there's a sensible, clear call to action (CTA) we can surface to users when they try to use models that require trust_remote_code to be set, along with how to set it as part of their Ludwig config.
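One hypothetical shape for such a CTA — a minimal sketch, not Ludwig's actual API: the model list, error wording, and the `model_parameters` config key below are all illustrative assumptions. The idea is that the loader detects the missing flag and tells the user exactly what to add to their config:

```python
def load_config(model_name: str, trust_remote_code: bool = False) -> dict:
    """Stand-in for AutoConfig.from_pretrained: models that ship custom
    modeling code refuse to load unless trust_remote_code=True."""
    # Illustrative only; in reality the need for the flag surfaces as an
    # error from the Hugging Face hub, not a hardcoded list.
    CUSTOM_CODE_MODELS = {"microsoft/phi-1_5", "tiiuae/falcon-7b"}
    if model_name in CUSTOM_CODE_MODELS and not trust_remote_code:
        raise ValueError(
            f"{model_name} requires `trust_remote_code=True`. In your Ludwig "
            "config, set `model_parameters: {trust_remote_code: true}` "
            "(hypothetical key) and retry."
        )
    return {"model_name": model_name, "trust_remote_code": trust_remote_code}
```

Catching the underlying error and re-raising it with config-level instructions like this would avoid users having to dig through the transformers stack trace.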

Contributor Author

TODO: Somewhere in here, figure out how to also add clean support for loading the model when using ECD with one of these models as a text encoder. That will also require this flag to be piped through and used at initialization time (via the AutoTransformer encoder).
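A rough sketch of what that plumbing could look like — the class and helper below are hypothetical stand-ins, not Ludwig's actual encoder code; the point is just that the flag rides along from the encoder config entry down to the `from_pretrained` kwargs assembled at init time:

```python
class AutoTransformerEncoder:
    """Sketch (hypothetical): thread trust_remote_code from the Ludwig
    encoder config down to the HF from_pretrained call at init time."""

    def __init__(self, pretrained_model_name_or_path: str,
                 trust_remote_code: bool = False):
        # In the real encoder these kwargs would be handed to
        # AutoModel.from_pretrained(...); here we just record them.
        self.load_kwargs = {
            "pretrained_model_name_or_path": pretrained_model_name_or_path,
            "trust_remote_code": trust_remote_code,
        }


def build_text_encoder(encoder_config: dict) -> AutoTransformerEncoder:
    # Hypothetical config plumbing: the flag defaults to False and is only
    # forwarded when the user sets it on the encoder entry.
    return AutoTransformerEncoder(
        pretrained_model_name_or_path=encoder_config["pretrained_model_name_or_path"],
        trust_remote_code=encoder_config.get("trust_remote_code", False),
    )
```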


It's hard to tell exactly when it happens, but I think it may have something to do with the interaction between retrying and the fact that kwargs are popped out rather often:

[screenshot]
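A minimal repro of the suspected interaction — the function names and retry loop here are illustrative, not Ludwig's actual helpers: when the same kwargs dict object is forwarded on every retry attempt and the callee pops keys out of it, the second attempt silently loses the flag.

```python
attempts = []

def download_model(model_name, kwargs):
    # pop() mutates the caller's dict: after the first attempt the flag is gone.
    trust = kwargs.pop("trust_remote_code", False)
    attempts.append(trust)
    if len(attempts) == 1:
        raise ConnectionError("transient failure")  # triggers a retry
    return {"model_name": model_name, "trust_remote_code": trust}

def with_retries(fn, model_name, kwargs, max_attempts=3):
    # Naive retry loop that forwards the SAME dict object every time.
    for _ in range(max_attempts):
        try:
            return fn(model_name, kwargs)
        except ConnectionError:
            continue

result = with_retries(download_model, "tiiuae/falcon-7b", {"trust_remote_code": True})
# The first attempt saw True; the retry saw False: attempts == [True, False]
```

If this is indeed the cause, copying before each attempt (`fn(model_name, dict(kwargs))`) or popping once up front and passing values explicitly would make retries idempotent.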

@github-actions

Unit Test Results

    4 files  ±0       4 suites  ±0       1h 13m 55s ⏱️ +43m 57s
2 795 tests  +2 764   2 777 ✔️ +2 751    9 💤 +4     9 ❌ +9
5 592 runs   +5 530   5 554 ✔️ +5 502   20 💤 +10   18 ❌ +18

For more details on these failures, see this check.

Results for commit d6b26e2. ± Comparison against base commit 42723e3.

This pull request removes 12 and adds 2776 tests. Note that renamed tests count towards both.
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[adult_census_income.ecd.yaml]
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[adult_census_income.gbm.yaml]
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[ames_housing.ecd.yaml]
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[ames_housing.gbm.yaml]
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[mercedes_benz_greener.ecd.yaml]
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[mercedes_benz_greener.gbm.yaml]
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[sarcos.ecd.yaml]
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[sarcos.gbm.yaml]
tests.regression_tests.model.test_old_models ‑ test_model_loaded_from_old_config_prediction_works
tests.regression_tests.model.test_old_models ‑ test_predict_deprecated_model[respiratory]
…
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_image_augmentation[augmentation_pipeline_ops0]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_image_augmentation[augmentation_pipeline_ops1]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_image_augmentation[augmentation_pipeline_ops2]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_invalid_augmentation_parameters[None]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_invalid_augmentation_parameters[augmentation_pipeline_ops1]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_invalid_augmentation_parameters[augmentation_pipeline_ops2]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_invalid_augmentation_parameters[augmentation_pipeline_ops4]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_invalid_augmentation_parameters[random_horizontal_flip]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_load_model_with_augmentation_pipeline
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_local_model_training_with_augmentation_pipeline[preprocessing0-encoder0-False]
…
This pull request removes 3 skipped tests and adds 7 skipped tests. Note that renamed tests count towards both.
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[ames_housing.ecd.yaml]
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[mercedes_benz_greener.ecd.yaml]
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[sarcos.ecd.yaml]
tests.ludwig.automl.test_base_config
tests.ludwig.automl.test_utils
tests.ludwig.backend.test_ray
tests.ludwig.data.test_ray_data
tests.ludwig.utils.test_fs_utils ‑ test_get_fs_and_path_invalid_windows
tests.ludwig.utils.test_hyperopt_ray_utils ‑ test_grid_strategy[test_1]
tests.ludwig.utils.test_hyperopt_ray_utils ‑ test_grid_strategy[test_2]
