[training examples] reduce complexity by running final validations before export #7959
I also think that code seems unnecessarily complicated. The scripts should be set up so that the pipeline gets constructed first, and then components get pulled out as necessary depending on what needs to be finetuned, e.g. the unet. Right now it looks like the opposite: various components get instantiated depending on what the training script needs, and then a pipeline gets created each time validation happens. If I can add some complementary thoughts: in my own versions of those training scripts, I actually construct the pipeline first and pull the components I need out of it.
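A minimal sketch of that pipeline-first setup (the model id and dtype here are placeholders, not what the scripts actually use):

```python
import torch
from diffusers import StableDiffusionXLPipeline

# Build the full pipeline once...
pipeline = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)

# ...then pull out the components the training loop needs.
unet = pipeline.unet                      # the module being finetuned
vae = pipeline.vae
text_encoder = pipeline.text_encoder      # SDXL has two text encoders
text_encoder_2 = pipeline.text_encoder_2
tokenizer = pipeline.tokenizer
tokenizer_2 = pipeline.tokenizer_2
```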
Thanks for initiating the discussion! @christopher-beckham, thanks for your thoughts, too. It's all very reasonable, especially the point about the scheduler. However, we don't use the pipeline's inference scheduler for training; the SDXL script trains with `DDPMScheduler`.

Running validation inference is conditional. So how would constructing a pipeline no matter what be sensible here? The pipeline gets created from already-instantiated components, so I don't fully understand the negative implications.

Finally, as pointed out, I find it better to be explicit about the final validation: it mimics the situation where a user loads the trained model in a fresh environment. Even if it takes more time, I think it's better to be explicit in this case.
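To illustrate that "fresh environment" situation, a minimal sketch for the ControlNet case (the paths and base model id are placeholders):

```python
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# What an end user does after training: everything comes from the exported
# weights on disk, nothing from the training process's memory.
controlnet = ControlNetModel.from_pretrained("path/to/output_dir")
pipeline = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet
)

control_image = load_image("path/to/conditioning_image.png")
image = pipeline(
    "a validation prompt", image=control_image, num_inference_steps=25
).images[0]
```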
Hi @sayakpaul, thanks for your input.

Good point; this is something I missed. I think in some old code of mine, originally taken from this repo, the training scheduler used was Euler (it was based on a ControlNet script). But to keep the discussion clean I'll just refer to the current version of the SDXL training script, which does indeed use `DDPMScheduler` for training.
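To make the distinction concrete, a small sketch (the model id is a placeholder): the training script adds noise with a `DDPMScheduler` loaded from the base model's scheduler config, while the pipeline used for validation keeps its own default inference scheduler.

```python
from diffusers import DDPMScheduler, StableDiffusionXLPipeline

base = "stabilityai/stable-diffusion-xl-base-1.0"

# The scheduler used to add noise during training...
noise_scheduler = DDPMScheduler.from_pretrained(base, subfolder="scheduler")

# ...is not the scheduler the pipeline uses for inference
# (SDXL ships with EulerDiscreteScheduler by default).
pipeline = StableDiffusionXLPipeline.from_pretrained(base)
print(type(pipeline.scheduler).__name__)
```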
In my own refactoring of the script I was trying to clean this up. Every individual component (the two text encoders, the tokenizers, etc.) is instantiated separately, which just seems overly verbose to me. If each component had to come from a different source that verbosity might be justified, but here it seems cleaner to construct the pipeline first and then pull things out as needed for training, e.g. the unet. If one went with that design decision, then it would also be convenient to pass the pipeline straight into `log_validation` (sketched below).

Yes, I agree. I completely forgot it's cheap to do this given the components are already instantiated.
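A hypothetical signature for that design; none of this is the current API of the scripts, just the shape being discussed:

```python
import torch

def log_validation(pipeline, args, accelerator, step, is_final_validation=False):
    # Hypothetical: the caller hands in an already-constructed pipeline
    # instead of the function assembling one from loose components.
    pipeline.set_progress_bar_config(disable=True)
    with torch.no_grad():
        images = [
            pipeline(prompt, num_inference_steps=25).images[0]
            for prompt in args.validation_prompts
        ]
    # Logging the images to trackers (tensorboard/wandb) would go here.
    return images
```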
I was recently thinking about Sayak's suggestion that the training examples are too long, and went through them looking for redundant/unnecessary code sections that could be reduced or eliminated for readability.
The main thing that stands out is how the validations occur during the trainer unwind stage.
During training, we have access to the unet and the other components, and we pass `is_final_validation=False` to the `log_validation` method. That method behaves differently across the training examples; in the ControlNet example, the final-validation path ends up re-importing the ControlNet model and building a pipeline from `args.output_dir`. This seems to happen only because, at the end of training, the method is called once more after everything has been unloaded. Roughly:
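(A sketch of that branch, with names taken from the ControlNet training script; reconstructed, not verbatim.)

```python
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

if not is_final_validation:
    # During training, reuse the in-memory model.
    controlnet = accelerator.unwrap_model(controlnet)
else:
    # After training, the in-memory copies have been freed, so the model is
    # re-imported from the weights just exported to disk.
    controlnet = ControlNetModel.from_pretrained(args.output_dir, torch_dtype=weight_dtype)

pipeline = StableDiffusionControlNetPipeline.from_pretrained(
    args.pretrained_model_name_or_path, controlnet=controlnet, torch_dtype=weight_dtype
)
```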
This final pass has several downsides:

- even though the components are already in memory, we `del` them before loading the new ones from disk;
- with `max_train_steps = 1000` and `validation_steps = 100` (or some other value that goes evenly into `max_train_steps`), we run two validations back to back: one just before exiting the training loop, and then this final one.

If we just remove the final inference code, the earlier condition can be updated to run the validation before exiting the loop, which would solve these issues. A rough sketch of the change (reconstructed; the exact call in each script may differ):
From:
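```python
# Current condition inside the training loop (sketch):
if args.validation_prompt is not None and global_step % args.validation_steps == 0:
    image_logs = log_validation(
        vae, text_encoder, tokenizer, unet, controlnet,
        args, accelerator, weight_dtype, global_step,
    )
```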
To:
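```python
# Also fire on the last step, so no separate post-export validation is needed:
if args.validation_prompt is not None and (
    global_step % args.validation_steps == 0 or global_step >= args.max_train_steps
):
    image_logs = log_validation(
        vae, text_encoder, tokenizer, unet, controlnet,
        args, accelerator, weight_dtype, global_step,
    )
```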