Single-model worker configs should mean a less aggressive memory cleanup scheme #140

Open
tazlin opened this issue Feb 25, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@tazlin
Member

tazlin commented Feb 25, 2024

The primary intent behind leaving a certain amount of system RAM free is to provide a cushion for loading other, potentially very large models (such as SDXL models). However, when the worker is configured to run only a single model, the memory conditions become much more predictable, and the worker will fail anyway if an OOM occurs.

  • If the worker has only one model
    • If the model has only a single model file
      • Keep the model entirely in VRAM 100% of the time
    • If the model has multiple model files (as is the case with Stable Cascade)
      • Avoid offloading to disk if possible, swapping the model files only between RAM and VRAM.

If failures occur in this situation, it's likely the model overhead would only be encouraging the worker to run in very poor memory conditions (as it would constantly be loading from disk for little to no reason).
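
As a rough illustration of the decision tree proposed above, here is a minimal Python sketch. It is not taken from the worker's codebase: `MemoryPolicy`, `choose_memory_policy`, and the `model_file_counts` mapping are hypothetical names, and the file counts in the usage example are made-up values for illustration only.

```python
from enum import Enum


class MemoryPolicy(Enum):
    """Hypothetical policy names; the real worker may model this differently."""

    DEFAULT = "default"              # keep the existing free-RAM cushion and cleanup
    PIN_IN_VRAM = "pin_in_vram"      # keep the model in VRAM at all times
    SWAP_RAM_VRAM = "swap_ram_vram"  # swap between RAM and VRAM, avoid disk


def choose_memory_policy(
    models_to_load: list[str],
    model_file_counts: dict[str, int],
) -> MemoryPolicy:
    """Pick a cleanup policy from the worker's configured model list.

    `model_file_counts` maps a model name to the number of model files it
    consists of (assumed to be known from some model reference). A model
    missing from the mapping is treated as a single-file model.
    """
    # More than one configured model (or none): keep today's behaviour.
    if len(models_to_load) != 1:
        return MemoryPolicy.DEFAULT

    file_count = model_file_counts.get(models_to_load[0], 1)

    # One model made of a single file: never unload it from VRAM.
    if file_count <= 1:
        return MemoryPolicy.PIN_IN_VRAM

    # One model made of several files: swap only between RAM and VRAM.
    return MemoryPolicy.SWAP_RAM_VRAM


# Illustrative file counts only; these are not real reference values.
example_counts = {"Stable Cascade": 2, "SomeSingleFileModel": 1}

print(choose_memory_policy(["SomeSingleFileModel"], example_counts))
print(choose_memory_policy(["Stable Cascade"], example_counts))
print(choose_memory_policy(["Stable Cascade", "SomeSingleFileModel"], example_counts))
```

Returning a policy value rather than acting on memory directly would keep the decision separate from whatever code actually moves models between disk, RAM, and VRAM.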

@tazlin tazlin added the enhancement New feature or request label Feb 25, 2024
@db0 db0 transferred this issue from Haidra-Org/horde-worker-reGen Feb 28, 2024
@db0 db0 transferred this issue from Haidra-Org/AI-Horde-image-model-reference Feb 28, 2024