enable_sequential_cpu_offload HuggingFace Diffusers error with sd2 example on T4 GPU #2

BEpresent opened this issue Apr 13, 2023 · 2 comments
BEpresent commented Apr 13, 2023

Hi, I was following this example: https://modelserving.com/blog/creating-stable-diffusion-20-service-with-bentoml-and-diffusers

or, equivalently, a git clone of this example repo: https://github.com/bentoml/diffusers-examples/tree/main/sd2

Both result in a simple service.py file like this:

import torch
from diffusers import StableDiffusionPipeline

import bentoml
from bentoml.io import Image, JSON, Multipart

# Load the saved diffusers model and wrap it in a runner
bento_model = bentoml.diffusers.get("sd2:latest")
stable_diffusion_runner = bento_model.to_runner()

svc = bentoml.Service("stable_diffusion_v2", runners=[stable_diffusion_runner])

@svc.api(input=JSON(), output=Image())
def txt2img(input_data):
    # Forward the JSON payload (prompt, guidance_scale, ...) as pipeline kwargs
    images, _ = stable_diffusion_runner.run(**input_data)
    return images[0]
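
Once serving, the endpoint is meant to be called roughly like this (a sketch, assuming BentoML's default port 3000 and a JSON payload carrying the pipeline's keyword arguments such as prompt):

import requests

# Hypothetical client call; host, port, and payload keys are assumptions
response = requests.post(
    "http://localhost:3000/txt2img",
    json={"prompt": "a photo of an astronaut riding a horse"},
)
with open("output.png", "wb") as f:
    f.write(response.content)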

After `bentoml serve service:svc --production` I get the following error (it also happens with another custom model I tried). It seems to be related to `enable_sequential_cpu_offload` from HuggingFace diffusers.

[ERROR] [runner:sd2:1] Traceback (most recent call last):
  File "/home/be/miniconda3/envs/diffusers310/lib/python3.10/site-packages/starlette/routing.py", line 671, in lifespan
    async with self.lifespan_context(app):
  File "/home/be/miniconda3/envs/diffusers310/lib/python3.10/site-packages/starlette/routing.py", line 566, in __aenter__
    await self._router.startup()
  File "/home/be/miniconda3/envs/diffusers310/lib/python3.10/site-packages/starlette/routing.py", line 650, in startup
    handler()
  File "/home/be/miniconda3/envs/diffusers310/lib/python3.10/site-packages/bentoml/_internal/runner/runner.py", line 303, in init_local
    raise e
  File "/home/be/miniconda3/envs/diffusers310/lib/python3.10/site-packages/bentoml/_internal/runner/runner.py", line 293, in init_local
    self._set_handle(LocalRunnerRef)
  File "/home/be/miniconda3/envs/diffusers310/lib/python3.10/site-packages/bentoml/_internal/runner/runner.py", line 139, in _set_handle
    runner_handle = handle_class(self, *args, **kwargs)
  File "/home/be/miniconda3/envs/diffusers310/lib/python3.10/site-packages/bentoml/_internal/runner/runner_handle/local.py", line 24, in __init__
    self._runnable = runner.runnable_class(**runner.runnable_init_params)  # type: ignore
  File "/home/be/miniconda3/envs/diffusers310/lib/python3.10/site-packages/bentoml/_internal/frameworks/diffusers.py", line 443, in __init__
    self.pipeline: diffusers.DiffusionPipeline = load_model(
  File "/home/be/miniconda3/envs/diffusers310/lib/python3.10/site-packages/bentoml/_internal/frameworks/diffusers.py", line 182, in load_model
    pipeline = pipeline.to(device_id)
  File "/home/be/miniconda3/envs/diffusers310/lib/python3.10/site-packages/diffusers/pipelines/pipeline_utils.py", line 639, in to
    raise ValueError(
ValueError: It seems like you have activated sequential model offloading by calling `enable_sequential_cpu_offload`, but are now attempting to move the pipeline to GPU. This is not compatible with offloading. Please, move your pipeline `.to('cpu')` or consider removing the move altogether if you use sequential offloading.
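
For reference, the same incompatibility can be reproduced in plain diffusers, outside BentoML (a sketch, assuming the stabilityai/stable-diffusion-2 weights):

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2", torch_dtype=torch.float16
)
pipe.enable_sequential_cpu_offload()  # installs CPU<->GPU offload hooks
pipe.to("cuda")  # raises the same ValueError: incompatible with offloading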

As general info, it runs on a GCP VM instance with a T4 GPU - could this be the issue?

BEpresent (Author) commented Apr 13, 2023

Update: this is also happening on a 3090 GPU.

2023-04-13T20:05:33+0000 [ERROR] [runner:sd2:1] Application startup failed. Exiting.
/usr/local/lib/python3.10/dist-packages/transformers/models/clip/feature_extraction_clip.py:28: FutureWarning: The class CLIPFeatureExtractor is deprecated and will be removed in version 5 of Transformers. Please use CLIPImageProcessor instead.
  warnings.warn(
2023-04-13T20:05:41+0000 [ERROR] [runner:sd2:1] An exception occurred while instantiating runner 'sd2', see details below:
2023-04-13T20:05:41+0000 [ERROR] [runner:sd2:1] Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/bentoml/_internal/runner/runner.py", line 293, in init_local
    self._set_handle(LocalRunnerRef)
  File "/usr/local/lib/python3.10/dist-packages/bentoml/_internal/runner/runner.py", line 139, in _set_handle
    runner_handle = handle_class(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/bentoml/_internal/runner/runner_handle/local.py", line 24, in __init__
    self._runnable = runner.runnable_class(**runner.runnable_init_params)  # type: ignore
  File "/usr/local/lib/python3.10/dist-packages/bentoml/_internal/frameworks/diffusers.py", line 443, in __init__
    self.pipeline: diffusers.DiffusionPipeline = load_model(
  File "/usr/local/lib/python3.10/dist-packages/bentoml/_internal/frameworks/diffusers.py", line 182, in load_model
    pipeline = pipeline.to(device_id)
  File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/pipeline_utils.py", line 626, in to
    raise ValueError(
ValueError: It seems like you have activated sequential model offloading by calling `enable_sequential_cpu_offload`, but are now attempting to move the pipeline to GPU. This is not compatible with offloading. Please, move your pipeline `.to('cpu')` or consider removing the move altogether if you use sequential offloading.

larme self-assigned this Apr 18, 2023

larme (Member) commented Apr 18, 2023

Hi @BEpresent,
I think a diffusers update is breaking bentoml.diffusers. We are going to fix this. You can pin diffusers==0.13.1 as a temporary fix.
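
For example (a sketch; pick whichever matches your setup):

pip install "diffusers==0.13.1"

or, if you build your bento from a bentofile.yaml, pin it there:

python:
  packages:
    - diffusers==0.13.1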
