Memory leak #1246

Open · 1 task done
a383615194 opened this issue Mar 19, 2024 · 6 comments
Labels: bug (Something isn't working)

Comments

a383615194 commented Mar 19, 2024

Confirm this is an issue with the Python library and not an underlying OpenAI API

  • This is an issue with the Python library

Describe the bug

I am using the AsyncAzureOpenAI class to instantiate a client and making a streaming call to client.chat.completions.create. Even after calling close() on both the client and the response inside a try-finally block, I still see a memory leak that eventually crashes the server.
I tried the solution outlined in #1181, upgrading the pydantic package to 2.6.3, but that did not resolve the issue.
Using the gc module, I observed that memory usage grows after each call to this service. Our service centrally manages AzureOpenAI accounts, so a client is instantiated for every incoming request. Given the concurrent nature of this service, I'm wondering whether client.with_options supports concurrent usage. Do you have any good solutions to address this memory leak?
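
For reference, a minimal sketch of how this per-call growth can be observed, assuming a plain embeddings loop; the credentials, endpoint, and deployment name below are placeholders:

import asyncio
import gc
import tracemalloc

import openai


async def measure(n: int = 20) -> None:
    tracemalloc.start()
    for i in range(n):
        # A fresh client per call, mirroring the pattern described above.
        client = openai.AsyncAzureOpenAI(
            api_version="2024-02-01",                           # placeholder
            api_key="...",                                      # placeholder
            azure_endpoint="https://example.openai.azure.com",  # placeholder
        )
        try:
            await client.embeddings.create(
                model="my-embedding-deployment",  # placeholder deployment name
                input="hello",
            )
        finally:
            await client.close()
        gc.collect()
        current, peak = tracemalloc.get_traced_memory()
        print(f"call {i}: current={current} bytes, peak={peak} bytes")


asyncio.run(measure())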

To Reproduce

Make several calls in a row, for example to embeddings, wrapped in async code.

Code snippets

import json
from concurrent.futures import ThreadPoolExecutor

import httpx
import openai
import tornado.web


class LlmStreamApiHandler(tornado.web.RequestHandler):
    executor = ThreadPoolExecutor(200)

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.set_header('Content-Type', 'text/event-stream')
        self.set_header('Access-Control-Allow-Origin', "*")
        self.set_header("Access-Control-Allow-Headers", "*")
        self.set_header("Access-Control-Allow-Methods", "*")

    def on_finish(self):
        return super().on_finish()


    async def post(self):
        try:
            result = await self.process(...)
        except Exception as e:
            ...
            return  # result is undefined on failure, so do not fall through

        self.write(json.dumps(result) + "\n")
        await self.flush()


    async def process(self, ...):
        client = openai.AsyncAzureOpenAI(
            api_version=api_version,
            api_key=api_key,
            azure_endpoint=azure_endpoint,
            http_client=httpx.AsyncClient(
                proxies=config.api_proxy,
            ),
            max_retries=0
        )
        response_text = None
        try:
            # prompt is assumed to include stream=True, so this returns an async stream
            response_text = await client.chat.completions.create(**prompt)
            async for chunk in response_text:
                chunk = chunk.model_dump()
                if chunk['choices'] == [] and chunk['id'] == "" and chunk['model'] == "" and chunk['object'] == "":
                    continue
                chunk_message = chunk['choices'][0]['delta']
                current_text = chunk_message.get('content', '')
                if bool(chunk_message) and current_text:
                    ...
                elif chunk['choices'][0]["finish_reason"] == "stop":
                    break

                elif current_text == '' and chunk_message.get('role', '') == "assistant":
                    ...
                elif chunk['choices'][0]["finish_reason"] == "content_filter":
                    ...
                else:
                    continue
                # json_data is built in the elided branches above
                self.write(json.dumps(json_data) + "\n")
                await self.flush()
        except Exception as e:
            ...
            raise ...
        finally:
            if response_text is not None:
                await response_text.close()
            await client.close()
        return ...
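
As an aside, in recent versions of the library both the client and the returned stream are async context managers, so the cleanup above can be written without an explicit finally block. A minimal sketch, using the same variables as the snippet above and assuming prompt includes stream=True:

async with openai.AsyncAzureOpenAI(
    api_version=api_version,
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    max_retries=0,
) as client:
    stream = await client.chat.completions.create(**prompt)
    async with stream:  # closes the response even if iteration raises
        async for chunk in stream:
            ...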

OS

CentOS

Python version

Python 3.8

Library version

openai v1.12.0

a383615194 added the bug label on Mar 19, 2024
antont commented Mar 19, 2024

It's possible to reuse the same client for many requests. I think a single client for the lifetime of the process can work fine; at least that's what I've been doing in production so far (using Python 3.11).

So you could maybe move your client creation to the init? I don't know Tornado, though; I'm using Starlette. A sketch of the idea is below.

Sure, creating and closing clients shouldn't leak either, and bugs in that area have been fixed before. But just a note.
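
A minimal sketch of that idea, assuming (unlike the original setup) a single set of credentials shared by all requests; the config fields and prompt below are placeholders modeled on the snippet above:

import httpx
import openai
import tornado.web

# Created once at process startup and shared by every handler/request.
shared_client = openai.AsyncAzureOpenAI(
    api_version=config.api_version,        # assumed config fields
    api_key=config.api_key,
    azure_endpoint=config.azure_endpoint,
    http_client=httpx.AsyncClient(proxies=config.api_proxy),
    max_retries=0,
)


class LlmStreamApiHandler(tornado.web.RequestHandler):
    async def post(self):
        # Reuse the shared client; do not close it per request.
        stream = await shared_client.chat.completions.create(**prompt)
        ...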

a383615194 (Author) commented
@antont May I ask: in your service, is every call made with the same set of client initialization parameters? In my service, different request sources each specify their own api_version, api_key, and azure_endpoint, so I initialize a new client object for each request.

I've also noticed the client.with_options() method, which can change these parameters dynamically. What I'm uncertain about is: if a single client is shared and called concurrently, could client.with_options() erroneously override the client's parameters?

antont commented Mar 19, 2024

@a383615194

> May I ask: in your service, is every call made with the same set of client initialization parameters?

Spot on, that is the case for us.

> I've also noticed the client.with_options() method, which can change these parameters dynamically. What I'm uncertain about is: if a single client is shared and called concurrently, could client.with_options() erroneously override the client's parameters?

Yep, that's why that method exists. AFAIK it works correctly: you can reuse the same client but pass different API keys, for example, for different requests. I haven't used it myself, only seen it mentioned in previous similar issues here. Some people are wary of it; at least one person here explained that he creates new client objects just to be sure. I would probably read the implementation to check that it looks clear and trustworthy, and then use it. I'd guess there are tests for it too, though strange bug cases can be hard to cover if any were to happen with it. A sketch of the pattern is below.
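
For illustration, a sketch of the per-request override pattern under discussion. with_options() (an alias of copy()) returns a new client object sharing the underlying HTTP connection pool rather than mutating the shared client, which is why concurrent use should not clobber parameters. Exactly which options can be overridden (e.g. whether azure_endpoint is accepted or base_url must be used instead) depends on the installed version, so treat the keyword names below as assumptions to verify:

# Derive a per-request view of the shared client; the shared client
# itself is never mutated, so concurrent requests stay isolated.
scoped_client = shared_client.with_options(
    api_key=request_api_key,          # hypothetical per-request values
    api_version=request_api_version,
)
stream = await scoped_client.chat.completions.create(**prompt)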

rattrayalex (Collaborator) commented

Can you share a repository that demonstrates a minimal reproduction? (The code you shared is a helpful starting point, but something we can download, run, and see the error with would be very helpful.)

I'd also +1 @antont's suggestion to reuse the client.

zhnglicho commented
@a383615194 have you fixed this issue?

zhnglicho commented
I reused the client and the issue is gone. The version I used is 1.23.6.

zhnglicho mentioned this issue on Apr 29, 2024