FastAPI deployed with hypercorn in GCP Cloud Run returning 503 sporadically #221

Open
bgregoinductiva opened this issue May 6, 2024 · 6 comments


@bgregoinductiva

I have a FastAPI project deployed in Cloud Run using the hypercorn server. I'm using uvloop as the event loop and leaving all other settings at their default values:

hypercorn app.main:app --bind 0.0.0.0:80 --worker-class uvloop

Here are the Cloud Run configurations:

  • Memory: 1 GiB
  • CPU: 1
  • Maximum concurrent requests per instance: 80
  • CPU is only allocated during request processing
  • Minimum number of instances: 1
  • Maximum number of instances: 30
  • Startup CPU boost
  • Use HTTP/2 end-to-end

When integration testing produces a peak of about 30 concurrent requests, I usually get a 503, and then a new instance is started.
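For anyone trying to reproduce the peak, here is a minimal sketch that fires a burst of concurrent requests and tallies the status codes. It uses only the standard library; the names `fetch_status` and `burst` are hypothetical helpers, not part of hypercorn or Cloud Run, and the client side speaks HTTP/1.1, so it may not exercise exactly the same path as the integration tests.

```python
import urllib.error
import urllib.request
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def fetch_status(url: str, timeout: float = 30.0) -> int:
    """Return the HTTP status code of a GET request, including 4xx/5xx."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # urllib raises on 4xx/5xx responses; the code is what we want to count.
        return exc.code


def burst(url: str, n: int = 30, fetch=fetch_status) -> Counter:
    """Send n requests at once (one thread each) and count status codes."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        return Counter(pool.map(fetch, [url] * n))
```

Calling `burst(service_url, n=30)` against the deployed service (with `service_url` set to your own endpoint) should show how many of the 30 responses were 503s. Connection-level failures such as timeouts are deliberately not caught, so they surface as exceptions rather than being counted.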

Has anyone faced a similar problem before?

Thanks in advance.

@nabheet

nabheet commented May 13, 2024

Yes. Based on what I have learnt so far, your instance was terminated because it used more memory than its defined limit.
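One rough way to check this theory from inside the app is to log the process's peak resident memory against the configured limit. This is a sketch only: `peak_rss_mib` and `memory_headroom` are hypothetical helper names, the 1024 MiB default mirrors the 1 GiB limit from the original post, and the number covers only this Python process, not everything Cloud Run counts toward the container's memory.

```python
import resource
import sys


def peak_rss_mib() -> float:
    """Peak resident set size of the current process, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in KiB on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def memory_headroom(limit_mib: float = 1024.0) -> float:
    """Fraction of the limit still unused at the process's peak (assumed limit)."""
    return 1.0 - peak_rss_mib() / limit_mib
```

Logging `memory_headroom()` periodically, or in a debug endpoint, gives a quick indication of whether the process itself is getting close to the limit before an instance is terminated.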

Even though their documentation says that Cloud Run will return a 500, in my testing I was able to show that it actually returns a 503. Their documentation leaves a lot to be desired.

Hope this helps.

@nielsbox

nielsbox commented May 21, 2024

We have the same issue, but at only 40% memory usage at the 99th percentile.

@nielsbox

Update: we isolated the issue to only HTTP/2. HTTP/1 seems to be fine.

@pgjones
Owner

pgjones commented May 26, 2024

> Update: we isolated the issue to only HTTP/2. HTTP/1 seems to be fine.

Is the HTTP/1 traffic encrypted? There seems to be an asyncio memory leak with SSL

@nielsbox

> > Update: we isolated the issue to only HTTP/2. HTTP/1 seems to be fine.
>
> Is the HTTP/1 traffic encrypted? There seems to be an asyncio memory leak with SSL

Cloud Run terminates TLS, so the traffic that reaches the container is not encrypted.
https://cloud.google.com/run/docs/container-contract#tls
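Because TLS ends before the container, my understanding is that with "Use HTTP/2 end-to-end" enabled the server sees cleartext HTTP/2 (h2c) rather than TLS-negotiated HTTP/2. As an illustration of what that looks like on the wire, here is a small sketch that recognises the fixed HTTP/2 client connection preface. The function name `looks_like_h2c` is hypothetical and is not part of hypercorn.

```python
# The 24-byte client connection preface defined by the HTTP/2 specification.
H2_PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"


def looks_like_h2c(first_bytes: bytes) -> bool:
    """True if a cleartext connection starts with the HTTP/2 preface."""
    return first_bytes.startswith(H2_PREFACE)
```

An HTTP/1 request starts with a request line such as `GET / HTTP/1.1`, so a check like this can tell the two code paths apart when debugging which protocol actually reaches the container.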

@nabheet

nabheet commented May 27, 2024

Also, I hate to admit this in public, but I wasn't closing SQL connections in the health check endpoint, so it was leaking file descriptors. This caused our Cloud Run containers to crash without any log events, and the Cloud Run LB returned a 503.
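For illustration, here is a minimal sketch of a health check that always releases its connection. It uses the standard library's `sqlite3` purely as a stand-in for whatever SQL driver is in use, and `health_check` is a hypothetical function name. Note that `with sqlite3.connect(...)` on its own only manages the transaction and does not close the connection, which is why `contextlib.closing` is used here.

```python
import sqlite3
from contextlib import closing


def health_check(dsn: str = ":memory:") -> bool:
    """Run a trivial query and always close the connection afterwards."""
    # closing() guarantees conn.close() is called, even if the query raises.
    with closing(sqlite3.connect(dsn)) as conn:
        with closing(conn.cursor()) as cur:
            cur.execute("SELECT 1")
            return cur.fetchone() == (1,)
```

The same pattern applies to other drivers: whatever opens the connection in the endpoint should also be responsible for closing it, or for returning it to a pool.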

So another thing to check would be your file descriptor count.
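A quick way to watch the descriptor count from inside the process is sketched below. It assumes a Linux-style `/proc/self/fd` (falling back to `/dev/fd`), and `open_fd_count` and `fd_usage` are hypothetical helper names. The count includes the descriptor briefly used to list the directory, so treat it as a trend rather than an exact figure.

```python
import os
import resource


def open_fd_count() -> int:
    """Number of file descriptors currently open in this process."""
    for fd_dir in ("/proc/self/fd", "/dev/fd"):
        try:
            return len(os.listdir(fd_dir))
        except OSError:
            continue
    raise RuntimeError("no file descriptor directory found on this platform")


def fd_usage() -> tuple[int, int]:
    """Return (open descriptors, soft limit) for the current process."""
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    return open_fd_count(), soft_limit
```

If the first number climbs steadily across health checks and approaches the second, something is leaking descriptors.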


4 participants