
Different execution times for python spin app on local machine and Fermyon Cloud #2482

Open
abdulmonum opened this issue May 1, 2024 · 6 comments


@abdulmonum

Hello,

I am testing a Python-based Spin serverless app for execution performance (https://github.com/python/pyperformance/blob/main/pyperformance/data-files/benchmarks/bm_float/run_benchmark.py), with the source edited to fit the Spin application format. I cannot understand why there is more than a 2x difference in execution time: timing the application locally with time curl http://127.0.0.1:3000/float gives around 0.84 s, compared to around 0.41 s when deployed on Fermyon Cloud (including network latency). Moreover, if I drive the HTTP trigger from a Python program that simulates a Poisson workload (~2 req/s on average), many requests take as long as 1.5 s. I do not see why there should be a 2x execution-time difference for a local setup (64 GB RAM, 8-core VM) compared to the cloud platform.

The spin.toml file is the following:

```toml
spin_manifest_version = 2

[application]
authors = ["Abdul Monum [email protected]"]
description = "float serverless function adapted from pyperformance"
name = "float"
version = "0.1.0"

[[trigger.http]]
route = "/float"
component = "float"

[component.float]
source = "app.wasm"

[component.float.build]
command = "componentize-py -w spin-http componentize app -o app.wasm"
watch = ["*.py", "requirements.txt"]
```

  • Spin version (spin --version): spin 2.4.2 (340378e 2024-04-03)
  • Wasmtime version: wasmtime-cli 20.0.0 (9e1084ffa 2024-04-22)
  • Installed plugin versions: cloud 0.8.0, js2wasm 0.6.1, py2wasm 0.3.2
@lann
Collaborator

lann commented May 1, 2024

Hello, thanks for the report. I am not aware of any reason for a local execution to be that much slower than Fermyon Cloud. Would it be possible for you to publish the code required to reproduce?

@abdulmonum
Author

You can find the code to reproduce in this repository:
https://github.com/abdulmonum/spin-python-app.git

@lann
Collaborator

lann commented May 2, 2024

For comparison, on my Linux AMD 5900X desktop, time curl http://127.0.0.1:3000/float takes ~0.22s. Could you give more information about your local environment?

@abdulmonum
Author

abdulmonum commented May 9, 2024

Hello, I changed my local environment, and for simple runs time curl http://127.0.0.1:3000/float takes ~0.35 s, which makes sense. However, if I run a Poisson workload of ~2 req/s, many requests take around 0.7 s. And if I run bombardier (https://github.com/codesenberg/bombardier) with ./bombardier http://127.0.0.1:3000/float, I get the following output:

Bombarding http://127.0.0.1:3000/float for 10s using 125 connection(s)
[======================================================================================================================================================================================================] 10s
Done!
Statistics        Avg      Stdev        Max
  Reqs/sec         2.19      19.23     250.16
  Latency         8.78s      2.99s     10.01s
  HTTP codes:
    1xx - 0, 2xx - 26, 3xx - 0, 4xx - 0, 5xx - 0
    others - 120
  Errors:
       timeout - 120
  Throughput:     3.61KB/s

Why is the Spin app in the local environment not able to handle many requests at the same time? If I understand correctly, for every HTTP request sent, a new WebAssembly instance is spawned, serves the request, and is torn down. In theory, that should mean consistent response times, at least when the arrival rate is as low as 2 req/s. I do observe that on Fermyon Cloud, where I get an average response time of 0.41 s (including network latency), but shouldn't the response times be consistent in the local environment as well? Is there some sort of queuing of requests? This does not seem to me like a Wasm issue.

My current environment:
Intel E3-1230 v3 @ 3.30GHz
16GB RAM
Ubuntu 22.04
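For reference, the Poisson workload mentioned above can be sketched as follows. This is not the reporter's actual script; the URL and the urllib-based request are assumptions for illustration. Inter-arrival times in a Poisson process are exponentially distributed, so the gaps are drawn with random.expovariate:

```python
# Hypothetical sketch of a Poisson request generator (not the reporter's
# actual script). Gaps between requests are exponential with mean 1/rate.
import random
import time
import urllib.request


def interarrival_gaps(n: int, rate: float, seed: int = 42) -> list[float]:
    """Draw n exponential inter-arrival gaps for a Poisson process at `rate` req/s."""
    rng = random.Random(seed)
    return [rng.expovariate(rate) for _ in range(n)]


def run_workload(url: str, n: int, rate: float) -> list[float]:
    """Send n GET requests with Poisson spacing; return per-request latencies.

    Note: this sends requests sequentially (closed-loop); a fully open-loop
    generator would issue each request concurrently regardless of whether
    the previous one has finished.
    """
    latencies = []
    for gap in interarrival_gaps(n, rate):
        time.sleep(gap)  # wait out the exponential gap
        start = time.perf_counter()
        urllib.request.urlopen(url).read()
        latencies.append(time.perf_counter() - start)
    return latencies
```

Calling run_workload("http://127.0.0.1:3000/float", n=100, rate=2.0) against the local endpoint would approximate the ~2 req/s experiment described above.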

@abdulmonum
Author

@lann Any explanation for this?

@lann
Collaborator

lann commented May 20, 2024

Sorry, missed your previous update.

Is there some sort of queuing of requests? This does not seem to me like a Wasm issue.

Yes, there is implicit queuing of async tasks in the Tokio multi-threaded runtime.

Any explanation for this?

I ran a few tests at different concurrency levels (bombardier -c N ...):

  • -c 1: ~200ms avg, ~2ms SD
  • -c 10: ~260ms avg, ~38ms SD
  • -c 100: ~3000ms avg, ~2000ms SD

My host has 24 cores, though bombardier itself causes some extra contention when testing entirely locally. This roughly makes sense to me for CPU-bound workloads: as request concurrency reaches multiples of the number of cores, you would expect average latency to scale similarly.

The CPU you mention appears to have 8 hardware threads, so at a concurrency of 125 you would expect roughly 125 / 8 ≈ 15.6 requests per thread, and 15.6 × 350 ms ≈ 5.5 s, which seems reasonably close to your 8.8 s average once various sources of overhead are accounted for.
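The back-of-envelope estimate above can be written out explicitly. With C in-flight requests on N hardware threads and an uncontended per-request service time S, each request effectively shares the CPUs with about C/N others, so expected latency is roughly (C / N) × S (the function name and exact form here are an illustration, not a precise queueing model):

```python
# Rough saturation estimate for a CPU-bound service: latency scales with
# the number of in-flight requests per hardware thread.
def estimated_latency_s(concurrency: int, hw_threads: int, service_time_s: float) -> float:
    """Expected latency ~ (concurrency / hw_threads) * service_time."""
    return (concurrency / hw_threads) * service_time_s


# Numbers from the comment above: 125 connections, 8 hardware threads,
# ~350 ms per request when uncontended.
print(f"{estimated_latency_s(125, 8, 0.35):.2f}s")  # ~5.47s
```

This ignores scheduling overhead, the load generator's own CPU use, and instance startup costs, which is consistent with the measured 8.8 s being somewhat higher than the estimate.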
