Fix/publish concurrency #1271

Merged
neolynx merged 11 commits into master from fix/publish-concurrency
Jun 15, 2024

neolynx
Member

@neolynx neolynx commented Apr 17, 2024

Replaces #1261
Fixes #1276, #1125

Description of the Change

PR Updated: synchronous and asynchronous requests are queued until the resources become available.

This MR addresses a concurrency issue with the api/publish endpoint, where concurrent PUTs typically fail. The MR in itself is not pretty, so consider this initial state of the MR a starting point for discussion.
The commits are intentionally separated to make it as easy as possible to observe the failing test (and to test it against other, likely better, code changes).

@neolynx neolynx self-assigned this Apr 18, 2024
@neolynx
Member Author

neolynx commented Apr 20, 2024

The test t12_api:TaskAPITestParallelTasks fails with this implementation... does it need to be updated?

2024-04-20T21:03:23.8650330Z [GIN] 2024/04/20 - 21:03:23 | 202 | 12.838215884s |       127.0.0.1 | PUT      "/api/mirrors/x41qcXfiIf7ho72?_async=True"
2024-04-20T21:03:23.8662294Z Traceback (most recent call last):
2024-04-20T21:03:23.8675218Z   File "/home/runner/work/aptly/aptly/system/run.py", line 102, in run
2024-04-20T21:03:23.8676107Z     t.test()
2024-04-20T21:03:23.8676889Z   File "/home/runner/work/aptly/aptly/system/lib.py", line 178, in test
2024-04-20T21:03:23.8677717Z     self.check()
2024-04-20T21:03:23.8678774Z   File "/home/runner/work/aptly/aptly/system/t12_api/tasks.py", line 81, in check
2024-04-20T21:03:23.8680027Z     mirror_task_id, mirror_name = self._create_mirror(mirror_dist)
2024-04-20T21:03:23.8680826Z                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-04-20T21:03:23.8681926Z   File "/home/runner/work/aptly/aptly/system/t12_api/tasks.py", line 25, in _create_mirror
2024-04-20T21:03:23.8683079Z     self.check_equal(resp2.status_code, 409)
2024-04-20T21:03:23.8684153Z   File "/home/runner/work/aptly/aptly/system/lib.py", line 418, in check_equal
2024-04-20T21:03:23.8684851Z     self.verify_match(a, b, match_prepare=pprint.pformat)
2024-04-20T21:03:23.8685526Z   File "/home/runner/work/aptly/aptly/system/lib.py", line 469, in verify_match
2024-04-20T21:03:23.8686425Z     raise Exception("content doesn't match:\n" + diff + "\n")
2024-04-20T21:03:23.8686894Z Exception: content doesn't match:
2024-04-20T21:03:23.8687342Z --- 
2024-04-20T21:03:23.8687629Z +++ 
2024-04-20T21:03:23.8688024Z @@ -1 +1 @@
2024-04-20T21:03:23.8688566Z -202
2024-04-20T21:03:23.8688908Z +409

Create Mirror in parallel seems now to return 202, but 409 was expected.
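
One plausible reading of the changed status code, sketched in Go under stated assumptions: if a task whose resources are busy used to be rejected and is now queued, the second request is accepted (202) instead of refused (409). The `resourceSet` type, the `statusFor` function, and the `mirror-a` key are hypothetical names for illustration only, not aptly's actual code.

```go
package main

import (
	"fmt"
	"net/http"
)

// resourceSet records which named resources are currently held by a task.
type resourceSet map[string]bool

// conflicts reports whether any wanted resource is already in use.
func (r resourceSet) conflicts(wanted []string) bool {
	for _, name := range wanted {
		if r[name] {
			return true
		}
	}
	return false
}

// statusFor returns the HTTP status an async request would receive.
// Without queueing, a resource conflict is refused with 409 Conflict.
// With queueing, the task is accepted (202) and waits for its turn.
func statusFor(inUse resourceSet, wanted []string, queueing bool) int {
	if inUse.conflicts(wanted) && !queueing {
		return http.StatusConflict
	}
	return http.StatusAccepted
}

func main() {
	inUse := resourceSet{"mirror-a": true} // hypothetical resource key

	fmt.Println(statusFor(inUse, []string{"mirror-a"}, false)) // 409
	fmt.Println(statusFor(inUse, []string{"mirror-a"}, true))  // 202
}
```

Under that reading, the test's expectation of 409 encodes the old reject-on-conflict policy rather than a bug in the new code.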

@neolynx neolynx added the fix tests Tests are failing label Apr 20, 2024
@ramonnr

ramonnr commented Apr 24, 2024

Sorry for the tardy response, I'll look over what happened after the rebase :)

@neolynx neolynx force-pushed the fix/publish-concurrency branch 2 times, most recently from fb753df to 348ea3e Compare April 24, 2024 15:42
@neolynx neolynx force-pushed the fix/publish-concurrency branch 2 times, most recently from 2f20f12 to 2eedb67 Compare June 6, 2024 17:49

codecov bot commented Jun 8, 2024

Codecov Report

Attention: Patch coverage is 91.66667% with 7 lines in your changes missing coverage. Please review.

Project coverage is 74.63%. Comparing base (787f954) to head (fdce655).

| Files | Patch % | Lines |
|---|---|---|
| api/api.go | 55.55% | 3 Missing and 1 partial ⚠️ |
| task/list.go | 95.23% | 2 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1271      +/-   ##
==========================================
- Coverage   74.86%   74.63%   -0.24%     
==========================================
  Files         144      144              
  Lines       16261    16299      +38     
==========================================
- Hits        12174    12164      -10     
- Misses       3149     3188      +39     
- Partials      938      947       +9     


@neolynx
Member Author

neolynx commented Jun 8, 2024

Hi !

I implemented a queue; async tasks are now executed sequentially, in order, as soon as the required resources are available.

Could you give this a try?

@neolynx neolynx force-pushed the fix/publish-concurrency branch 2 times, most recently from 471872d to 629957a Compare June 8, 2024 08:52
@neolynx neolynx added needs review Ready for review & merge and removed fix tests Tests are failing needs implementation WIP labels Jun 8, 2024
@neolynx
Member Author

neolynx commented Jun 8, 2024

Aptly is now also using the queue for synchronous tasks. This should fix the concurrency problem completely.

Could you test and confirm ? Thanks !
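
For readers who want a picture of the idea, here is a minimal Go sketch of a resource-aware queue: a task starts once every resource it needs is free, and tasks that compete for the same resource run in submission order. This is an illustration under my own assumptions, not the code in task/list.go; `Queue`, `Submit`, and the `publish/testing` key are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// task is one unit of queued work together with the resources it needs.
type task struct {
	resources []string
	run       func()
	done      chan struct{}
}

// Queue starts tasks, oldest first, as soon as their resources are free.
type Queue struct {
	mu      sync.Mutex
	inUse   map[string]bool
	pending []*task
}

func NewQueue() *Queue {
	return &Queue{inUse: make(map[string]bool)}
}

// Submit enqueues run and returns a channel that is closed when it finishes.
func (q *Queue) Submit(resources []string, run func()) <-chan struct{} {
	t := &task{resources: resources, run: run, done: make(chan struct{})}
	q.mu.Lock()
	defer q.mu.Unlock()
	q.pending = append(q.pending, t)
	q.dispatchLocked()
	return t.done
}

// dispatchLocked starts every pending task whose resources are all free,
// scanning oldest first. The caller must hold q.mu.
func (q *Queue) dispatchLocked() {
	waiting := q.pending[:0]
	for _, t := range q.pending {
		if !q.freeLocked(t.resources) {
			waiting = append(waiting, t)
			continue
		}
		for _, r := range t.resources {
			q.inUse[r] = true
		}
		go q.execute(t)
	}
	q.pending = waiting
}

// freeLocked reports whether none of the resources is currently held.
func (q *Queue) freeLocked(resources []string) bool {
	for _, r := range resources {
		if q.inUse[r] {
			return false
		}
	}
	return true
}

// execute runs t, releases its resources, and starts any task that was
// waiting for them.
func (q *Queue) execute(t *task) {
	t.run()
	q.mu.Lock()
	for _, r := range t.resources {
		delete(q.inUse, r)
	}
	q.dispatchLocked()
	q.mu.Unlock()
	close(t.done)
}

func main() {
	q := NewQueue()
	var order []int
	var finished []<-chan struct{}
	for i := 0; i < 3; i++ {
		i := i
		// All three tasks need the same resource, so they cannot overlap.
		finished = append(finished, q.Submit([]string{"publish/testing"}, func() {
			order = append(order, i)
		}))
	}
	for _, d := range finished {
		<-d
	}
	fmt.Println(order) // [0 1 2]
}
```

Note that in this sketch a later task can overtake an earlier one when their resources do not overlap; whether the real implementation allows that is not stated in this thread.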

runitonmetal and others added 11 commits June 15, 2024 16:23
This commit introduces a test which runs concurrent publishes (which
could be parallel with multiprocessing, python is fun).
The test purposely fails (at the point in history that this patch is
written) in order to make it as easy as possible to verify later patches,
which hopefully address concurrency problems.

The same behaviour can easily be tested outside of the system tests with
the following (or similar) shell commands:

$ aptly api serve -listen=:8080 -no-lock
$ aptly repo create -distribution=testing local-repo
$ aptly publish repo -architectures=amd64 local-repo
$ apt download aptly
$ aptly repo add local-repo ./aptly*.deb
$ for _ in $(seq 10); do curl -X PUT 127.0.0.1:8080/api/publish//testing & done

In the local testing performed (on a dual core vm) the first 1-4 jobs
would typically succeed and the rest would error out.
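
The same kind of check can be sketched in Go: fire several PUTs at once and tally the status codes that come back. This is only a sketch. `putConcurrently` and `newStubServer` are hypothetical helpers, and the stub server stands in for a running aptly API so the example is self-contained; it always answers 200 and therefore does not reproduce the failure itself.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// put sends one PUT request with an empty body and returns the status code.
func put(url string) (int, error) {
	req, err := http.NewRequest(http.MethodPut, url, nil)
	if err != nil {
		return 0, err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// putConcurrently sends n PUT requests at the same time and returns how many
// responses carried each status code, plus the first transport error, if any.
func putConcurrently(url string, n int) (map[int]int, error) {
	var (
		mu       sync.Mutex
		wg       sync.WaitGroup
		firstErr error
	)
	counts := make(map[int]int)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			status, err := put(url)
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				if firstErr == nil {
					firstErr = err
				}
				return
			}
			counts[status]++
		}()
	}
	wg.Wait()
	return counts, firstErr
}

// newStubServer returns a local test server that answers every request
// with 200 OK. It is a placeholder for a real aptly API server.
func newStubServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
}

func main() {
	srv := newStubServer()
	defer srv.Close()

	// Replace srv.URL with the address of a real server to reproduce the issue.
	counts, err := putConcurrently(srv.URL+"/api/publish//testing", 10)
	fmt.Println(counts, err)
}
```

Against a real server, a mix of success and error codes in the tally would correspond to the behaviour described here.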
This commit blocks concurrent calls to RunTaskInBackground which is
intended to fix the quirky behaviour where concurrent PUT calls to
api/publish/<prefix>/<distribution> would immediately return an error.

The solution proposed in this commit is not elegant and probably has
unintended side-effects. The intention of this commit is to highlight
the area that actually needs to be addressed.
Ideally this patch is amended or dropped entirely in favor of a better
fixup.
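
The blocking approach described in this commit message can be pictured as a single lock around the entry point. The Go sketch below is an illustration under assumptions, not the patch itself: `runSerialized` is a hypothetical stand-in for the guarded call, and `runMany` exists only to exercise it.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// serial guards the critical section so only one caller proceeds at a time.
var serial sync.Mutex

// runSerialized runs work while holding the lock. It returns how many callers
// were inside the critical section when it entered, counting itself, so a
// result of 1 means no other call overlapped with this one.
func runSerialized(active *int32, work func()) int32 {
	serial.Lock()
	defer serial.Unlock()
	n := atomic.AddInt32(active, 1)
	defer atomic.AddInt32(active, -1)
	work()
	return n
}

// runMany calls runSerialized from n goroutines and collects the results.
func runMany(n int) []int32 {
	var active int32
	var wg sync.WaitGroup
	results := make([]int32, n)
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = runSerialized(&active, func() {})
		}(i)
	}
	wg.Wait()
	return results
}

func main() {
	// Every entry is 1 because the mutex prevents overlapping calls.
	fmt.Println(runMany(10))
}
```

A global lock like this serializes every caller, including ones that touch unrelated resources, which is one way to read the "unintended side-effects" mentioned in the message.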
@neolynx neolynx merged commit f0bf519 into master Jun 15, 2024
9 checks passed
@neolynx neolynx deleted the fix/publish-concurrency branch June 15, 2024 17:18
Labels
needs review Ready for review & merge
Successfully merging this pull request may close these issues.

Error #01: Needed resources are used by other tasks.
4 participants