use 'asks' for concurrent requests #44

Open
pieper opened this issue Jan 9, 2021 · 11 comments

@pieper
Member

pieper commented Jan 9, 2021

Google suggests using up to 20 concurrent requests for better overall network performance. But I believe we currently only do one at a time with requests.

It looks like we could switch to asks for concurrent requests.

@hackermd
Collaborator

I am not sure whether this would be possible in a backwards compatible manner. We expose the requests API via the constructor of DICOMwebClient. I just tested whether asks.sessions.Session could serve as a drop-in replacement of requests.Session, but that doesn't work:

from asks.sessions import Session
from dicomweb_client import DICOMwebClient

url = '...'
session = Session()
client = DICOMwebClient(url, session=session)
AttributeError: 'coroutine' object has no attribute 'status_code'
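
The failure above is what one would expect when synchronous code receives the result of an async method: the call returns a coroutine, not a response. A minimal stdlib-only sketch of the same failure mode follows. `FakeAsyncSession` and `FakeResponse` are hypothetical stand-ins, not the real asks classes, and asyncio is used only to keep the example self-contained.

```python
import asyncio


class FakeResponse:
    """Hypothetical stand-in for an HTTP response object."""

    def __init__(self, status_code):
        self.status_code = status_code


class FakeAsyncSession:
    """Hypothetical async session: get() is a coroutine function."""

    async def get(self, url):
        await asyncio.sleep(0)  # pretend to wait on the network
        return FakeResponse(200)


def status_like_sync_code(session, url):
    # A synchronous caller uses the return value of get() directly.
    response = session.get(url)
    try:
        return response.status_code
    except AttributeError as error:
        response.close()  # discard the coroutine to avoid a "never awaited" warning
        return str(error)


def status_when_awaited(session, url):
    # The coroutine has to be driven by an event loop to produce a response.
    return asyncio.run(session.get(url)).status_code
```

Calling `status_like_sync_code` returns the text of the `AttributeError`, while `status_when_awaited` returns the status code, which suggests a drop-in replacement would need some layer that drives the coroutine.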

@hackermd hackermd added the question Further information is requested label Jan 10, 2021
@hackermd hackermd self-assigned this Jan 10, 2021
@hackermd
Collaborator

Maybe we could implement a ConcurrentSession based on asks that implements the same interface as requests.Session.
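
One way such a wrapper might look, as a rough sketch only: a blocking facade that drives an async session to completion so callers still see ordinary response objects. `ConcurrentSession` here is hypothetical, `FakeAsyncSession` stands in for an asks-like session, `get_many` is an invented extra method that is not part of requests, and asyncio replaces trio purely so the example runs with the standard library.

```python
import asyncio


class FakeResponse:
    """Hypothetical response object with a requests-like attribute."""

    def __init__(self, url, status_code=200):
        self.url = url
        self.status_code = status_code


class FakeAsyncSession:
    """Hypothetical async HTTP session (stands in for an asks-like session)."""

    async def get(self, url, **kwargs):
        await asyncio.sleep(0.01)  # simulated network latency
        return FakeResponse(url)


class ConcurrentSession:
    """Hypothetical blocking facade with requests.Session-style methods."""

    def __init__(self, async_session):
        self._session = async_session

    def get(self, url, **kwargs):
        # Blocks until the response is available, like requests.Session.get.
        return asyncio.run(self._session.get(url, **kwargs))

    def get_many(self, urls, **kwargs):
        # Invented batch method: issue all requests concurrently, keep input order.
        async def fetch_all():
            return await asyncio.gather(
                *(self._session.get(url, **kwargs) for url in urls)
            )

        return asyncio.run(fetch_all())
```

A single blocking `get()` gains nothing from the async backend; only batch calls overlap their waits. `asyncio.run` also cannot be called from inside an already running event loop, so this shape would not suit an application that owns its own loop.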

@ntenenz
Contributor

ntenenz commented Jan 10, 2021

Have only used it in passing, but another option to explore may be requests toolbelt.

@hackermd
Collaborator

The requests documentation recommends a couple of other libraries as well: https://requests.readthedocs.io/en/latest/user/advanced/#blocking-or-non-blocking

For example

@pieper
Member Author

pieper commented Jan 10, 2021

Yes, these alternatives could be good too. The reason I suggested asks is that it builds on trio, which opens the possibility of using qtrio to integrate cleanly with the Qt event loop. (Maybe this is possible with the other options too; I haven't looked closely.)

The goal would be to use it with the Slicer DICOMweb Browser. Backwards compatibility would not be an issue, at least for us. It's not clear that qtrio would work out of the box with the PythonQt code used in Slicer, but it should be doable in theory.

I'll note that we've also been toying with the idea of implementing a DICOMweb client library in C++ using Qt and putting it in CTK. This may still be a better idea even if we do find a good way to do concurrency in python. @lassoan @nolden @jcfr

@hackermd
Collaborator

Backwards compatibility would not be an issue, at least for us.

I am not in favor of breaking the API. We made the call to expose requests via the constructor. I was initially not in favor of this approach, but being able to pass an authorized requests.Session object to the constructor has turned out to be really useful.

Instead of changing the implementation of the existing dicomweb.DICOMwebClient class, we could create a dicomweb.ConcurrentDICOMwebClient, which would provide the same (or a very similar) interface, but would be implemented fully async. We could even consider implementing it as a C/C++ extension if that would make sense.
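
A minimal sketch of what a fully async client could look like, under stated assumptions: the class body, the `transport` coroutine, and the `max_concurrency` parameter are all hypothetical, and asyncio is used only because it ships with Python. The default cap of 20 follows the figure mentioned at the top of this thread, and the path follows the standard WADO-RS layout for retrieving an instance.

```python
import asyncio


class ConcurrentDICOMwebClient:
    """Hypothetical async client; not an existing dicomweb-client class."""

    def __init__(self, url, transport, max_concurrency=20):
        self.url = url
        self._transport = transport  # assumed: async callable(url) -> bytes
        self._max_concurrency = max_concurrency
        self._limit = None  # semaphore is created lazily inside the event loop
        self.in_flight = 0
        self.peak_in_flight = 0

    async def _get(self, path):
        if self._limit is None:
            self._limit = asyncio.Semaphore(self._max_concurrency)
        async with self._limit:  # at most max_concurrency requests at once
            self.in_flight += 1
            self.peak_in_flight = max(self.peak_in_flight, self.in_flight)
            try:
                return await self._transport(f"{self.url}/{path}")
            finally:
                self.in_flight -= 1

    async def retrieve_instance(self, study_uid, series_uid, instance_uid):
        path = f"studies/{study_uid}/series/{series_uid}/instances/{instance_uid}"
        return await self._get(path)


async def fake_transport(url):
    """Hypothetical transport that simulates latency and echoes the URL."""
    await asyncio.sleep(0.01)
    return url.encode()


async def retrieve_all(client, uids):
    # Schedule every retrieval at once; the semaphore throttles the fan-out.
    return await asyncio.gather(*(client.retrieve_instance(*u) for u in uids))
```

With this shape the caller keeps familiar method names, but every call has to be awaited, so it is a separate API rather than a drop-in replacement.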

@pieper
Member Author

pieper commented Jan 10, 2021

we could create a dicomweb.ConcurrentDICOMwebClient, which would provide the same (or a very similar) interface, but would be implemented fully async. We

Yes, that's what I meant - a new API would be fine.

@hackermd
Collaborator

Yes, that's what I meant - a new API would be fine.

Sounds great. We can experiment with the different libraries. I defer to your expertise regarding choice of the underlying library to support the Qt use case.

What should the API of the dicomweb.ConcurrentDICOMwebClient look like? I assume that the names of methods and parameters could stay the same. However, what about the return values? What would methods return to the caller? Would they resolve internally or return a "promise"?

@hackermd hackermd added enhancement New feature or request and removed question Further information is requested labels Jan 10, 2021
@ntenenz
Contributor

ntenenz commented Jan 10, 2021

Are you looking to make it async or merely parallelizable? If you're looking to maintain API compatibility between the clients (a nice-to-have, but certainly not mandatory),

  • requests-toolbelt may allow for API compatibility out of the box; however, it leverages threading and is not async
  • Alternatively, one may be able to write an adapter to convert between the session/auth types when using asks/httpx.

@pieper
Member Author

pieper commented Jan 10, 2021

I think async is more fundamental than threading since most of the time will be spent waiting for the network anyway. I haven't worked with any of the native python async code so I'm not sure what's the cleanest. Something like a promise or signal/slot interface could make sense, but whatever it is, it needs to be non-blocking and integrate with the application's event management. I looked at asyncio and it didn't seem convenient to integrate with other event loops.

If I were writing a pure python utility to do the networking I'd probably use select directly. But for integrating with an application that has its own event loop I'd instead want to see the socket file descriptors exposed so that the app can use them with its own wrapper around select (e.g. a QSocketNotifier). Either way, the dicomweb-client library should have methods to operate on the socket whenever it becomes ready, handle the increment of the task that it can perform without blocking, and then just return control to the application. The socket handling methods should be thread safe in case the application wants to use them that way.
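
A rough sketch of that pattern with the standard library: the transfer object exposes its file descriptor, and the application calls it back whenever the descriptor is readable. `PendingTransfer` and `run_app_loop` are hypothetical names; the `selectors` loop stands in for the application's own event loop, and locking for thread safety is not shown.

```python
import selectors
import socket


class PendingTransfer:
    """Hypothetical handle for one in-progress response."""

    def __init__(self, sock, expected_size):
        sock.setblocking(False)
        self._sock = sock
        self._expected_size = expected_size
        self._closed = False
        self.received = bytearray()

    def fileno(self):
        # Exposed so the application can register it with its own notifier.
        return self._sock.fileno()

    @property
    def done(self):
        return self._closed or len(self.received) >= self._expected_size

    def on_readable(self):
        # Do only the work that is possible without blocking, then return.
        try:
            chunk = self._sock.recv(4096)
        except BlockingIOError:
            return
        if not chunk:
            self._closed = True  # peer closed the connection
        self.received.extend(chunk)


def run_app_loop(transfer, timeout=1.0):
    # Stand-in for the application's event loop.
    with selectors.DefaultSelector() as selector:
        selector.register(transfer, selectors.EVENT_READ)
        while not transfer.done:
            if not selector.select(timeout):
                break  # nothing became readable in time
            transfer.on_readable()
    return bytes(transfer.received)
```

In a Qt application the loop would presumably be replaced by a notifier watching `transfer.fileno()` and invoking `on_readable()` from its signal handler.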

@ntenenz
Contributor

ntenenz commented Jan 10, 2021

For IO-bound tasks, threads release the GIL while they wait on I/O, so the network waits of several threads can truly overlap. That being said, threads almost certainly carry a higher overhead than async code.
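
A small timing sketch of that point, with `time.sleep` standing in for a blocking network read (both release the GIL while waiting). The function names are illustrative only.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request(seconds):
    # time.sleep releases the GIL, as a blocking socket read would.
    time.sleep(seconds)
    return seconds


def sequential(n, seconds):
    # One request at a time: total time is roughly n * seconds.
    for _ in range(n):
        fake_request(seconds)


def threaded(n, seconds):
    # One thread per request: the waits overlap.
    with ThreadPoolExecutor(max_workers=n) as pool:
        list(pool.map(fake_request, [seconds] * n))


def timed(run):
    start = time.perf_counter()
    run()
    return time.perf_counter() - start
```

Comparing `timed(lambda: sequential(5, 0.1))` with `timed(lambda: threaded(5, 0.1))` should show the threaded run finishing in a fraction of the sequential time, since the five waits happen side by side.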
