Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(retry) add optional randomness min/max #50

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

pieper
Copy link
Member

@pieper pieper commented Mar 20, 2021

To help when accessing heavily loaded servers. Google's dicom stores for example seem very prone to overloading, in which case they return a 429 code.

This exposes a few more flags from the core library. It would be nicer to somehow expose the native interface to avoid this kind of API duplication.

Even better might be a way to support different mechanisms. The retry package for instance supports logging, which would be helpful when picking parameters

To help when accessing heavily loaded servers
@ntenenz
Copy link
Contributor

ntenenz commented Mar 21, 2021

A few thoughts:

  • One option is to enforce the retry policy via proxy (e.g., a sidecar like envoy) rather than in code. This tends to make the enforcement of consistent policy easier across applications, frameworks, and languages.
  • If the preference is to enforce the retry policy in code, one could simply pass the created retrying.retry object (or any decorator really) to augment the functions which invoke external endpoints. Would also likely facilitate testing as it could be used for mocks.

@pieper
Copy link
Member Author

pieper commented Mar 21, 2021

Yes, I think handling network/server failures should be outside the core dicomweb logic. Being able to pass in a retry decorator instead of a retrying one would have been nice for the logging feature. Otherwise it's just guesswork.

Using something like envoy could be a good solution.

@hackermd
Copy link
Collaborator

@pieper I generally don't have any major concerns to expose additional retry parameters. The only thing we should keep in mind that we may want to change the implementation at some point and use another library under the hood for implementing the retry logic.

@ntenenz
Copy link
Contributor

ntenenz commented Mar 30, 2021

@hackermd, thoughts on allowing the user to pass an optional sequence of decorators which would wrap all of the http calls? would decouple the retry logic, streamline the addition of new functionality, and could be done in a backwards-compatible fashion.

Note: Depending on the interface, this could possibly be used to layer on custom authn as well.

Copy link
Collaborator

@hackermd hackermd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @pieper. Some minor comments related to naming, type annotations and docstrings.

@@ -141,6 +141,8 @@ def set_http_retry_params(
retry: bool = True,
max_attempts: int = 5,
wait_exponential_multiplier: int = 1000,
wait_random_min: int = None,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
wait_random_min: int = None,
wait_random_min: Optional[int] = None,

@@ -141,6 +141,8 @@ def set_http_retry_params(
retry: bool = True,
max_attempts: int = 5,
wait_exponential_multiplier: int = 1000,
wait_random_min: int = None,
wait_random_max: int = None,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
wait_random_max: int = None,
wait_random_max: Optional[int] = None,

@@ -164,6 +167,10 @@ def set_http_retry_params(
the maximum number of request attempts.
wait_exponential_multiplier: float, optional
exponential multiplier applied to delay between attempts in ms.
wait_random_min: float, optional
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
wait_random_min: float, optional
wait_random_min: int, optional

@@ -164,6 +167,10 @@ def set_http_retry_params(
the maximum number of request attempts.
wait_exponential_multiplier: float, optional
exponential multiplier applied to delay between attempts in ms.
wait_random_min: float, optional
minimum amount of jitter in ms.
wait_random_max: float, optional
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
wait_random_max: float, optional
wait_random_max: int, optional

@@ -141,6 +141,8 @@ def set_http_retry_params(
retry: bool = True,
max_attempts: int = 5,
wait_exponential_multiplier: int = 1000,
wait_random_min: int = None,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove the random from the parameter names and just use wait_min and wait_max rather than wait_random_min and wait_random_max, respectively?

@pieper
Copy link
Member Author

pieper commented Mar 30, 2021

@hackermd I like the suggestion from @ntenenz that the retry logic should be outside of the library itself, or to avoid breaking changes we keep your current minimal implementation without exposing new options (i.e. drop this PR). For my use case I would have wanted to use the logging feature of retrying anyway and maybe would have wanted other custom debugging access.

@ntenenz
Copy link
Contributor

ntenenz commented Mar 30, 2021

FWIW @pieper, an easy way to ensure backwards-compatibility would be to:

  1. retain set_retry_params with its current signature
  2. allow the user to supply decorators, either upon creation or via a method, and store them in the DICOMwebClient object as a list
  3. if set_retry_params is invoked, add the current retry decorator to the list
  4. [optional] deprecate the use of set_retry_params

@hackermd
Copy link
Collaborator

@hackermd, thoughts on allowing the user to pass an optional sequence of decorators which would wrap all of the http calls? would decouple the retry logic, streamline the addition of new functionality, and could be done in a backwards-compatible fashion.

Note: Depending on the interface, this could possibly be used to layer on custom authn as well.

Let me think about this. It seems like an elegant and more general approach.

Another possibility for how this could be achieved without making any changes to the library code is by passing a session object with a different implementation of the http methods (potentially decorated). However, it is more hacky and makes assumptions about the inner workings of the class.

@hackermd
Copy link
Collaborator

For my use case I would have wanted to use the logging feature of retrying anyway and maybe would have wanted other custom debugging access.

@pieper did you try the callback mechanism that we are providing based on requests event hooks? Would would provide you access to the response objects for debugging purposes.

@hackermd hackermd added the enhancement New feature or request label Mar 30, 2021
@pieper
Copy link
Member Author

pieper commented Mar 30, 2021

No, I didn't see the event hooks. Not sure that would have been exactly on point for my needs without more work though since I wanted to just know what requests were being retried and why, retry seemed like it would have been a one-liner.

@hackermd
Copy link
Collaborator

No, I didn't see the event hooks. Not sure that would have been exactly on point for my needs without more work though since I wanted to just know what requests were being retried and why, retry seemed like it would have been a one-liner.

If it's primarily about getting logging output, what about passing the existing logging object to the retry call? Getting the retry warning messages shouldn't do any harm and would probably be appreciated by a lot of users.

In the meantime, we can further explore @ntenenz's proposal.

@hackermd
Copy link
Collaborator

@ntenenz would you kindly further elaborate on what you are proposing. Are you suggesting to add a parameter to the constructor to pass a higher-order function that can wrap request HTTP method calls? So the caller would for example pass an "instance" of retrying.retry. Am I understanding this correctly?

@ntenenz
Copy link
Contributor

ntenenz commented Apr 19, 2021

Currently, DICOMwebClient has three methods named:

  • _invoke_get_request
  • _invoke_post_request
  • _invoke_delete_request

I'm proposing that these methods retain their current signatures and that the user be allowed to supply, via __init__, a list of decorators to wrap these methods. I'd argue separate parameters for each of the above methods is preferable as it would allow for more fine-grained control. Given they're composable, decorators would be required to return methods of the same signature.

@hackermd
Copy link
Collaborator

hackermd commented Apr 20, 2021

@ntenenz Thanks for the elaboration.

I'd argue separate parameters for each of the above methods is preferable as it would allow for more fine-grained control. Given they're composable, decorators would be required to return methods of the same signature.

I think that the decorator approach is elegant and would give users a lot of control. However, it would also expose low level implementation details and would be difficult to document (just try to document the signature of the callable with type annotations).

Are there any other use cases beyond retrying where it would be useful to pass custom decorators for these functions?

@pieper
Copy link
Member Author

pieper commented Apr 20, 2021

Are there any other use cases beyond retrying where it would be useful to pass custom decorators for these functions?

Logging and retries are the two obvious ones. Related to logging could be auditing and gathering performance or usage statistics. Perhaps also injecting headers or otherwise manipulating the request or response could be useful.

@ntenenz
Copy link
Contributor

ntenenz commented Apr 20, 2021

As @pieper stated, header manipulation (auth, tracing, etc), logging, retry ... the list goes on.

Regarding the exposing of internal details, I'd argue url, body, and headers are fairly generic and could serve as an API between the client and user without binding yourself to a particular implementation.

@hackermd
Copy link
Collaborator

header manipulation (auth, tracing, etc), logging, retry ... the list goes on.

Couldn't most of the features in this list already be supported by implementing a class derived from requests.session.Session, overriding the get or post methods, and passing an instance of this custom session class to the constructor via the session parameter? This is how we currently handle advanced auth logic (e.g., OAuth 2.0 authorization flows).

I don't want to get into a discussion of object-orientated versus functional programming paradigms, but am wondering whether it will ultimately be beneficial to support several mechanisms to manipulate requests/responses that could potentially interfere with each other.

I am not categorically against the decorator approach, but still have doubts whether we would really need it to achieve @pieper's goal. I like the fact that the retry logic is currently abstracted from the user, which seems to cover the needs of most users.

If we decide to allow users to pass decorators, there are two things we would need to sort out:

  1. How would the decorator(s) be passed via the constructor? Something like this request_wrapper: Callable[[Callable[..., requests.models.Response], requests.models.Response]?
  2. How would the current retry logic be preserved without breaking the API? What would we do with set_retry_params()?

@ntenenz
Copy link
Contributor

ntenenz commented Apr 23, 2021

Honestly, to me this is not at all about functional vs object-oriented programming, but rather, providing flexibility to the user without the need to create derived classes as well as consolidation of logic. For the instances you just described, the user would need to

  • create functionality x
  • identify where and how to patch
  • create a derived type which achieves this end

This means the user must be aware of (and is bound to) the underlying implementation details. If you ever wanted to move from requests to some other library, user code would break. Additionally, logic for this is distributed between the Session, other API calls, etc.

Alternatively, if you take the decorator approach, you're extending the API surface and providing a contract to the user. They need not know/care HOW the request is issued, only that your API is remaining static. Furthermore, request modification logic is consolidated in a single location, providing a single source of truth.


Imagine a situation where I wanted to employ custom retry logic, a different auth scheme, add some headers, do some custom logging, deploy distributed tracing, etc -- all customizable via a config file. I could:

  • implement a bunch of derived classes and select the correct one based on the config. with combinations, this growth exponentially in the number of config options.
  • create a few decorators and inject/loop through them at run time. this would grow linearly in the number of options.

Simply put, composition >>> inheritance here. This could be achieve by inserting callable objects or decorators, but the result is largely identical.


To answer your questions at the bottom,

  1. The decorators would have a signature roughly of the form below. The actual request logic, at the innermost point of this recursion, would break the pattern implementing the request functionality.
CallableType = Callable[[URLType, BodyType, HeaderType], ResponseType]
def my_decorator(f: CallableType) -> CallableType:
    def func(url: URLType, body: BodyType, headers: HeaderType) -> ResponseType:
        # do something
        out = f(url, body, headers)
        # do something else
        return out
    return func 
  1. The current retry logic sets a few parameters. These parameters could/would be used to construct a retry decorator on behalf of the user and inject it as the outermost decorator.
  2. Not asked, but this would also allow you to short-circuit the actual request itself, eliminating the need to monkey-patch requests for testing.

@hackermd
Copy link
Collaborator

This means the user must be aware of (and is bound to) the underlying implementation details. If you ever wanted to move from requests to some other library, user code would break. Additionally, logic for this is distributed between the Session, other API calls, etc.
Alternatively, if you take the decorator approach, you're extending the API surface and providing a contract to the user. They need not know/care HOW the request is issued, only that your API is remaining static.

The decorator approach still relies on the requests API. The decorate will have to rely on the wrapped function to return an instance of requests.models.Response.

If the underlying implementation changes, e.g., if we would move away from requests, the decorator would have the same problem.

@hackermd
Copy link
Collaborator

hackermd commented Apr 23, 2021

Honestly, to me this is not at all about functional vs object-oriented programming, but rather, providing flexibility to the user without the need to create derived classes as well as consolidation of logic. For the instances you just described, the user would need to
create functionality x
identify where and how to patch
create a derived type which achieves this end

How would that really be different with a decorator? The user would have to implement a decorator that wraps requests.session.Session.request instead of defining a type derived from (or with the same interface as) requests.session.Session.

I don't want to argue about the pros and cons of either approach (inheritance versus composition), but discuss whether we should support both. I agree with you that the latter would have advantages. However, we already allow the user to pass an instance of requests.session.Session or a derived class. The question at hand is whether we should in addition allow the user to provide a decorator to manipulate requests? I think this could lead to problems if both approaches are not designed to work together. For example, both may manipulate the headers and interfere with each other.

@ntenenz
Copy link
Contributor

ntenenz commented Apr 23, 2021

How would that really be different with a decorator? The user would have to implement a decorator that wraps requests.session.Session.request instead of defining a type derived from (or with the same interface as) requests.session.Session.

If we were to decorate the function call as defined in my sample above, decorators could be composed arbitrarily by placing them in a list and combining them at runtime. If we were to create a derived class for each piece of additional functionality, functionality could not be combined unless a) the combination was predefined (less flexible) or b) the classes were created dynamically at runtime (more complex).

If the underlying implementation changes, e.g., if we would move away from requests, the decorator would have the same problem.

Indeed. But you'd have already provided an abstraction over the request itself. Placing an abstraction over the response would presumably be far easier, especially given that httpx has a Response object with a highly similar interface. Regardless, I don't view this as the primary advantage of a decorator approach.

At the end of the day, you're the owner/maintainer of this library. The decision is fundamentally up to you. I'm just trying to provide feedback.

@hackermd
Copy link
Collaborator

If we were to decorate the function call as defined in my sample above, decorators could be composed arbitrarily by placing them in a list and combining them at runtime.

Agreed, that would be nice (similar to the torchvision.transforms.functional API).

I would suggest implementing the approach and play with it. That will probably help us with the remaining API design questions.

@pieper do you want to give it a try for your use case?

@pieper
Copy link
Member Author

pieper commented Apr 29, 2021

Yes, now that we have a good plan I can try that out next time I do a dive into that use case. Now that I know what the core issue was (hitting the 5000 requests/minute quota) I don't need the logging function as much as I did before. But a logging decorator or other ways of manipulating the requests would have made debugging much simpler so it's still worth adding.

@hackermd
Copy link
Collaborator

hackermd commented May 7, 2021

@pieper did you try the callback approach? It would not yet solve the problem of setting the min/max retry values, but it should allow you to inject log messages and thereby partially solve (or at least allow you to debug) your issue.

Moving forward, I think we could introduce request_hooks and response_hooks parameters and retire the callbacks parameter (which would then be covered by response_hooks).
I am wondering whether the request hook mechanism should be implemented in the requests library rather than in dicomweb-client. The requests library currently supports response hooks (see advanced usage section in the docs) and we could propose adding support for request hooks, too.

@ntenenz
Copy link
Contributor

ntenenz commented May 7, 2021

I believe there are two axes worth distinguishing and worthy of discussion here.

  • Request vs Response: As you alluded to, one could create a set of hooks which are invoked before the request is issued and another set that are invoked after the response is received. Alternatively, one could merge these into a single hook which receives the "next function" in the chain. The former would be akin to a list of callable objects, whereas the latter would be similar to a list of decorators.
  • Universal vs Per Request-Type Hooks: Some hooks may only make sense on GETs whereas others may only be relevant to POSTs. Distinguishing by request type provides more granular control at the cost of a more complex interface.

Personally, I lean (slightly) towards the decorator approach for the former, but see the merits of both. For the latter, we may be able to achieve both by accepting either a list or a dict whereas a list would apply all actions to all requests and the dict would map request type -> action.

Thoughts?

@pieper
Copy link
Member Author

pieper commented May 7, 2021

@pieper did you try the callback approach? It would not yet solve the problem of setting the min/max retry values, but it should allow you to inject log messages and thereby partially solve (or at least allow you to debug) your issue.

I did not - now that I know what the issue is I no longer need to log/debug. But I can foresee similar issues coming up in the future, so it'll be good to have a plan.

Regarding the axes that @ntenenz raises, I think it's better to keep the logic in the hook/decorator and not in the dicomweb-client or requests library. That is, I may want to only log certain request types but only when they have a particular url parameter or all request types that hit a particular studyUID. So I think it's enough to have a list of hooks or cascade of decorators.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants