Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow configuring timeout for external sources #1812

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

pouyan021
Copy link

@pouyan021 pouyan021 commented Apr 18, 2024

This pull request closes #1624. It adds and enforces the ability to set a new property abort-after to external sources. As discussed in the issue, it supports both a global prop and a maven property that overrides the global if it is set.

spiffcs
spiffcs previously approved these changes Apr 22, 2024
README.md Show resolved Hide resolved
@spiffcs
Copy link
Contributor

spiffcs commented Apr 22, 2024

@pouyan021 Approved and running checks now - Thank you so much for the contribution!

@spiffcs
Copy link
Contributor

spiffcs commented Apr 22, 2024

@pouyan021 looks like there is a small change needed where go mod tidy is run

@pouyan021 pouyan021 force-pushed the feat/allow-configuring-timeout-for-external-sources branch 2 times, most recently from 5cc6f4b to e5ed88d Compare April 22, 2024 17:46
@pouyan021
Copy link
Author

pouyan021 commented Apr 22, 2024

@spiffcs Appreciate your support, thanks a lot! The problem with go.mod is addressed but I forgot to sign-off my commit! I fixed that and pushed again. Could you kindly approve the checks once more?

@pouyan021 pouyan021 requested a review from spiffcs April 25, 2024 06:19
Copy link
Contributor

@spiffcs spiffcs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything else LGTM - Just waiting for @wagoodman and his final say on the config direction he wants to go here

base-url: https://search.maven.org/solrsearch/select
abort-after: 5m #override the global config
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @wagoodman - I know he's pretty sensitive to duplicate fields that override each other so I'd like him to chime in on where he sees this going or what his preference would be

@spiffcs spiffcs self-requested a review April 25, 2024 18:34
@spiffcs spiffcs dismissed their stale review April 25, 2024 18:35

stale review and waiting for IC input

@wagoodman wagoodman changed the title Feat/allow configuring timeout for external sources Allow configuring timeout for external sources May 16, 2024
Copy link
Contributor

@wagoodman wagoodman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall the functionality looks good, but have some comments on testing and configuration. I'll push up some changes shortly to help out.

I do think that RequestTimeout is a better name for this config item -- what do you think @pouyan021 ?

BaseURL string `yaml:"base-url" json:"baseUrl" mapstructure:"base-url"`
SearchUpstreamBySha1 bool `yaml:"search-upstream" json:"searchUpstreamBySha1" mapstructure:"search-maven-upstream"`
BaseURL string `yaml:"base-url" json:"baseUrl" mapstructure:"base-url"`
AbortAfter *string `yaml:"abort-after" json:"abortAfter" mapstructure:"abort-after"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using time.Duration natively here should work, and also allows for invalid values to be a concern of application configuration processing, which is already done and is much earlier in execution, thus will error earlier upon a misconfiguration:

GRYPE_EXTERNAL_SOURCES_ABORT_AFTER=blurb grype alpine:latest
invalid application config: 1 error(s) decoding:

* error decoding 'external-sources.abort-after': time: invalid duration "blurb"
exit status 1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, no action needed, I made the adjustment to use the time.Duration in the application configuration.

return &m.pkg, nil
func (m mockMavenSearcher) GetMavenPackageBySha(ctx context.Context, sha1 string) (*pkg.Package, error) {
deadline, ok := ctx.Deadline()
fmt.Println("GetMavenPackageBySha called with deadline:", deadline, "deadline set:", ok)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test should not have a print, but if output is helpful this should use t.Log instead

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, no action needed, I made the mock to take testing.TB for logging.

fmt.Println("GetMavenPackageBySha called with deadline:", deadline, "deadline set:", ok)
// Sleep for a duration longer than the context's deadline
select {
case <-time.After(10 * time.Second):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will inadvertently cause existing tests (TestMatcherJava_matchUpstreamMavenPackage) to take 10 seconds. This mock could probably be refactored to account for this, where there is a configurable work duration and error behavior.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, no action needed, I made the adjustment to change the mock to not require sleeping in problematic cases.

@@ -278,9 +278,11 @@ feature is currently disabled by default. To enable this feature add the followi
```yaml
external-sources:
enable: true
abort-after: 10m
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can clarify what this means by changing the name some. This could be interpreted as either:
a. aborting looking up from external sources in general after the duration elapses
b. aborting a single request to an external source after the duration elapses

From the functionality implemented b is implied.

Regarding naming and the above context, request-timeout feels like a more descriptive name.

Comment on lines 24 to 30
func defaultExternalSources() externalSources {
return externalSources{
AbortAfter: defaultAbortAfter,
Maven: maven{
SearchUpstreamBySha1: true,
BaseURL: defaultMavenBaseURL,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want the following quality to multi-teir config items like this:

  • a user specifying top-level fields overrides UNspecified nested fields
  • a user specifying top-level fields does NOT override specified nested fields
  • a user specifying nested fields does NOT top-level fields

Where:

  • specified: a user explicitly provided a value
  • unspecified: a user did NOT explicitly provide a value. This might mean no value should be set or a default value should be set instead.

The implementation does follow this, however, the rendered configuration unintentionally conveys that there is no maven abort-after value:

# running: grype -vv
...
  external-sources:
      abort-after: 10m
      maven:
          abort-after: null

Here from a user perspective it appears as if an override has been set at the top level, but has not been inherited. This is because we're conflating inheriting default values vs overriding nested values.

What we really want is to convey that the user did not specify overriding values in top-level fields, and that the individual sources have sane defaults:

# running: grype -vv
...
  external-sources:
      abort-after: null
      maven:
          abort-after: 10m0s

This has the added benefit that different sources can have different default values (since the default value for nested properties is not inheriting a top-level property).

We've dealt with multi-level properties in syft, see the pretty fields for an example: https://github.com/anchore/syft/blob/v1.4.1/cmd/syft/internal/options/format.go . We end up making both fields optional to show what is specified vs unspecified, then reason about the final values in a PostLoad function, which is invoked while rendering the configuration.

Specifying an override then appears in both places correctly:

# run: GRYPE_EXTERNAL_SOURCES_ABORT_AFTER=40m grype -vv 
  external-sources:
      abort-after: 40m0s
      maven:
          abort-after: 40m0s

Instead of no value being set:

# run: GRYPE_EXTERNAL_SOURCES_ABORT_AFTER=40m grype -vv 
  external-sources:
      abort-after: 40m
      maven:
          abort-after: null

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, no action needed, I made the adjustment to use the PostLoad in a follow up commit.

@wagoodman wagoodman force-pushed the feat/allow-configuring-timeout-for-external-sources branch from ca5d50c to 0bf64dd Compare May 16, 2024 15:29
@wagoodman
Copy link
Contributor

note: I force pushed to get this branch rebased onto the latest commit on main

@pouyan021
Copy link
Author

pouyan021 commented May 17, 2024

Overall the functionality looks good, but have some comments on testing and configuration. I'll push up some changes shortly to help out.

I do think that RequestTimeout is a better name for this config item -- what do you think @pouyan021 ?

Hey @wagoodman thanks a lot for the thorough review and your additional changes. The name was chosen as abort-after based on the suggestion by one of the contributors here. I agree with you that RequestTimeout is more self-explanatory.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow configurting timeout for external-sources
3 participants