Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: shared webhooks #110

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

aoldershaw
Copy link

Rendered

Please comment on individual lines, not at the top-level.

Signed-off-by: Aidan Oldershaw <[email protected]>
@taylorsilva taylorsilva self-assigned this Jul 27, 2021
@concourse concourse deleted a comment from EstebanFS Jul 27, 2021
@schmurfy
Copy link

schmurfy commented Aug 7, 2021

Nice to see some movement on this front, I just read it and although I am not familiar enough with the way prototypes works it seems overly complicated. This may not be the type of feedback you wanted but since prototypes are in development since I started using concourse (~ 1.5 years ago) I imagine that following through with this proposal means that the current webhook system is going to be there for quite a while.

I really love concourse and this is by far the best automation system I have used so far but it each features takes years to develop it might become the ultimate automation tool at some point but in the meantime the temporary fixes become permanent and it just generate confusion for newcomers.

I am currently using concourse-webhook-broadcaster because there is not real way to use webhooks otherwise with dynamically generated pipelines (for pull-request, release, deployment).

Copy link

@ari-becker ari-becker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excited to see this - we also hit the 20 webhook limit in GitHub, so this is really necessary.


# Open Questions

### Can we avoid needing to implement `types` within Concourse?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One idea here would be to simply register webhooks on a team- or Concourse-install level, and then have a webhook_ref field under resource to register that resource with the global webhook handler.

Pro: doesn't require waiting for prototypes
Con: not as "clean" as automatic registration with prototypes

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's what I was thinking originally. However, I feel the benefits of automatic registration are super valuable in large-scale deployments with many independent teams, where orchestrating a change to hundreds/thousands of pipelines may not be so easy.

That said, maybe we could support both automatic registration and manual registration via the pipeline config. That way, we can still see some benefits without needing to wait for prototype support

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, even configuring the webhook manually via the pipeline config may still require implementing some types within Concourse (at least with how I'm envisioning it) - e.g. a slack webhook still needs special logic to respond to the webhook challenge.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, obviously there are many benefits to a prototype-driven model.

Honestly, my perspective right now is clouded by a current pipeline that I'm working on where I have a templated Terraform configuration that is currently instanced 4x with a current need to scale it to 100x and plans to scale it to 400-500x and beyond within a year, with the configuration per instance plus the template stored in the same repository. I want to have a job per instance, so that individual instances can be triggered when the configuration for an individual instance changes, without triggering the rest of the instance jobs unnecessarily. This would require different resources per instance, which doesn't scale with webhooks (as I will max out the webhook count on the repository when I hit about 20 instances), so currently I have a single resource for all of the instances which triggers every instance job. So it's rather inefficient.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like a cool use-case, and exactly the kind that this RFC is motivated by! Some clarifying questions:

  • Do all of these instances have configuration within the same repository?
  • Are you looking to use a single webhook on the repository for all of these instances?
  • Is each resource that's responding to these webhooks just tracking the config file for the single instance?

Assuming the answer to all of these questions is yes for the time being...

I was thinking it could look something like this:

resources:
- name: config
  type: git
  source:
    uri: [email protected]:my-org/my-repo.git
    paths: [config/abc.yml]
  check_every: 24h
  webhooks:
  - type: github
    filter:
      repository: {full_name: my-org/my-repo}
      commits:
      - modified: [config/abc.yml]

(I imagine a filter is important here, since otherwise every commit would trigger 400-500 checks)

There's some redundancy here that would be avoided with the prototypes approach, but this feels like a decent compromise. WDYT?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, yes, and no (but basically yes). Actually we were inspired by what https://github.com/concourse/governance did with keeping most of the data in YAML, so we have a layout like

data/
  templates/
    instance.yaml
    role.yaml
  roles/
    chef.yaml
    programmer.yaml
    teacher.yaml
  instance/
    foo.yaml
    bar.yaml
tf/
  locals.tf
  example.tf

Where, for instance, foo (as defined inside the YAML) might be a chef and a programmer and bar might be a programmer and a teacher. Terraform compiles what the different roles are supposed to mean in terms of specific Terraform resources, and creating a new instance is as simple as cp data/templates/instance.yaml data/instance/baz.yaml and then re-generating the pipeline (probably I will eventually move it to an instanced pipeline, especially if this PR succeeds), where a specific instance of Terraform is isolated by running e.g. terraform workspace select foo && terraform apply -var instance_id=foo, which allows Terraform to arbitrarily scale while staying performant by not keeping too many instances in the same state.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a crack at implementing the manual form of this RFC here: concourse/concourse#7445

Only supports team-scoped webhooks, but would be easy to extend to global. Curious to hear your thoughts, and whether you feel it would suit your needs.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aoldershaw Nice! I needed to go back and forth a little bit to see how it would filter the webhook events with commits.modified (it's not inherently obvious, and would much benefit from being documented as an example) but it looks like it fits the bill. Can't wait for a release!

url: https://ci.concourse-ci.org/api/v1/teams/my-team/webhooks/github?token=abc123def456
```

...and a `webhooks` field to be added to the [prototype info response] that tells Concourse what webhook payload(s) should trigger a check for the resource. For instance, the resource:
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@schmurfy thanks for your comment - going to move the conversation to a thread so it's hopefully easier to follow.

although I am not familiar enough with the way prototypes works it seems overly complicated. This may not be the type of feedback you wanted but since prototypes are in development since I started using concourse (~ 1.5 years ago) I imagine that following through with this proposal means that the current webhook system is going to be there for quite a while

We definitely appreciate that the work on prototypes has been slow moving. We're only just starting to prioritize this prototypes work, so hopefully we'll be able to make some headway soon.

One nice thing about prototypes is how much flexibility they enable. For instance, this proposal just builds upon the existing prototype concepts in a way that would be fully backward compatible, so they provide the ability to add new behaviour as new use cases come up - meaning prototypes don't need to be blocked on this webhooks proposal.

There's also the possibility of implementing some form of shared webhooks without needing to wait for prototypes to be ready - see #110 (comment).

I am currently using concourse-webhook-broadcaster because there is not real way to use webhooks otherwise with dynamically generated pipelines (for pull-request, release, deployment).

This is an interesting point, and something I looked at for inspiration when writing the proposal. Having external tools that sit between Concourse and external services is an interesting idea, but ultimately, I think it'd be much nicer to have built-in support for webhooks without needing to manage yet another deployment (or deployments (and without needing to keep these tools in sync with changes to Concourse - just noticed that you're the author of sapcc/webhook-broadcaster#10)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kinda missed your answer xD
We currently use concourse-webhook-broadcaster not because we really want to but because in its current state concourse can't provide the feature, as soon as concourse supports it natively we will probably make the switch.
in the mean time it allows us to setup one webhook per repository for a decently large number of pipelines running concurrently now.

aoldershaw added a commit to concourse/concourse that referenced this pull request Aug 20, 2021
the idea here is like concourse/rfcs#110 - operators can configure
webhooks at a level higher than resources (in this case, at the team
level). webhooks have an endpoint `/api/v1/teams/:team/webhooks/:webhook`
that processes the webhook payload and delegates to the appropriate
resources, spawning checks for each resource that "matches".

the proposal in the RFC assumes the existence of prototypes to allow
resources to automatically opt-in to webhooks based on their source
configuration. what is implemented in this commit is the manual form of
that - allowing resource configs to define a list of webhook types and
payload filters that the resource should trigger a new check on.

for instance, consider the following resource definition:

```
resources:
- name: repo
  type: git
  source:
    uri: https://github.com/aoldershaw/test-release.git
  webhooks:
  - type: github
    filter:
      repository: {full_name: aoldershaw/test-release}
```

if a webhook request comes in for a webhook configured to be of type
"github", and the payload "contains" the filter specified above, a check
will be queued up for the resource. note: containment is as defined in
https://www.postgresql.org/docs/9.6/datatype-json.html#JSON-CONTAINMENT

this commit does not deal with the operator interaction to manage
webhooks via fly - just the webhook handler side of things.

Signed-off-by: Aidan Oldershaw <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants