CDK post-deployment experience #437

Open
1 of 11 tasks
rix0rrr opened this issue May 26, 2022 · 6 comments · May be fixed by NOUIY/aws-cdk-rfcs#83 or NOUIY/aws-cdk-rfcs#87

Comments

@rix0rrr
Contributor

rix0rrr commented May 26, 2022

Description

We want to support operators of CDK applications during their common tasks. What are the biggest problems/frustrations you would like us to address?

  • Getting an overview of the application?
  • Creating dashboards and alarms?
  • Ticketing?
  • Operational tasks?
  • Log inspection?

Let us know in the discussion below.

Roles

Role            | User
Proposed by     | @rix0rrr
Author(s)       |
API Bar Raiser  |
Stakeholders    |

See RFC Process for details

Workflow

  • Tracking issue created (label: status/proposed)
  • API bar raiser assigned (ping us at #aws-cdk-rfcs if needed)
  • Kick off meeting
  • RFC pull request submitted (label: status/review)
  • Community reach out (via Slack and/or Twitter)
  • API signed-off (label api-approved applied to pull request)
  • Final comments period (label: status/final-comments-period)
  • Approved and merged (label: status/approved)
  • Execution plan submitted (label: status/planning)
  • Plan approved and merged (label: status/implementing)
  • Implementation complete (label: status/done)

The author is responsible for progressing the RFC according to this checklist
and for applying the relevant labels to this issue so that the RFC table in the
README gets updated.

@sholtomaud

Yes! This would be fantastic.

I have questions about the boundary between the DevOps and SysOps post-deployment experience.

  1. Is this focused on a "per-application single-pane of glass" post-deployment experience for DevOps?
  2. What about the "multi-account, multi-app single-pane of glass" post-deployment experience for SysOps?
  • There may not be clear Team/Enterprise boundaries between DevOps and SysOps, but regardless, it would be nice to have patterns/workflows that give a clear pathway to the multi-account, multi-app post-deployment experience on the SysOps side.

  • Perhaps EventBridge as a "cross-account event backbone" could be used as an enabler for SysOps monitoring dashboards?
    https://dev.to/eoinsha/how-to-use-eventbridge-as-a-cross-account-event-backbone-5fik

  • Integration with OpenSearch/Grafana/Prometheus/aws-discovery-agent going back to prerequisites for custom managed infra.
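
To make the cross-account event backbone idea above a little more concrete, here is a minimal sketch of the resource policy a central event bus would need so that other accounts can send events to it. The function name, bus name, and account IDs are hypothetical placeholders, not part of any CDK API; only the `events:PutEvents` action and the general policy shape come from EventBridge itself.

```python
from __future__ import annotations

import json


def central_bus_policy(bus_arn: str, source_account_ids: list[str]) -> dict:
    """Build a resource policy that lets each listed account put events on one bus.

    Hypothetical helper for illustration; the ARN and account IDs passed in
    are assumptions made by the caller.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": f"AllowPutEventsFrom{account_id}",
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "events:PutEvents",
                "Resource": bus_arn,
            }
            for account_id in source_account_ids
        ],
    }


if __name__ == "__main__":
    # Placeholder ARN and account IDs, purely for demonstration.
    policy = central_bus_policy(
        "arn:aws:events:eu-west-1:111111111111:event-bus/ops-backbone",
        ["222222222222", "333333333333"],
    )
    print(json.dumps(policy, indent=2))
```

Each workload account would still need its own rule that forwards the relevant events to the central bus; the sketch only covers the receiving side.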

@kadrach
Member

kadrach commented May 30, 2022

> We want to support operators of CDK applications during their common tasks. What are the biggest problems/frustrations you would like us to address?

I have come across two personas of operators in this sense, one being the more "DevOps" aligned build and run team, the other being the traditional "ops" team with often little insight into the application.

A frustration I've seen repeatedly with customer teams is the complexity involved in correlating logs (both "system logs" like CloudFormation deployments, custom resources, and flow logs, and "application logs"). This issue does not stem from CDK itself and can, for example, be addressed with appropriately crafted CloudWatch Logs Insights queries. Still, the CDK should be able to improve the user experience here.

An intuitive "here are all the logs" view would benefit both build and run, as well as run-only teams. Personally I'd love to see an extensible cdk logs --follow capability (see #277).
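
As an illustration of the "appropriately crafted Insights queries" mentioned above, the following is a minimal sketch that builds a CloudWatch Logs Insights query for tracing one identifier across several log groups. The function name and the idea of a shared correlation ID in every log line are assumptions for the example; `@timestamp`, `@log`, `@logStream`, and `@message` are standard Insights fields.

```python
def correlation_query(correlation_id: str, limit: int = 200) -> str:
    """Return an Insights query that finds one ID in every selected log group.

    Hypothetical helper: it assumes system and application logs both contain
    the same correlation ID somewhere in the message text.
    """
    if '"' in correlation_id:
        # Keep the sketch simple: refuse input that would break the string literal.
        raise ValueError("correlation_id must not contain double quotes")
    return "\n".join(
        [
            # @log identifies the originating log group when querying several at once.
            "fields @timestamp, @log, @logStream, @message",
            f'| filter @message like "{correlation_id}"',
            "| sort @timestamp asc",
            f"| limit {limit}",
        ]
    )


if __name__ == "__main__":
    print(correlation_query("req-1234", limit=50))
```

The resulting string could be pasted into the Logs Insights console with several log groups selected, which is where the `@log` column becomes useful.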

@sholtomaud

@kadrach Yeah, that's what I'm getting at with a SysOps workbench "single pane of glass" post-deployment experience.
AWS have OpenSearch, Grafana, Prometheus, etc., for which they could develop a CDK "Secure Org" bootstrap pattern that bundles all account logs into a searchable OpenSearch interface, with Grafana etc. on top.

@rix0rrr
Contributor Author

rix0rrr commented Jun 15, 2022

> What about the "multi-account, multi-app single-pane of glass" post-deployment experience for SysOps?

I think by default it would be "whatever you say belongs together will go together"... if there are multiple levels of hierarchy would that be sufficient? What is the use case you are thinking of? At the very least we do want to support multiple accounts.

> Integration with OpenSearch/Grafana/Prometheus/aws-discovery-agent going back to prerequisites for custom managed infra.

Interesting. I'm not sure what that would look like. The simplest thing to achieve would be to embed arbitrary pages, I suppose. But that may not be enough integration, in which case the alternative would be having to write adapters that can query various metrics backends.

> An intuitive "here are all the logs" view would benefit both build and run, as well as run-only teams

I agree that there are many logs that are widely spread out and they're not always easy to search. I definitely see the benefit here.

(Honestly -- a unified logs viewer that pulls from a bunch of different log groups and other sources is totally something someone could build today)
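
A minimal sketch of the core of such a unified viewer, assuming each source (a log group, a deployment event feed, and so on) already yields its events in timestamp order: the streams can be interleaved lazily with a k-way merge. The `LogEvent` type and the source names are hypothetical; fetching the events from AWS is deliberately left out.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class LogEvent(NamedTuple):
    timestamp_ms: int  # epoch milliseconds
    source: str        # hypothetical label, e.g. a log group name
    message: str


def merge_log_streams(*streams: Iterable[LogEvent]) -> Iterator[LogEvent]:
    """Interleave several timestamp-sorted streams into one sorted stream.

    Each input must already be sorted by timestamp_ms; heapq.merge then only
    keeps one pending event per stream in memory.
    """
    return heapq.merge(*streams, key=lambda event: event.timestamp_ms)


if __name__ == "__main__":
    deployment = [
        LogEvent(1000, "cloudformation", "CREATE_IN_PROGRESS"),
        LogEvent(3000, "cloudformation", "CREATE_COMPLETE"),
    ]
    application = [LogEvent(2000, "app", "handler started")]
    for event in merge_log_streams(deployment, application):
        print(event.timestamp_ms, event.source, event.message)
```

Because the merge is lazy, the same function would also work for a follow-style tail where the inputs are generators that keep producing events.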

@sholtomaud

> What is the use case you are thinking of?

Business-as-usual (BAU) SysOps: monitoring and alerting on all the things, aimed at site reliability engineering (SRE) metrics.

> if there are multiple levels of hierarchy would that be sufficient?

If SysOps constantly have to log into different accounts (maybe 100) to gain any insight or do any monitoring, then no, we don't want an account-based hierarchy.

> the alternative would be having to write adapters that can query various metrics backends.

Surely AWS already have some of this implemented for SRE monitoring of the AWS Console suite? We don't want to reinvent the wheel, but yes, adapters would be fine.

> unified logs viewer

Yes, a unified logs viewer and searcher, plus integration with a unified CMDB (which owner, cost code, system, or service created the logs).
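
One way to picture the CMDB integration, as a rough sketch only: each log event is annotated with ownership metadata looked up by its log group. The `CMDB` dictionary, its keys, and the field names are all hypothetical stand-ins; a real setup would query whatever CMDB or tagging source the organisation actually uses.

```python
from __future__ import annotations

# Hypothetical in-memory stand-in for a CMDB, keyed by log group name.
CMDB = {
    "/aws/lambda/orders-api": {
        "owner": "team-orders",
        "cost_code": "CC-1001",
        "service": "orders",
    },
}

UNKNOWN = {"owner": "unknown", "cost_code": "unknown", "service": "unknown"}


def enrich(event: dict, cmdb: dict[str, dict]) -> dict:
    """Return a copy of the event with owner, cost code and service attached.

    Events from log groups that are missing from the CMDB are marked
    "unknown" instead of being dropped, so gaps in ownership stay visible.
    """
    metadata = cmdb.get(event["log_group"], UNKNOWN)
    return {**event, **metadata}


if __name__ == "__main__":
    print(enrich({"log_group": "/aws/lambda/orders-api", "message": "timeout"}, CMDB))
    print(enrich({"log_group": "/aws/lambda/legacy", "message": "boom"}, CMDB))
```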

@mrpackethead
Contributor

> We want to support operators of CDK applications during their common tasks. What are the biggest problems/frustrations you would like us to address?

Not having supported L2 constructs for AWS services.

5 participants