You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Please vote on this issue by adding a 馃憤 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment
Tell us about your request
What do you want us to build?
(Out-of-Box) NativeAWS Experience for Nvidia EC2 Instance Failure Handling on EKS include:
NIX etc. failure metrics monitoring (currently latest AWS CloudWatch Observability Add-on does not have that)
Unhealth instance detection, termination and replacement managed by EKS ManagedNodeGroup
Which service(s) is this request for?
This could be Fargate, ECS, EKS, ECR
EKS Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
What outcome are you trying to achieve, ultimately, and why is it hard/impossible to do right now? What is the impact of not having this problem solved? The more details you can provide, the better we'll be able to understand and solve the problem.
Are you currently working around this issue?
How are you currently solving this problem?
Additional context
Anything else we should know?
Attachments
If you think you might have additional information that you'd like to include via an attachment, please do - we'll take a look. (Remember to remove any personally-identifiable information.)
The text was updated successfully, but these errors were encountered:
Community Note
Tell us about your request
What do you want us to build?
(Out-of-Box) NativeAWS Experience for Nvidia EC2 Instance Failure Handling on EKS include:
Which service(s) is this request for?
This could be Fargate, ECS, EKS, ECR
EKS
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
What outcome are you trying to achieve, ultimately, and why is it hard/impossible to do right now? What is the impact of not having this problem solved? The more details you can provide, the better we'll be able to understand and solve the problem.
Are you currently working around this issue?
How are you currently solving this problem?
Additional context
Anything else we should know?
Attachments
If you think you might have additional information that you'd like to include via an attachment, please do - we'll take a look. (Remember to remove any personally-identifiable information.)
The text was updated successfully, but these errors were encountered: