Custom drain command? #737

harrytang · 2023-02-28T08:44:23Z

Hi there,

I have been using Kured and I find it to be a great tool for automating node reboots in a Kubernetes cluster.

I was wondering if there are any plans to add support for custom drain commands in Kured? It would be really helpful if we could specify our own custom drain command that Kured would execute before rebooting a node.

If this is not currently on the roadmap, I would love to know if it's something that the Kured development team would consider adding in the future.

Thank you for your time and for your work on Kured. I look forward to hearing back from you.

Best regards,
Harry

jackfrancis · 2023-02-28T21:25:03Z

Hi Harry, we could be open to that. Could you provide an example of what you'd like to do in addition to (or instead of) the normal k8s "drain node" behavior?

harrytang · 2023-02-28T22:11:58Z

Hi,

We are currently using Longhorn Storage in our cluster, and while a node is being drained, we still need some components functioning so that the volumes can be properly detached. (see https://longhorn.io/docs/1.4.0/volumes-and-nodes/maintenance/#updating-the-node-os-or-container-runtime)

We normally use this drain command:

kubectl drain NODEX --delete-emptydir-data --ignore-daemonsets --pod-selector='app!=csi-attacher,app!=csi-provisioner,longhorn.io/component!=instance-manager'
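
For readers who want to script this, here is a minimal sketch of how the command above could be assembled from a list of labels to exclude. The helper name `build_drain_command` and the constant `LONGHORN_EXCLUDES` are hypothetical and are not part of Kured or Longhorn; only the `kubectl drain` flags are taken from the command above.

```python
import shlex


def build_drain_command(node, excluded_labels):
    """Return the kubectl drain argv that skips pods carrying the given labels."""
    # Each (key, value) pair becomes a "key!=value" requirement;
    # joining with commas means a pod must satisfy all of them to be evicted.
    selector = ",".join(f"{key}!={value}" for key, value in excluded_labels)
    return [
        "kubectl", "drain", node,
        "--delete-emptydir-data",
        "--ignore-daemonsets",
        f"--pod-selector={selector}",
    ]


# Labels of the Longhorn components that should keep running during the drain
# (copied from the command in this comment).
LONGHORN_EXCLUDES = [
    ("app", "csi-attacher"),
    ("app", "csi-provisioner"),
    ("longhorn.io/component", "instance-manager"),
]

if __name__ == "__main__":
    # Print a shell-quoted command line instead of running it.
    print(shlex.join(build_drain_command("NODEX", LONGHORN_EXCLUDES)))
```

Returning an argv list rather than a single string avoids shell-quoting problems if the result is later passed to `subprocess.run`.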

Hope you find this helpful.

Thanks!

jackfrancis · 2023-02-28T23:56:11Z

That helps, thanks @harrytang!

I'll think about how we might put something like this together, stay tuned!

github-actions · 2023-04-30T01:54:08Z

This issue was automatically considered stale due to lack of activity. Please update it and/or join our slack channels to promote it, before it automatically closes (in 7 days).

harrytang · 2023-04-30T04:44:48Z

Github keep

ddsmith2-eprod · 2023-05-16T13:42:55Z

Are there any additional suggestions on how to deal with this issue? If you have a volume that is not replicated, the node fails to drain because of the pod disruption budget. This happens over time when a volume is not used much and you only have one replica. I see that my predecessors used Ansible to stop dockerd and iscsid in order to patch and reboot nodes.

I tried setting forceReboot=true, but it does not seem to help.

EDIT: I did find a setting in Longhorn for Allow Node Drain with the Last Healthy Replica. I'll test this and try to remember to report back here.

docbobo · 2023-06-27T06:44:38Z

I am in the same boat as everyone else regarding Longhorn draining. "Allow Node Drain with the Last Healthy Replica" is not solving this issue for me though.

tylerauerbeck · 2023-08-11T16:22:30Z

Is anybody taking a look at this? I'm currently trying to find a way to create Alertmanager silences and then remove them when the node comes back up, so I'd want a pre-reboot command and a post-reboot command, just like there are for labels. I'd imagine it could tie into the same hooks that the labels use and take an approach similar to how users can specify their own reboot command.

I'd be happy to pull something together for this; I just want to see if this approach is acceptable to folks.
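
To make the proposed ordering concrete, here is a minimal sketch of the gating logic: pre-reboot commands run in order, and the reboot is only triggered if all of them succeed. The function names `run_hooks` and `reboot_if_hooks_pass` are hypothetical and do not reflect Kured's actual code; post-reboot commands could go through the same `run_hooks` helper once the node is back.

```python
import subprocess
import sys


def run_hooks(hooks):
    """Run each hook (an argv list) in order; return False at the first failure."""
    for argv in hooks:
        result = subprocess.run(argv)
        if result.returncode != 0:
            return False
    return True


def reboot_if_hooks_pass(pre_hooks, reboot):
    """Call reboot() only when every pre-reboot hook exits with status 0."""
    if not run_hooks(pre_hooks):
        return False
    reboot()
    return True


if __name__ == "__main__":
    # Stand-in hooks: small Python one-liners that exit with a chosen status.
    ok = [sys.executable, "-c", "raise SystemExit(0)"]
    failing = [sys.executable, "-c", "raise SystemExit(1)"]

    reboot_if_hooks_pass([ok], lambda: print("would reboot now"))
    reboot_if_hooks_pass([ok, failing], lambda: print("never reached"))
```

Stopping at the first failing hook means a silence that could not be created, for example, blocks the reboot instead of being silently skipped.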

ckotzbauer · 2023-08-11T17:00:11Z

Hm, it depends a bit on how you want to implement/use pre- and post-reboot commands. Do you want to call a command on the host (with nsenter as for the sentinel- and reboot-commands) or should the command work inside the container?
We're currently working on restricting privileges of kured and finding a way to avoid commands on the host with nsenter. Otherwise, there are no plans to add commands/binaries to our own docker-image which can be used within the container.

ant31 · 2023-08-15T11:41:50Z

We are looking for the same kind of feature (pre-reboot).

The use case is that we sometimes must switch over the leader database before rebooting a node. Kured could trigger this action automatically before rebooting.

@ckotzbauer There are various ways to do that without changing the kured container image.
For example, it could be a pod template configuration: the pod/job would then be executed (without privileges), and the controller would wait for it to terminate successfully.

--pre-boot='{containers: [{image: switchover-pg,
                           command: ["switch-db", "--node-name=$(NODE_ID)"]}]}'
--pre-boot='{containers: [{image: silence-alerts,
                           command: ["turn-off-alerts", "--node-name=$(NODE_ID)"]}]}'
--post-boot='{containers: [{image: silence-alerts,
                            command: ["turn-on-alerts", "--node-name=$(NODE_ID)"]}]}'

I'm sure there are other ways to define/execute those kinds of commands, it's just a quick example.
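
As one possible reading of the pod-template idea above, the sketch below renders a hook into a plain `batch/v1` Job manifest. The function `render_hook_job` and the job name are hypothetical, and the image and command names are the placeholder ones from the example; nothing here is an existing Kured option. It relies on the standard Kubernetes behavior that `$(NODE_ID)` in `command` is expanded from the container's environment variables.

```python
import json


def render_hook_job(name, image, command, node_name):
    """Build a batch/v1 Job manifest (as a dict) for one pre- or post-reboot hook."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Do not retry: a failed hook should surface as a failed Job.
            "backoffLimit": 0,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": "hook",
                        "image": image,
                        "command": list(command),
                        # Kubernetes substitutes $(NODE_ID) in the command
                        # with the value of this environment variable.
                        "env": [{"name": "NODE_ID", "value": node_name}],
                    }],
                },
            },
        },
    }


if __name__ == "__main__":
    job = render_hook_job(
        "pre-reboot-switchover",
        "switchover-pg",
        ["switch-db", "--node-name=$(NODE_ID)"],
        "node-1",
    )
    print(json.dumps(job, indent=2))
```

A controller following this approach would create the Job, wait for it to complete successfully, and only then proceed with the drain and reboot.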

IMO, the feature would be useful.
It could also somewhat reduce the need for you to implement too many integrations upstream.

kingnarmer · 2024-04-11T14:15:01Z

Are there any plans to add this feature to the roadmap?

ckotzbauer · 2024-04-26T13:36:42Z

@kingnarmer
Once there's a good concept and someone who needs this is able to contribute a PR, it can be implemented anytime.

jackfrancis self-assigned this Feb 28, 2023

jackfrancis mentioned this issue Feb 28, 2023

feat: Add reboot-required annotation #715

Closed

github-actions bot added the no-issue-activity label Apr 30, 2023

dholbach added the keep label (this won't be closed by the stale bot) and removed the no-issue-activity label Apr 30, 2023

docbobo mentioned this issue Jun 27, 2023

Support pod-selector for drain command #788

Merged
