
[occm] Multi region support #1924

Open
sergelogvinov opened this issue Jun 16, 2022 · 21 comments · May be fixed by #2595
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@sergelogvinov
Contributor

/kind feature

I noticed that the OpenStack client supports a multi-region config: https://github.com/gophercloud/utils/blob/master/openstack/clientconfig/testing/clouds.yaml#L160-L171

What do you think about adding multi-region support to a single OCCM, based on the config file, after #1900?
If the config file has a regions tree, OCCM would check it at boot time and watch the nodes in those regions.
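
For reference, a minimal sketch of what such a multi-region clouds.yaml could look like, based on the linked gophercloud testing example; the cloud name, region names, and auth values below are placeholders:

    clouds:
      multi-region-cloud:
        auth:
          auth_url: https://identity.example.com:5000
          username: demo
          password: secret
          project_name: demo
        regions:
          # Plain entry: reuses the cloud-level auth values for RegionOne.
          - RegionOne
          # Entry with per-region overrides, e.g. a different Keystone endpoint.
          - name: RegionTwo
            values:
              auth:
                auth_url: https://identity.regiontwo.example.com:5000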

@k8s-ci-robot added the kind/feature label Jun 16, 2022
@jichenjc
Contributor

I think it's valid, though I'm not sure about one use case:

basically, OCCM is responsible for creating LBs.
In a multi-region environment, wouldn't we want the LB created in a desired region?

@sergelogvinov
Contributor Author

Yep, you are right, I forgot about the LB.
I was talking about the node/node-lifecycle controllers only.

For the LB, the region could be chosen through a service annotation...

So we can introduce this in steps.
Unfortunately, I do not have a regional OpenStack with an LB, so it will be hard to test.
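
For illustration, picking the LB region per Service could look something like this; the loadbalancer.openstack.org/region annotation here is hypothetical, not an existing OCCM annotation:

    apiVersion: v1
    kind: Service
    metadata:
      name: my-app
      annotations:
        # Hypothetical annotation: pin this Service's load balancer to one region.
        loadbalancer.openstack.org/region: RegionTwo
    spec:
      type: LoadBalancer
      selector:
        app: my-app
      ports:
        - port: 80
          targetPort: 8080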

@mikejoh
Contributor

mikejoh commented Jun 16, 2022

We've just implemented our own cloud provider with a multi-region-aware cloud controller manager on OpenStack. At the moment it only covers the node and node-lifecycle controllers. The multi-region configuration you can add to clouds.yaml is interesting; I wonder if it's possible to query for resources across multiple regions in one go, or if you still need to loop through the regions one by one.

@sergelogvinov
Contributor Author

If the node is uninitialized, we need to check all regions to find it.
If the node has a providerID, we can get the region name from it.

And if I'm right, we need a separate Go client connection for each region.

@mikejoh
Contributor

mikejoh commented Jun 16, 2022

If the node is uninitialized, we need to check all regions to find it. If the node has a providerID, we can get the region name from it.

And if I'm right, we need a separate Go client connection for each region.

Correct! Basically one compute ServiceClient per region. We're not completely done with our implementation, it's still in PoC mode, but so far everything looks good. We made sure to implement the k8s.io/cloud-provider InstancesV2 interface, which is slimmer and doesn't require as many queries against the underlying cloud APIs compared to the Instances interface.

As a side note: I know that there's a todo on implementing the V2 interface in the OCCM, which I think we could help out with!
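
To make the per-region lookup concrete, here is a rough Go sketch of the idea discussed above (not OCCM code, gophercloud v1-style API): keep one compute ServiceClient per region, take the region from the providerID when it is encoded there, and otherwise search every region. The providerID format with an embedded region and all type/function names are assumptions for illustration.

    // Package multiregion sketches the per-region node lookup discussed above.
    package multiregion

    import (
        "fmt"
        "strings"

        "github.com/gophercloud/gophercloud"
        "github.com/gophercloud/gophercloud/openstack/compute/v2/servers"
    )

    // Lookup keeps one Nova (compute) ServiceClient per region, created once
    // at startup from the per-region sections of the config file.
    type Lookup struct {
        computeClients map[string]*gophercloud.ServiceClient // region name -> client
    }

    // FindByProviderID assumes a providerID of the form
    // "openstack://<region>/<instance-id>" (an assumed format for this sketch;
    // the classic "openstack:///<instance-id>" form carries no region).
    func (l *Lookup) FindByProviderID(providerID string) (*servers.Server, string, error) {
        rest := strings.TrimPrefix(providerID, "openstack://")
        parts := strings.SplitN(rest, "/", 2)
        if len(parts) == 2 && parts[0] != "" {
            region, id := parts[0], parts[1]
            client, ok := l.computeClients[region]
            if !ok {
                return nil, "", fmt.Errorf("no compute client configured for region %q", region)
            }
            srv, err := servers.Get(client, id).Extract()
            return srv, region, err
        }
        // No region encoded in the providerID: fall back to trying every region.
        id := parts[len(parts)-1]
        return l.findInAllRegions(func(c *gophercloud.ServiceClient) (*servers.Server, error) {
            return servers.Get(c, id).Extract()
        })
    }

    // FindByName searches every region for a server matching the node name;
    // this is what an uninitialized node (no providerID yet) requires.
    func (l *Lookup) FindByName(name string) (*servers.Server, string, error) {
        return l.findInAllRegions(func(c *gophercloud.ServiceClient) (*servers.Server, error) {
            pages, err := servers.List(c, servers.ListOpts{Name: name}).AllPages()
            if err != nil {
                return nil, err
            }
            all, err := servers.ExtractServers(pages)
            if err != nil {
                return nil, err
            }
            if len(all) == 0 {
                return nil, fmt.Errorf("no server named %q in this region", name)
            }
            return &all[0], nil
        })
    }

    // findInAllRegions tries each region's client in turn and returns the first match.
    func (l *Lookup) findInAllRegions(get func(*gophercloud.ServiceClient) (*servers.Server, error)) (*servers.Server, string, error) {
        for region, client := range l.computeClients {
            if srv, err := get(client); err == nil {
                return srv, region, nil
            }
        }
        return nil, "", fmt.Errorf("server not found in any configured region")
    }

The loop over regions is exactly why one client per region is needed; nothing here can query more than one region in a single API call.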

@jichenjc
Contributor

As a side note: I know that there's a todo on implementing the V2 interface in the OCCM, which I think we could help out with!

Right, I did some work but got distracted by too many other things, so if you can help, that would be terrific:

commit 5a5030e83fd72838cddb075bef19e46ba999676e
Author: ji chen <[email protected]>
Date:   Fri Apr 15 19:15:10 2022 +0800

    refactory code to prepare instance V2 implementation (#1823)

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Sep 14, 2022
@sergelogvinov
Contributor Author

/remove-lifecycle stale

@k8s-ci-robot removed the lifecycle/stale label Sep 14, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Dec 13, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Jan 12, 2023
@sergelogvinov
Contributor Author

/remove-lifecycle rotten

@k8s-ci-robot removed the lifecycle/rotten label Jan 12, 2023
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Apr 12, 2023
@sergelogvinov
Contributor Author

/remove-lifecycle stale

@k8s-ci-robot removed the lifecycle/stale label Apr 12, 2023
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Jul 11, 2023
@sergelogvinov
Contributor Author

/remove-lifecycle stale

@k8s-ci-robot removed the lifecycle/stale label Jul 14, 2023
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Jan 24, 2024
@Hybrid512

What about this feature?
It would clearly be needed in our use case, where we have 3 standalone OpenStack clusters and want Kubernetes clusters stretched on top of them.
This can be done, but things get complicated since the CCM can't handle more than one OpenStack API endpoint, which makes everything related to storage or LBs painful.

Is this somewhere in the pipeline, or are there alternatives?
Any help would be gladly appreciated.

@dulek
Contributor

dulek commented Feb 8, 2024

What about this feature? It would clearly be needed in our use case, where we have 3 standalone OpenStack clusters and want Kubernetes clusters stretched on top of them. This can be done, but things get complicated since the CCM can't handle more than one OpenStack API endpoint, which makes everything related to storage or LBs painful.

Is this somewhere in the pipeline, or are there alternatives? Any help would be gladly appreciated.

I don't think CPO was designed with that use case in mind, and it might need a lot of work to get right. Happy to help with advice and reviews if you have some development resources to throw at this.

Out of interest, what's the use case for stretching K8s like that?

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Mar 9, 2024
@Hybrid512

What about this feature? It would clearly be needed in our use case, where we have 3 standalone OpenStack clusters and want Kubernetes clusters stretched on top of them. This can be done, but things get complicated since the CCM can't handle more than one OpenStack API endpoint, which makes everything related to storage or LBs painful.
Is this somewhere in the pipeline, or are there alternatives? Any help would be gladly appreciated.

I don't think CPO was designed with that use case in mind, and it might need a lot of work to get right. Happy to help with advice and reviews if you have some development resources to throw at this.

Out of interest, what's the use case for stretching K8s like that?

In our case, the use case is to have multi-AZ Kubernetes clusters on top of multiple standalone OpenStack clusters: basically, one k8s cluster stretched across 3 different OpenStack tenants, each in a standalone OpenStack cluster.
Our k8s strategy is to run multiple k8s clusters for better resource and security segmentation, with a high level of automation.
Our OpenStack clusters have routed networks, but they are not stretched clusters themselves, and each has its own dedicated storage that is not replicated to the others.
The idea is that, since OpenStack is the base supporting every k8s cluster we have, we would rather secure OpenStack operations by not stretching OpenStack itself (and instead run multiple standalone clusters), so that a failure (whether hardware, a security issue, or a simple human error) impacts only one AZ at a time. Since our k8s clusters are stretched on top of these OpenStack clusters, losing an AZ would have nearly zero impact on the application layer.

However, I see there are PRs for this, and we already try to help the best we can, so fingers crossed!

/remove-lifecycle rotten

@k8s-ci-robot removed the lifecycle/rotten label Mar 13, 2024
@sergelogvinov
Contributor Author

Thank you for your feedback.

I did some research and built a lab based on Proxmox VMs. CCM/CSI work fine, and a single PVC template resource works well across different AZs too: https://github.com/sergelogvinov/proxmox-csi-plugin

I still believe I will repeat the same idea in this project, but only for CCM/CSI.
I do not have an LB in my OpenStack setup, and many load balancers/ingress controllers are not aware of geo (region/zonal) balancing (they serve traffic only in one zone/region).

Please vote 👍, it helps contributors understand how important this feature is.
