
Move the registration work into k8ssandra-client #39

Merged

Conversation


@Miles-Garnsey Miles-Garnsey commented May 2, 2024

Moves the work so far on data plane registration over to k8ssandra-client.

Fixes: #38

@Miles-Garnsey Miles-Garnsey force-pushed the feature/dataplane-registration branch from 6076920 to eea7f97 on May 2, 2024 02:36
@Miles-Garnsey Miles-Garnsey changed the title from "Move the registration work into k8ssandra-client (WIP)" to "Move the registration work into k8ssandra-client" on May 15, 2024

"github.com/k8ssandra/k8ssandra-client/pkg/registration"
configapi "github.com/k8ssandra/k8ssandra-operator/apis/config/v1beta1"
"github.com/k8ssandra/k8ssandra-operator/pkg/result"
issue: Do not include pkg from k8ssandra-operator, as we want to avoid circular dependencies. This project is intended to be consumed by cass-operator, k8ssandra-operator, and other operators/software.

@burmanm burmanm left a comment

See the comments for the first round of review.

@Miles-Garnsey

@burmanm I've implemented most of your requested changes. The main place I'm pushing back is on your desire to remove all panics. I don't think that makes sense for a CLI application, as a panic is the most straightforward way to shut things down when an error is non-recoverable. Most errors in this logic are non-recoverable; we don't really want to handle anything beyond ensuring the user has good feedback (and a stack trace is ideally part of that).

For consistency, I've done as you've asked in the test code and used require.NoError etc., but even there I'd probably rather just panic unless we can get something more out of the test by allowing it to continue.
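For what it's worth, the testify pattern under discussion looks roughly like this; the helper and test names below are illustrative stand-ins, not code from this PR:

package registration_test

import (
	"errors"
	"testing"

	"github.com/stretchr/testify/require"
)

// loadSourceContext stands in for whatever setup the real tests perform;
// its name and behaviour are assumptions made for illustration only.
func loadSourceContext(name string) (string, error) {
	if name == "" {
		return "", errors.New("source context must not be empty")
	}
	return name, nil
}

func TestLoadSourceContext(t *testing.T) {
	// require.NoError fails and stops the test immediately, which is the
	// behaviour being weighed against a bare panic in the comment above.
	ctx, err := loadSourceContext("gke_k8ssandra_us-central1-c_registration-2")
	require.NoError(t, err)
	require.Equal(t, "gke_k8ssandra_us-central1-c_registration-2", ctx)
}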

Let me know if I've missed anything else.

@Miles-Garnsey

Meeting notes:

  1. Miles to remove calls to reconcile.Result and replace them with fatal vs non-fatal errors (see the sketch after this list).
  2. Remaining panics can stay.
  3. One panic should be converted to an error: the user-input validation check for when the source and destination contexts are the same.
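
A minimal sketch of one way the fatal vs non-fatal split mentioned in item 1 could be modelled, assuming a hypothetical NonRecoverableError type (this is not the actual implementation):

package registration

import (
	"errors"
	"fmt"
)

// NonRecoverableError marks errors that should abort the command outright
// rather than being retried; the type is a hypothetical example.
type NonRecoverableError struct {
	Err error
}

func (e NonRecoverableError) Error() string {
	return fmt.Sprintf("non-recoverable: %v", e.Err)
}

// classify shows how a caller could decide between exiting with a message
// and continuing, instead of returning a reconcile.Result.
func classify(err error) (fatal bool) {
	var nre NonRecoverableError
	return errors.As(err, &nre)
}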

@adejanovski adejanovski left a comment

Issue: it seems like we're losing the context names in the generated kubeconfigs:

  name: cluster
contexts:
- context:
    cluster: cluster
    user: cluster
  name: cluster

Everything gets turned into cluster, and I think that's going to create conflicts with multiple dataplanes being registered. Internally our APIs will list the contexts by their names, not by the clientconfig name.
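
For illustration, deriving every map key (and the current-context) from the source context name with client-go's clientcmd API would look roughly like this; the function and parameter names are assumptions, not the PR's actual code:

package registration

import (
	"k8s.io/client-go/tools/clientcmd"
	clientcmdapi "k8s.io/client-go/tools/clientcmd/api"
)

// buildKubeConfig keys the cluster, user and context entries by the source
// context name, so multiple dataplanes get distinct entries instead of all
// colliding on the literal "cluster".
func buildKubeConfig(contextName, server string, caData []byte, token string) ([]byte, error) {
	cfg := clientcmdapi.NewConfig()
	cfg.Clusters[contextName] = &clientcmdapi.Cluster{
		Server:                   server,
		CertificateAuthorityData: caData,
	}
	cfg.AuthInfos[contextName] = &clientcmdapi.AuthInfo{
		Token: token,
	}
	cfg.Contexts[contextName] = &clientcmdapi.Context{
		Cluster:  contextName,
		AuthInfo: contextName,
	}
	cfg.CurrentContext = contextName
	return clientcmd.Write(*cfg)
}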

@Miles-Garnsey Miles-Garnsey commented May 27, 2024

Alex is seeing an error when trying to create a K8ssandraCluster using clientConfigs generated by this process:

2024-05-24T07:57:35.298Z	ERROR	setup	unable to create manager cluster connections	{"error": "invalid configuration: [context was not found for specified context: remote-k8ssandra-operator, cluster has no server defined]", "errorCauses": [{"error": "context was not found for specified context: remote-k8ssandra-operator"}, {"error": "cluster has no server defined"}]}

I've reproduced it using this registration command:

kubectl k8ssandra register --dest-context gke_k8ssandra_us-central1-c_registration-1 --source-context gke_k8ssandra_us-central1-c_registration-2 --dest-namespace mission-control --source-namespace mission-control

What I can see is as follows:

There is a clientConfig as below:

kubectl get --all-namespaces clientconfigs.config.k8ssandra.io -o yaml
apiVersion: v1
items:
- apiVersion: config.k8ssandra.io/v1beta1
  kind: ClientConfig
  metadata:
    annotations:
      k8ssandra.io/resource-hash: x2LmMyQy3pM9qGIWcftAGY6eaAXw2iNoz4u4xPPnWF0=
      k8ssandra.io/secret-hash: w0hI2mRK3/SvmBm6bTVSI9l0wH1Gg6KxgbG/zoJ22oI=
    creationTimestamp: "2024-05-24T06:29:37Z"
    generation: 1
    name: remote-k8ssandra-operator
    namespace: mission-control
    resourceVersion: "18937"
    uid: a34f1ebe-16fa-4aec-9708-9bea6b66fba4
  spec:
    kubeConfigSecret:
      name: remote-k8ssandra-operator
kind: List
metadata:
  resourceVersion: ""

It references a valid secret, remote-k8ssandra-operator, in the same namespace, which decodes to the following kubeconfig:

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: ...
    server: https://34.69.184.36
  name: cluster
contexts:
- context:
    cluster: cluster
    user: cluster
  name: cluster
current-context: ""
kind: Config
preferences: {}
users:
- name: cluster
  user:
    token: <REDACTED>

Given the nature of the error, it seems likely that we're hitting this. However, when I set the current context field, it doesn't appear to resolve the issue and I get the same error.

Going back to Alex's idea: he believes this is caused by using the static name "cluster" for the user, context, and context name fields. I don't think this is the problem, since (AFAIK) the server name, user name, and context name are simply arbitrary values whose job is to bind users to particular servers within a context. Despite that, I've tried to emulate what the original script does here by including the original context name back in the new kubeconfig.

We end up with something like the below:

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: ...
    server: https://34.69.184.36
  name: gke_k8ssandra_us-central1-c_registration-2
contexts:
- context:
    cluster: gke_k8ssandra_us-central1-c_registration-2
    user: gke_k8ssandra_us-central1-c_registration-2
  name: gke_k8ssandra_us-central1-c_registration-2
current-context: gke_k8ssandra_us-central1-c_registration-2
kind: Config
preferences: {}
users:
- name: gke_k8ssandra_us-central1-c_registration-2
  user:
    token: ...

This has both the default context set and uses the original source context name to refer to the server and user; however, it still generates the same error in k8ssandra-operator. As near as I can tell, this is exactly the same as what the script would produce.

For completeness, I then tried setting the secret and clientConfig names equal to the source context name, and modified the MissionControlCluster so that the k8sContext field pointed to the new names. That also produced the same result using this command (after modifying the code so that the resource names default to the sanitized context name):

./kubectl-k8ssandra register --source-context gke_k8ssandra_us-central1-c_registration-2 --dest-context gke_k8ssandra_us-central1-c_registration-1  --dest-namespace test-p1w972i1  --source-namespace mission-control

Finally, I tried creating the clientConfig and secret in the cluster's namespace, instead of the operator's namespace. The operator emitted the same error.

So I went back to the original script and ran:

./scripts/create-clientconfig.sh --src-context gke_k8ssandra_us-central1-c_registration-2 --dest-context gke_k8ssandra_us-central1-c_registration-1 --namespace mission-control --serviceaccount mission-control

Obtaining:

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: ...
    server: https://34.69.184.36
  name: gke_k8ssandra_us-central1-c_registration-2
contexts:
- context:
    cluster: gke_k8ssandra_us-central1-c_registration-2
    user: gke_k8ssandra_us-central1-c_registration-2-mission-control
  name: gke_k8ssandra_us-central1-c_registration-2
current-context: gke_k8ssandra_us-central1-c_registration-2
kind: Config
preferences: {}
users:
- name: gke_k8ssandra_us-central1-c_registration-2-mission-control
  user:
    token: ...

This looks almost identical to what we had above, and has a name that matches, so it should be picked up by the MissionControlCluster spec's k8sContext field.

However, after restarting the k8ssandra-operator pod, we still see the same error:

kubectl logs --follow -n mission-control mission-control-k8ssandra-operator-7d8cf6b6d9-qqhjk
2024-05-27T04:42:07.906Z	INFO	setup	watch namespace configured	{"namespace": ""}
2024-05-27T04:42:07.906Z	INFO	setup	########################### version  commit  date  ###########################
2024-05-27T04:42:07.906Z	INFO	setup	watch namespace configured	{"namespace": ""}
2024-05-27T04:42:07.938Z	INFO	controller-runtime.metrics	Metrics server is starting to listen	{"addr": ":8080"}
W0527 04:42:08.016325       1 warnings.go:70] Use tokens from the TokenRequest API or manually created secret-based tokens instead of auto-generated secret-based tokens.
W0527 04:42:08.023363       1 warnings.go:70] Use tokens from the TokenRequest API or manually created secret-based tokens instead of auto-generated secret-based tokens.
2024-05-27T04:42:08.054Z	ERROR	setup	unable to create manager cluster connections	{"error": "invalid configuration: [context was not found for specified context: gke-k8ssandra-us-central1-c-registration-2, cluster has no server defined]", "errorCauses": [{"error": "context was not found for specified context: gke-k8ssandra-us-central1-c-registration-2"}, {"error": "cluster has no server defined"}]}
main.main
	/workspace/main.go:174
runtime.main
	/usr/local/go/src/runtime/proc.go:250

At least the behaviour is consistent.

@Miles-Garnsey

OK, it looks like the problem here was that the contextName needs to match both the k8sContext within the K8ssandraCluster and the map keys for the context, server, and user.

Running this command against an MC cluster appears to result in a successful deployment now:

./kubectl-k8ssandra register --source-context gke_k8ssandra_us-central1-c_registration-2 --dest-context gke_k8ssandra_us-central1-c_registration-1  --dest-namespace mission-control  --source-namespace mission-control --serviceaccount-name mission-control

I've made additional changes which fix problems relating to "already exists"-type errors. We now default to a sanitized source context name for the contextName and meta.name fields in the clientConfig too, propagating the same value into the kubeconfig within the secret.
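
For reference, the sanitization referred to here amounts to making the GKE-style context name usable as a Kubernetes resource name; a minimal sketch (the helper below is illustrative, not the actual implementation):

package registration

import "strings"

// sanitizeContextName lower-cases the context name and replaces underscores
// with dashes so it is a valid DNS-1123 resource name, e.g.
// gke_k8ssandra_us-central1-c_registration-2 -> gke-k8ssandra-us-central1-c-registration-2.
// The same value is then used for the ClientConfig name, the kubeconfig map
// keys, and the K8ssandraCluster's k8sContext field so they all match.
func sanitizeContextName(name string) string {
	return strings.ToLower(strings.ReplaceAll(name, "_", "-"))
}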

@adejanovski probably ready for another test now I think.

@adejanovski adejanovski left a comment

Everything is working fine now. Thanks!

@Miles-Garnsey Miles-Garnsey merged commit 82f4078 into k8ssandra:main May 27, 2024
2 checks passed
Successfully merging this pull request may close these issues.

Move the dataplane registration into the kubectl plugin