
Dynatrace / Kubeshark / Openshift compatibility #1467

Open
alongir opened this issue Dec 12, 2023 · 4 comments

alongir commented Dec 12, 2023

A problem was reported when Kubeshark was installed on an OpenShift cluster that also had Dynatrace installed.

The environment where the problem was reported:

Client Version: 4.12.9
Kustomize Version: v4.5.7
Server Version: 4.12.19
Kubernetes Version: v1.25.8+37a9a08
  • Set up an environment (a sketch follows this list)
  • Test compatibility
  • Provide detailed instructions on how to work around any issues found
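
A minimal sketch of such an environment setup (the Kubeshark Helm repo URL and the Dynatrace Operator install step below are assumptions; the Dynatrace deployment wizard provides the exact manifests and tokens for a given tenant):

% helm repo add kubeshark https://helm.kubeshark.co   # Kubeshark Helm chart repo (assumed current URL)
% helm install kubeshark kubeshark/kubeshark          # deploys the hub, front and worker DaemonSet
% oc new-project dynatrace                            # namespace for the Dynatrace Operator
% oc apply -f <openshift.yaml from the Dynatrace Operator release>   # placeholder, not a literal path
% oc apply -f dynakube.yaml                           # DynaKube CR generated by the Dynatrace wizard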

berezins commented Dec 13, 2023

Hi @alongir ,

I'm unable to reproduce the issue on a simple non-STS, single-zone (with 2 replicas) ROSA (https://aws.amazon.com/es/rosa/) managed OpenShift cluster with the versions below, on which I installed the latest Dynatrace and then Kubeshark:

% oc version                                                  
Client Version: 4.14.3
Kustomize Version: v5.0.1
Server Version: 4.12.19
Kubernetes Version: v1.25.8+37a9a08

As you can see, all Kubeshark and Dynatrace pods are running successfully without any errors:

% oc get pods --all-namespaces | grep -E 'kubeshark|dynatrace'

default                                            kubeshark-front-647bcc7f66-nfr2j                                               1/1     Running            0                73m
default                                            kubeshark-hub-6f68c99d8-xtd8l                                                  1/1     Running            0                73m
default                                            kubeshark-worker-daemon-set-6pkhm                                              2/2     Running            2 (46m ago)      73m
default                                            kubeshark-worker-daemon-set-7qhz4                                              2/2     Running            1 (42m ago)      73m
default                                            kubeshark-worker-daemon-set-k5fwz                                              2/2     Running            1 (26m ago)      68m
default                                            kubeshark-worker-daemon-set-nmntq                                              2/2     Running            0                73m
default                                            kubeshark-worker-daemon-set-pbxjj                                              2/2     Running            4 (9m28s ago)    73m
default                                            kubeshark-worker-daemon-set-vbkd7                                              2/2     Running            1 (70m ago)      73m
default                                            kubeshark-worker-daemon-set-wd999                                              2/2     Running            0                73m
dynatrace                                          dynatrace-operator-77fdbcb56-vb8tp                                             1/1     Running            0                89m
dynatrace                                          dynatrace-webhook-5dd6dcc547-5dvmh                                             1/1     Running            0                89m
dynatrace                                          dynatrace-webhook-5dd6dcc547-6btc7                                             1/1     Running            0                89m
dynatrace                                          my-dynatrace-openshift-activegate-0                                            1/1     Running            0                87m
dynatrace                                          my-dynatrace-openshift-oneagent-4qjpj                                          1/1     Running            0                87m
dynatrace                                          my-dynatrace-openshift-oneagent-cxbxz                                          1/1     Running            0                87m
dynatrace                                          my-dynatrace-openshift-oneagent-glchh                                          1/1     Running            0                87m
dynatrace                                          my-dynatrace-openshift-oneagent-m25dj                                          1/1     Running            0                87m
dynatrace                                          my-dynatrace-openshift-oneagent-vf46k                                          1/1     Running            0                87m

also visible here:
(screenshot)

I also double-checked that both the Dynatrace and Kubeshark agents run on the same worker node:

% oc get nodes
NAME                                            STATUS   ROLES                  AGE     VERSION
ip-10-0-162-155.eu-central-1.compute.internal   Ready    control-plane,master   3h57m   v1.25.8+37a9a08
ip-10-0-183-189.eu-central-1.compute.internal   Ready    control-plane,master   3h56m   v1.25.8+37a9a08
ip-10-0-187-152.eu-central-1.compute.internal   Ready    infra,worker           3h27m   v1.25.8+37a9a08
ip-10-0-202-225.eu-central-1.compute.internal   Ready    infra,worker           3h26m   v1.25.8+37a9a08
ip-10-0-211-103.eu-central-1.compute.internal   Ready    worker                 3h40m   v1.25.8+37a9a08
ip-10-0-213-122.eu-central-1.compute.internal   Ready    worker                 3h41m   v1.25.8+37a9a08
ip-10-0-215-169.eu-central-1.compute.internal   Ready    control-plane,master   3h57m   v1.25.8+37a9a08
% oc debug node/ip-10-0-213-122.eu-central-1.compute.internal 
Temporary namespace openshift-debug-7rbmz is created for debugging node...
Starting pod/ip-10-0-213-122eu-central-1computeinternal-debug-tvqkk ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.213.122
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# pgrep -l '\bworker|dynatrace'
30186 dynatrace-opera
107890 worker
sh-4.4# 

Could you please give me the contact of the person who reported the issue, so I can try to figure out more details on my own?


berezins commented Dec 19, 2023

Update

Excluding everything inside a specific namespace, as described here https://docs.dynatrace.com/docs/shortlink/annotate#exclude-specific-namespaces-from-being-monitored, does not work on a vanilla Kubernetes cluster either (EKS in our case, used for testing), even though the output below shows that the exclusion has been applied to the cluster (see the Namespace Selector section).
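
For reference, the relevant part of the DynaKube spec that was applied looks roughly like this (a sketch reconstructed from the describe output below):

spec:
  namespaceSelector:
    matchExpressions:
      - key: kubernetes.io/metadata.name
        operator: NotIn
        values:
          - kubeshark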

% kubectl describe dynakubes.dynatrace.com eks-my-test-cluster  -n dynatrace       
Name:         eks-my-test-cluster
Namespace:    dynatrace
Labels:       <none>
Annotations:  <none>
API Version:  dynatrace.com/v1beta1
Kind:         DynaKube
Metadata:
  Creation Timestamp:  2023-12-19T21:30:29Z
  Generation:          2
  Resource Version:    3879438
  UID:                 dbd2f659-518a-4265-a843-255a9084570a
Spec:
  Active Gate:
    Capabilities:
      routing
      kubernetes-monitoring
      dynatrace-api
    Image:  
    Resources:
      Limits:
        Cpu:     1000m
        Memory:  1.5Gi
      Requests:
        Cpu:     500m
        Memory:  512Mi
  API URL:       https://ojl75803.live.dynatrace.com/api
  Namespace Selector:
    Match Expressions:
      Key:       kubernetes.io/metadata.name
      Operator:  NotIn
      Values:
        kubeshark
  One Agent:
    Classic Full Stack:
      Env:
        Name:   ONEAGENT_ENABLE_VOLUME_STORAGE
        Value:  false
      Image:    
      Tolerations:
        Effect:     NoSchedule
        Key:        node-role.kubernetes.io/master
        Operator:   Exists
        Effect:     NoSchedule
        Key:        node-role.kubernetes.io/control-plane
        Operator:   Exists
  Skip Cert Check:  true
Status:
  Active Gate:
    Connection Info Status:
      Endpoints:           https://sg-eu-west-1-63-34-47-236-prod29-ireland.live.dynatrace.com/communication,https://sg-eu-west-1-34-251-229-42-prod29-ireland.live.dynatrace.com/communication,https://sg-eu-west-1-18-203-197-59-prod29-ireland.live.dynatrace.com/communication,https://ojl75803.live.dynatrace.com:443/communication
      Last Request:        2023-12-19T23:01:14Z
      Tenant UUID:         ojl75803
    Image ID:              ojl75803.live.dynatrace.com/linux/activegate:latest
    Last Probe Timestamp:  2023-12-19T23:01:14Z
    Source:                tenant-registry
    Version:               1.279.112.20231130-012710
  Code Modules:
  Conditions:
    Last Transition Time:  2023-12-19T21:30:30Z
    Message:               
    Reason:                TokenReady
    Status:                True
    Type:                  Tokens
  Dynatrace API:
    Last Token Scope Request:  2023-12-19T23:01:14Z
  Kube System UUID:            e7b13ee5-cfb7-4151-807d-969d8b7bcc13
  One Agent:
    Connection Info Status:
      Communication Hosts:
        Host:        sg-eu-west-1-34-251-229-42-prod29-ireland.live.dynatrace.com
        Port:        443
        Protocol:    https
        Host:        sg-eu-west-1-18-203-197-59-prod29-ireland.live.dynatrace.com
        Port:        443
        Protocol:    https
        Host:        sg-eu-west-1-63-34-47-236-prod29-ireland.live.dynatrace.com
        Port:        443
        Protocol:    https
        Host:        ojl75803.live.dynatrace.com
        Port:        443
        Protocol:    https
      Endpoints:     https://sg-eu-west-1-34-251-229-42-prod29-ireland.live.dynatrace.com/communication;https://sg-eu-west-1-18-203-197-59-prod29-ireland.live.dynatrace.com/communication;https://sg-eu-west-1-63-34-47-236-prod29-ireland.live.dynatrace.com/communication;https://ojl75803.live.dynatrace.com:443
      Last Request:  2023-12-19T23:01:14Z
      Tenant UUID:   ojl75803
    Healthcheck:
      Interval:      10000000000
      Retries:       3
      Start Period:  1200000000000
      Test:
        /usr/bin/watchdog-healthcheck64
      Timeout:  30000000000
    Image ID:   ojl75803.live.dynatrace.com/linux/oneagent:latest
    Instances:
      ip-192-168-26-198.us-west-1.compute.internal:
        Ip Address:  192.168.26.198
        Pod Name:    eks-my-test-cluster-oneagent-gcrbr
      ip-192-168-38-39.us-west-1.compute.internal:
        Ip Address:               192.168.38.39
        Pod Name:                 eks-my-test-cluster-oneagent-sfhxx
    Last Instance Status Update:  2023-12-19T23:06:17Z
    Last Probe Timestamp:         2023-12-19T23:01:14Z
    Source:                       tenant-registry
    Version:                      1.279.166.20231201-090952
  Phase:                          Deploying
  Synthetic:
  Updated Timestamp:  2023-12-19T23:06:17Z
Events:               <none>

However, as can be seen in the screenshot below, "Deep monitoring" of the Kubeshark hub pod/process, which lives in the kubeshark namespace (and should therefore be excluded according to the above), is "Active":
(screenshot)
Meanwhile, e.g. the Kubeshark worker and tracer pods/processes are in the state "Activation of deep monitoring is unsuccessful", as shown in the screenshot below. This means the Dynatrace OneAgent still tries, unsuccessfully, to monitor these pods/processes, which under some circumstances might cause problems/conflicts.
(screenshot)
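
As a cluster-side cross-check, independent of the Dynatrace UI, one could verify from a node debug shell (as in the earlier comment) whether the OneAgent code module is actually mapped into a Kubeshark process. A sketch, assuming the injected library path contains "oneagent":

sh-4.4# pid=$(pgrep -n worker)
sh-4.4# grep -c -i oneagent /proc/$pid/maps   # non-zero means the agent library is loaded into the process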

Next step(s)

Today I provided everything above, along with a link to the hub pod/process monitoring view in the Dynatrace web app on the running cluster, to Dynatrace Live support (Support => Live Chat) as evidence, so they can double-check and confirm it on their own. I'm now waiting for their response; they usually respond within a few hours during working hours, Central European time.


berezins commented Dec 20, 2023

I got confirmation from Dynatrace support that excluding pods/processes from monitoring (whether per pod or for a whole namespace) does not work with Dynatrace

  1. Classic Full Stack,
  2. but only with Cloud-Native Full Stack, based on this template: https://github.com/Dynatrace/dynatrace-operator/blob/main/assets/samples/dynakube/cloudNativeFullStack.yaml .

So if the customer uses the first mode, there is no way to configure any exclusion for them, unless they are ready to switch to the second one, which is actually quite easy and straightforward and only a matter of running a few commands (see the sketch below); I'm going to figure that out, test it, and provide it a little later today. It does not cause the loss of any existing monitoring data or features.
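
A sketch of that switch (using dynakube.yaml as a stand-in name for wherever the DynaKube manifest is kept; the only change in the manifest is the oneAgent mode):

  oneAgent:
    cloudNativeFullStack:      # was: classicFullStack
      tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/master
          operator: Exists
        - effect: NoSchedule
          key: node-role.kubernetes.io/control-plane
          operator: Exists

then re-apply the manifest and watch the DynaKube status:

% kubectl apply -f dynakube.yaml
% kubectl get dynakube -n dynatrace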

@berezins

Finally, unfortunately, I was unable to get option 2 above (Cloud-Native Full Stack) deployed/connected/configured successfully on either the OpenShift or the vanilla EKS Kubernetes cluster. In both cases it succeeds if classicFullStack is specified in the deployment manifest below, but fails (the command below shows the DynaKube in an Error state) when classicFullStack is replaced with cloudNativeFullStack (i.e. that single change only):

% kubectl get dynakube -n dynatrace
NAME              APIURL                                    STATUS   AGE
my-opshift-test   https://ojl75803.live.dynatrace.com/api   Error    33s
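
To dig into what exactly is failing, the usual next step would be to inspect the DynaKube status conditions and the operator logs; a sketch, using the resource and deployment names from this setup:

% kubectl describe dynakube my-opshift-test -n dynatrace
% kubectl logs -n dynatrace deployment/dynatrace-operator

The manifest used was: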
apiVersion: dynatrace.com/v1beta1
kind: DynaKube
metadata:
  name: my-opshift-test
  namespace: dynatrace
spec:
  # Dynatrace apiUrl including the `/api` path at the end.
  # For SaaS, set `ENVIRONMENTID` to your environment ID.
  # For Managed, change the apiUrl address.
  # For instructions on how to determine the environment ID and how to configure the apiUrl address, see https://www.dynatrace.com/support/help/reference/dynatrace-concepts/environment-id/.
  apiUrl: https://ojl75803.live.dynatrace.com/api

  # Optional: Name of the secret holding the credentials required to connect to the Dynatrace tenant
  # If unset, the name of this custom resource is used
  #
  # tokens: ""

  # Optional: Defines a custom pull secret in case you use a private registry when pulling images from the Dynatrace environment
  # The secret has to be of type 'kubernetes.io/dockerconfigjson' (see https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/)
  #
  # customPullSecret: "custom-pull-secret"

  # Optional: Disable certificate validation checks for installer download and API communication
  #
  skipCertCheck: true

  # Optional: Set custom proxy settings either directly or from a secret with the field 'proxy'
  #
  # proxy:
  #   value: my-proxy-url.com
  #   valueFrom: name-of-my-proxy-secret

  # Optional: Adds custom RootCAs from a configmap
  # The key to the data must be "certs"
  # This property only affects certificates used to communicate with the Dynatrace API.
  # The property is not applied to the ActiveGate
  #
  # trustedCAs: name-of-my-ca-configmap

  # Optional: Sets Network Zone for OneAgent and ActiveGate pods
  # Make sure networkZones are enabled on your cluster before (see https://www.dynatrace.com/support/help/setup-and-configuration/network-zones/network-zones-basic-info/)
  #
  # networkZone: name-of-my-network-zone

  # Optional: If enabled, and if Istio is installed on the Kubernetes environment, the
  # Operator will create the corresponding VirtualService and ServiceEntry objects to allow access
  # to the Dynatrace cluster from agents or activeGates. Disabled by default.
  #
  # enableIstio: false

  # The namespaces which should be injected into
  # If unset, all namespace will be injected into
  # namespaceSelector has no effect on hostMonitoring or classicFullstack
  # For examples regarding namespaceSelectors, see https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#resources-that-support-set-based-requirements
  #
  # namespaceSelector:
  #   matchLabels:
  #     app: my-app
  #   matchExpressions:
  #    - key: app
  #      operator: In
  #      values: [my-frontend, my-backend, my-database]

  # Configuration for OneAgent instances
  #
  oneAgent:
    # Enables cloud-native fullstack monitoring and changes its settings
    # Cannot be used in conjunction with classic fullstack monitoring, application-only monitoring or host monitoring
    #
    cloudNativeFullStack:
      # Optional: Sets a node selector to control on which nodes the OneAgent will be deployed.
      # For more information on node selectors, see https://kubernetes.io/docs/tasks/configure-pod-container/assign-pods-nodes/
      #
      # nodeSelector: {}

      # Optional: Sets the priority class assigned to the OneAgent Pods. No class is set by default.
      # For more information on priority classes, see https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/
      #
      # priorityClassName: priority-class

      # Optional: Specifies tolerations to include with the OneAgent DaemonSet.
      # For more information on tolerations, see https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
      #
      tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/master
          operator: Exists
        - effect: NoSchedule
          key: node-role.kubernetes.io/control-plane
          operator: Exists

      # Optional: Adds resource settings for OneAgent container
      # Consumption of the OneAgent heavily depends on the workload to monitor
      # The values should be adjusted according to the workload
      #
      # oneAgentResources:
      #   requests:
      #     cpu: 100m
      #     memory: 512Mi
      #   limits:
      #     cpu: 300m
      #     memory: 1.5Gi

      # Optional: Adds custom arguments to the OneAgent installer
      # For a list of available options, see https://www.dynatrace.com/support/help/shortlink/linux-custom-installation
      # For a list of the limitations for OneAgents in Docker, see https://www.dynatrace.com/support/help/shortlink/oneagent-docker#limitations
      #
      # args: []

      # Optional: Adds custom environment variables to OneAgent pods
      #
      # env: []

      # Optional: Enables or disables automatic updates of OneAgent pods
      # By default, if a new version is available, the OneAgent pods are restarted to apply the update
      # If set to "false", this behavior is disabled
      # Defaults to "true"
      #
      # autoUpdate: true

      # Optional: Sets the DNS Policy for OneAgent pods
      # Defaults to "ClusterFirstWithHostNet"
      # For more information on DNS policies, see https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy
      #
      # dnsPolicy: "ClusterFirstWithHostNet"

      # Optional: Adds custom annotations to OneAgent pods
      #
      # annotations:
      #   custom: annotation

      # Optional: Adds custom labels to OneAgent pods
      # Can be used to structure workloads
      #
      # labels:
      #   custom: label

      # Optional: Sets the URI for the image containing the OneAgent installer used by the DaemonSet
      # Defaults to the latest OneAgent image on the tenant's registry
      #
      # image: ""

      # Optional: If specified, indicates the OneAgent version to use
      # Defaults to the configured version on your Dynatrace environment
      # The version is expected to be provided in the semver format
      # Example: {major.minor.release}, e.g., "1.200.0"
      #
      # version: ""

      # Optional: Defines resources requests and limits for the initContainer
      # See more: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers
      #
      # initResources:
      #   requests:
      #     cpu: 100m
      #     memory: 512Mi
      #   limits:
      #     cpu: 300m
      #     memory: 1.5Gi

      # Optional: The URI of the image that contains the codemodules specific OneAgent that will be injected into pods and applications.
      # For an example of a Dockerfile creating such an image, see https://dt-url.net/operator-docker-samples
      #
      # codeModulesImage: ""

  # Configuration for ActiveGate instances.
  #
  activeGate:
    # Specifies which capabilities will be enabled on ActiveGate instances
    # The following capabilities can be set:
    # - routing
    # - kubernetes-monitoring
    # - metrics-ingest
    # - dynatrace-api
    #
    capabilities:
      - routing
      - kubernetes-monitoring
      - dynatrace-api

    # Optional: Sets how many ActiveGate pods are spawned by the StatefulSet
    # Defaults to "1"
    #
    # replicas: 1

    # Optional: Sets the image used to deploy ActiveGate instances
    # Defaults to the latest ActiveGate image on the tenant's registry
    # Example: "ENVIRONMENTID.live.dynatrace.com/linux/activegate:latest"
    #
    # image: ""

    # Recommended: Sets the activation group for ActiveGate instances
    #
    # group: ""

    # Optional: Defines a custom properties file, the file contents can be provided either as a value in this yaml or as a reference to a secret.
    # If a reference to a secret is used, then the file contents must be stored under the 'customProperties' key within the secret.
    #
    # customProperties:
    #   value: |
    #     [connectivity]
    #     networkZone=
    #   valueFrom: myCustomPropertiesConfigMap

    # Optional: Specifies resource settings for ActiveGate instances
    # Consumption of the ActiveGate heavily depends on the workload to monitor
    # The values should be adjusted according to the workload
    #
    resources:
      requests:
        cpu: 500m
        memory: 512Mi
      limits:
        cpu: 1000m
        memory: 1.5Gi

    # Optional: Sets a node selector to control on which nodes the ActiveGate will be deployed.
    # For more information on node selectors, see https://kubernetes.io/docs/tasks/configure-pod-container/assign-pods-nodes/
    #
    # nodeSelector: {}

    # Optional: Specifies tolerations to include with the ActiveGate StatefulSet.
    # For more information on tolerations, see https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
    #
    # tolerations:
    # - effect: NoSchedule
    #   key: node-role.kubernetes.io/master
    #   operator: Exists

    # Optional: Adds custom labels to ActiveGate pods
    # Can be used to structure workloads
    #
    # labels:
    #   custom: label

    # Optional: Adds custom environment variables to ActiveGate pods
    #
    # env: []

    # Optional: Specifies the name of a secret containing a TLS certificate, a TLS key and the TLS key's password to be used by ActiveGate instances
    # If unset, a self-signed certificate is used
    # The secret is expected to have the following key-value pairs
    # server.p12: TLS certificate and TLS key pair in pkcs12 format
    # password: passphrase to decrypt the TLS certificate and TLS key pair
    #
    # tlsSecretName: "my-tls-secret"

    # Optional: Sets the DNS Policy for ActiveGate pods
    # Defaults to "Default"
    # For more information on DNS policies, see https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy
    #
    # dnsPolicy: "Default"

    # Optional: Specifies the priority class to assign to the ActiveGate Pods
    # No class is set by default
    # For more information on priority classes, see https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/
    #
    # priorityClassName: priority-class

    # Optional: Adds custom annotations to ActiveGate pods
    #
    # annotations:
    #   custom: annotation

    # Optional: Adds TopologySpreadConstraints to the ActiveGate pods
    # For more information on TopologySpreadConstraints, see https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/
    #
    # topologySpreadConstraints: []

I've contacted their support about this issue and will confirm whether it works once they provide me with a solution.
