How to debug OpenShift operators on a live cluster using dlv

April 24, 2023
Swarup Ghosh
Related topics: Containers, Go, Kubernetes, Operators
Related products:
Red Hat OpenShift


    Debugging operators can be tricky, especially when the operator must be debugged on a live cluster, as is often the case when developing Red Hat OpenShift cluster operators. Running the Delve debugger remotely inside the container helps here. This article shows how to debug an operator live in an OpenShift cluster by rebuilding the operator container image and connecting to dlv remotely through oc port-forward.

    About cluster operators and Delve debugger

    Kubernetes operators manage the lifecycle of applications within a Kubernetes cluster. The operator pattern aims to simplify the installation, management, and configuration of applications and services. OpenShift is an operator-first platform whose architecture is built around operators. In OpenShift, operators manage the lifecycle of the running cluster as well as the applications that run on top of it. Each OpenShift installation comes with a set of default operators, known as cluster operators, which manage different aspects of the cluster. Cluster creation is marked complete once all the cluster operators running in the cluster reach a healthy running state.

    The Cluster Version Operator (CVO) is one of the important cluster operators: it reconciles the resources within the cluster to match their desired state while ensuring that the other cluster operators remain healthy. Each cluster operator manages a specific area of the cluster’s functionality. Besides other necessary values, each operator’s deployment manifest carries a set of args that depend on the cluster’s configuration.

    For this example, we will use the cluster-kube-apiserver-operator running on an OpenShift cluster and debug it live and remotely from VS Code using the dlv debugger.

    Delve is one of the most widely used debuggers for Go. It can debug a Go binary remotely over a TCP port, which gives developers debug access to the operator binary running inside the actual cluster.

    Debugging tutorial steps

    The following tutorial is aimed at allowing developers to debug operators running on the cluster.

    The first step in modifying any cluster operator running on OpenShift is to disable the CVO. This allows the cluster operator's deployment manifest to be tweaked without the CVO reconciling it back to the default image. With the kubeconfig of the running cluster, the following oc command disables the CVO completely:

    $ oc scale --replicas=0 deploy/cluster-version-operator -n openshift-cluster-version
    

    Alternatively, if the cluster operator allows it, the operator can be set to an unmanaged state through an override in the ClusterVersion object. For the kube-apiserver operator, it would be as follows:

    $ oc patch clusterversion/version --type='merge' -p "$(cat <<- EOF
    spec:
      overrides:
      - group: apps
        kind: Deployment
        name: kube-apiserver-operator
        namespace: openshift-kube-apiserver-operator
        unmanaged: true
    EOF
    )"
    

    Either method works. The difference is that the first disables the CVO completely, while the second only allows changes to the kube-apiserver operator's deployment. Note that these steps are not required if you plan to use this tutorial to debug operators that are not OpenShift cluster operators. In that case, you can start from this point.
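
Once debugging is finished, the cluster should be handed back to the CVO. The following is a minimal sketch of undoing either method, wrapped in shell functions so they can be sourced and called when needed. The function names are hypothetical, and the second function assumes the override added above is the only entry under spec.overrides.

```shell
# Hypothetical helpers; the function names are not from the article.

# Undo method 1: scale the CVO back up to its single replica.
restore_cvo() {
  oc scale --replicas=1 deploy/cluster-version-operator -n openshift-cluster-version
}

# Undo method 2: drop the overrides list from the ClusterVersion object.
# Assumes the kube-apiserver-operator entry is the only override present.
remove_cvo_override() {
  oc patch clusterversion/version --type=json \
    -p '[{"op": "remove", "path": "/spec/overrides"}]'
}
```

After the CVO is managing the deployment again, it should reconcile the operator back to the default image.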

    The deployment for the kube-apiserver operator can be displayed as follows:

    $ oc get deployment/kube-apiserver-operator -o yaml -n openshift-kube-apiserver-operator
    ---
      name: kube-apiserver-operator
      namespace: openshift-kube-apiserver-operator
      ownerReferences:
      - apiVersion: config.openshift.io/v1
        kind: ClusterVersion
        name: version
        uid: 4b0f3c33-ade3-4e67-832f-169f8e297639
    ---
    spec:
    ---
        spec:
          automountServiceAccountToken: false
          containers:
          - args:
            - --config=/var/run/configmaps/config/config.yaml
            command:
            - cluster-kube-apiserver-operator
            - operator
            env:
            - name: IMAGE
              value: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0314d2c0df2cf572cf8cfd13212c04dff8ef684f1cdbb93e22027c86852f1954
            - name: OPERATOR_IMAGE
              value: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2766c4f3330423b642ad82eaa5df66a5a46893594a685fd0562a6460a2719187
            - name: OPERAND_IMAGE_VERSION
              value: 1.25.2
    ---
    

    As the output shows, this operator needs various container args and environment variables set in its container spec to function properly. When debugging operators, developers usually start by running the Go binaries locally, but setting these args and environment variables by hand gets cumbersome. To avoid this, we can rebuild the operator’s image with a debugger-friendly binary, push it to a registry, and use it in the operator’s deployment manifest.

    For kube-apiserver-operator, the source can be obtained by cloning its GitHub repository. This step should be similar for any other operator; alternatively, have the source files available locally before continuing.

    Launch VS Code from the local folder containing the source files, then make the following changes to the operator's Dockerfile.

    We create a copy of the Dockerfile and ensure that the Go binaries are built with the gcflags “all=-N -l”. These can be passed either on the command line using go build -gcflags="all=-N -l" or by setting the environment variable GCFLAGS. This must be done in the builder stage of the Dockerfile, where the binaries are compiled from source. For kube-apiserver-operator, the environment variable was set.

    FROM ... AS builder
    RUN go install -mod=readonly github.com/go-delve/delve/cmd/dlv@latest
    ...
    COPY . .
    ENV GO_PACKAGE github.com/openshift/cluster-kube-apiserver-operator
    ENV GCFLAGS "all=-N -l"
    RUN make build --warn-undefined-variables
    ...
    

    The dlv binary installed in the builder stage also needs to be copied into the final image, using the following instruction:

    FROM ...
    COPY --from=builder /go/bin/dlv /usr/bin/
    ...
    

    This ensures that when the container starts as part of the operator deployment, we can run the binary under dlv and bind the debug stub inside the container to a port that can later be port-forwarded.

    The final Dockerfile would be as follows:

    FROM ... AS builder
    RUN go install -mod=readonly github.com/go-delve/delve/cmd/dlv@latest
    WORKDIR /go/src/github.com/openshift/cluster-kube-apiserver-operator
    COPY . .
    ENV GO_PACKAGE github.com/openshift/cluster-kube-apiserver-operator
    ENV GCFLAGS "all=-N -l"
    RUN make build --warn-undefined-variables
    
    FROM ...
    COPY --from=builder /go/bin/dlv /usr/bin/
    ...
    

    The same can be done for any other operator, including ones that are not cluster operators. The only requirement is to build the Go binary with -N and -l, which disable compiler optimizations and inlining so that the debugger can map the running code back to the source. The debug binary, and hence the debug image, may be larger than the stripped binary shipped in production operators.

    After saving the modified Dockerfile as Dockerfile.debug, we can build the image and push it to the registry using:
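
For an operator whose build does not read a GCFLAGS variable, the same flags can be passed to go build directly. This is a minimal sketch only; the function name, output path, and package path are hypothetical placeholders.

```shell
# Build a debugger-friendly binary: -N disables optimizations, -l disables inlining.
# Usage: build_debug_binary <output-path> <package-path>
build_debug_binary() {
  go build -gcflags="all=-N -l" -o "$1" "$2"
}
```

In a Dockerfile, the equivalent go build invocation would go in a RUN instruction of the builder stage.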

    $ podman build -t quay.io/<USERNAME>/<REPO>:<ANY_TAG> -f Dockerfile.debug .
    [1/2] STEP 1/7: ...
    ---
    [2/2] COMMIT quay.io/swghosh/cluster-kube-apiserver-operator:debug
    --> b18f722bd49
    Successfully tagged quay.io/swghosh/cluster-kube-apiserver-operator:debug
    b18f722bd49ad82b8763917800eb0481ef0135b6b1f619973a6fb7c144a09cef
    
    $ podman push quay.io/<USERNAME>/<REPO>:<ANY_TAG>
    ---
    Copying blob 7c33fa50bff3 done  
    Copying config b18f722bd4 done  
    Writing manifest to image destination
    Storing signatures
    

    The next step is to patch the deployment of the running operator to use this newly prepared image and to alter its container command and args so that the operator runs under dlv.

    $ oc edit deployment kube-apiserver-operator -n openshift-kube-apiserver-operator
    # Change spec.template.spec.containers[0].command and .args to run the operator under dlv
          - args:
            - --listen=:40000
            - --headless=true
            - --api-version=2
            - --accept-multiclient
            - exec
            - /usr/bin/cluster-kube-apiserver-operator
            - --
            - operator
            - --config=/var/run/configmaps/config/config.yaml
            command:
            - /usr/bin/dlv
    # Change spec.template.spec.containers[0].image to the debug image
            image: quay.io/swghosh/cluster-kube-apiserver-operator:debug
    

    The dlv binary runs in the container and in turn executes the operator's Go binary. For any operator, the patched container ends up running the command: /usr/bin/dlv --listen=:40000 --headless=true --api-version=2 --accept-multiclient exec /usr/bin/<operator_binary> -- <other_operator_args>. The --headless and --listen arguments are required to run dlv as a stub in headless mode and bind it to a container port that we can access later. Once edited, save the deployment and close the editor so that the new operator pod is rolled out.
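
The same three changes can also be applied without an interactive editor. The following is a sketch that uses a JSON patch with oc patch, assuming the container is at index 0 and already has command, args, and image set (as in the deployment shown earlier). The function name is hypothetical, and the image is passed in as an argument.

```shell
# Hypothetical helper: apply the command, args, and image changes in one JSON patch.
# Usage: patch_operator_for_dlv <debug-image>
patch_operator_for_dlv() {
  oc patch deployment kube-apiserver-operator \
    -n openshift-kube-apiserver-operator --type=json -p "$(cat <<EOF
[
  {"op": "replace", "path": "/spec/template/spec/containers/0/command",
   "value": ["/usr/bin/dlv"]},
  {"op": "replace", "path": "/spec/template/spec/containers/0/args",
   "value": ["--listen=:40000", "--headless=true", "--api-version=2",
             "--accept-multiclient", "exec",
             "/usr/bin/cluster-kube-apiserver-operator", "--",
             "operator", "--config=/var/run/configmaps/config/config.yaml"]},
  {"op": "replace", "path": "/spec/template/spec/containers/0/image",
   "value": "$1"}
]
EOF
)"
}
```

For example, patch_operator_for_dlv quay.io/<USERNAME>/<REPO>:<ANY_TAG> would point the deployment at the image pushed in the previous step.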

    Verify that the new operator pod is running after the manifest change as follows:

    $ oc get pods -n openshift-kube-apiserver-operator
    NAME                                       READY   STATUS    RESTARTS   AGE
    kube-apiserver-operator-86c5fc45cd-rr695   1/1     Running   0          29s
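
As an alternative to polling oc get pods, the rollout can be awaited and the container log inspected. This is a minimal sketch; the function name and the timeout value are assumptions. Because dlv is started without --continue, the operator process is expected to stay paused until a debug client connects, so the log may show little beyond dlv's own startup output at this point.

```shell
# Hypothetical helper: block until the patched deployment has rolled out,
# then print the tail of the container log.
wait_for_debug_pod() {
  ns=openshift-kube-apiserver-operator
  oc rollout status deployment/kube-apiserver-operator -n "$ns" --timeout=120s &&
    oc logs deployment/kube-apiserver-operator -n "$ns" --tail=20
}
```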
    

    Once the pod is in the Running state, port-forward port 40000 from the container; this is the dlv debug port set earlier with --listen. Traffic to localhost:40000 is then forwarded to port 40000, where the dlv process listens inside the container. Use the name of your own operator pod as reported by oc get pods, and re-run the command if the port-forward connection times out.

    $ oc port-forward pod/kube-apiserver-operator-65bd9656cc-jdvjr 40000 -n openshift-kube-apiserver-operator
    Forwarding from 127.0.0.1:40000 -> 40000
    Forwarding from [::1]:40000 -> 40000
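
Because the pod name changes with every rollout, the port-forward can also be addressed through the deployment, and a terminal dlv client can then attach to the forwarded port. This is a sketch with hypothetical function names; it assumes dlv is also installed on the local machine.

```shell
# Hypothetical helpers; the function names are not from the article.

# Forward local port 40000 to the operator pod, addressed through its
# deployment so the generated pod name does not have to be looked up.
forward_dlv_port() {
  oc port-forward deployment/kube-apiserver-operator 40000 \
    -n openshift-kube-apiserver-operator
}

# Attach a terminal dlv client to the forwarded port (run in a second shell).
connect_dlv() {
  dlv connect localhost:40000
}
```

The port-forward runs in the foreground, so connect_dlv belongs in a separate terminal.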
    

    Now the operator can be debugged from VS Code, or from the command line by connecting dlv to localhost:40000. To debug it using VS Code, select the remote attach option under Run > Add Configuration > Go: Connect to Server (as shown in Figure 1). This allows remote Go debugging.

    Figure 1. In VS Code, select the "Go: Connect to Server" debug target to allow remote Go debugging.

    Next, connect to the localhost (as shown in Figure 2).

    Figure 2. In VS Code, set the remote host of the debug target to "localhost".

    Then, set the port to 40000 (as shown in Figure 3).

    Figure 3. In the VS Code window, set the debug port to 40000.

    Finally, the launch.json in the .vscode directory of the local source directory will contain something similar to what is shown in Figure 4. Its contents are generated automatically from the remote host and port set in the previous steps.

    Figure 4. The contents of launch.json after setting the debug host and port should look similar to this.
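
For readers without access to the figure, a remote-attach configuration for the VS Code Go extension might look roughly like the sketch below, written out from the shell. The substitutePath entry is an assumption that is not part of the generated file: it maps the local checkout to the WORKDIR used in the Dockerfile's builder stage, and may or may not be needed depending on how the binary was built. LAUNCH_JSON is a hypothetical variable for the target path.

```shell
# Write a remote-attach configuration for the VS Code Go extension.
# LAUNCH_JSON is a hypothetical override for the target path; note that this
# overwrites any existing file at that path.
LAUNCH_JSON="${LAUNCH_JSON:-.vscode/launch.json}"
mkdir -p "$(dirname "$LAUNCH_JSON")"
cat > "$LAUNCH_JSON" <<'EOF'
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Connect to server",
      "type": "go",
      "request": "attach",
      "mode": "remote",
      "port": 40000,
      "host": "127.0.0.1",
      "substitutePath": [
        {
          "from": "${workspaceFolder}",
          "to": "/go/src/github.com/openshift/cluster-kube-apiserver-operator"
        }
      ]
    }
  ]
}
EOF
```

The host and port correspond to the local end of the oc port-forward session.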

    After launch.json is set up with the necessary details, start the Debug > Connect to server target (as shown in Figure 5) to begin live debugging of the Go binary running under Delve inside the cluster.

    Figure 5. From the debug explorer in VS Code, select the "Connect to server" debug target to start live debugging.

    Now you can set breakpoints, watch variables, inspect the current call stack, and do much more with your operator, all while it runs live in the cluster.

    Running remote delve simplifies debugging operators

    We have illustrated how to simplify debugging operators by live debugging using dlv remotely. You can say goodbye to eerie print statements. If you have questions, please comment below. We welcome your feedback.

    Last updated: September 19, 2023
