Prevent auto-reboot during Argo CD sync with machine configs

December 20, 2021
Ishita Sequeira
Related topics:
CI/CD, GitOps, Kubernetes
Related products:
Red Hat OpenShift, Red Hat OpenShift Container Platform


    Nodes in Red Hat OpenShift can be updated automatically through OpenShift's Machine Config Operator (MCO). A machine config is a custom resource that helps a cluster manage the complete life cycle of its nodes. When a machine config resource is created or updated in a cluster, the MCO picks up the update, performs the necessary changes to the selected nodes, and restarts the nodes gracefully by cordoning, draining, and rebooting them. The MCO handles everything ranging from the kernel to the kubelet.

    However, interactions between the MCO and the GitOps workflow can introduce major performance issues and other undesired behavior. This article shows how to make the MCO and the Argo CD GitOps orchestration tool work well together.

    Machine configs and Argo CD: Performance challenges

    When using machine configs as part of a GitOps workflow, the following sequence can produce suboptimal performance:

    1. Argo CD starts a sync job after a commit to the Git repository containing application resources.
    2. If the sync applies a new or changed machine config, the MCO picks up the change and starts rebooting the nodes to apply it while the sync operation is still ongoing.
    3. If any of the rebooting nodes hosts the Argo CD application controller, the application controller terminates and the application sync is aborted.

    Because the MCO reboots the nodes in sequential order, and the Argo CD workloads can be rescheduled on each reboot, it could take some time for the sync to be completed. This could also result in undefined behavior until the MCO has rebooted all nodes affected by the machine configs within the sync.

    Extend the application's manifest in Git

    To avoid the problem described in the previous section, extend the application's manifest in Git by adding PreSync and PostSync hooks. Argo CD provides these hooks so that you can run operations of your choice before and after each sync (Figure 1). As the names suggest, Argo CD executes a PreSync hook right before the sync starts and a PostSync hook after the sync completes successfully.

    Figure 1. Sync hook workflow.

    We will use kam-blog as the sample application for this demo. We generated this application by following the directions in the article Bootstrap GitOps with Red Hat OpenShift Pipelines and kam CLI.

    Add sync hooks to Argo CD

    Our PreSync job pauses the machine config pool (MCP) so that the MCO does not reboot the pool's nodes to apply machine config changes while the sync is running. The job does this by setting the MCP's .spec.paused field to true.

    To insert the PreSync job, create a file named pre-sync-job.yaml and add it to the same directory as the application. The content of the file is:

    apiVersion: batch/v1
    kind: Job
    metadata:
      annotations:
        # Run this job before Argo CD applies the application's manifests.
        argocd.argoproj.io/hook: PreSync
        # Delete the job once it has completed successfully.
        argocd.argoproj.io/hook-delete-policy: HookSucceeded
      name: mcp-worker-pause-job
      namespace: openshift-gitops
    spec:
      template:
        spec:
          containers:
            - image: registry.redhat.io/openshift4/ose-cli:v4.4
              command:
                - /bin/bash
                - -c
                - |
                  # Exit non-zero if the patch fails so the hook job is reported as failed.
                  set -e
                  echo "Pausing the MCP $MCP."
                  oc patch --type=merge --patch='{"spec":{"paused":true}}' "machineconfigpool/$MCP"
                  sleep "$SLEEP"
                  echo "DONE"
              imagePullPolicy: IfNotPresent
              name: mcp-worker-pause-job
              env:
              - name: SLEEP
                value: "10"
              - name: MCP
                value: worker
          restartPolicy: Never
          serviceAccountName: sync-job-sa
    

    The PostSync hook resumes the MCP so that the MCO reboots the nodes and applies the queued or incoming machine config changes. The job does this by setting .spec.paused back to false. To insert the PostSync job, create a file named post-sync-job.yaml and add it to the same directory as the application. The content of the file is:

    apiVersion: batch/v1
    kind: Job
    metadata:
      annotations:
        # Run this job after the sync has completed successfully.
        argocd.argoproj.io/hook: PostSync
        # Delete the job once it has completed successfully.
        argocd.argoproj.io/hook-delete-policy: HookSucceeded
      name: mcp-worker-resume-job
      namespace: openshift-gitops
    spec:
      template:
        spec:
          containers:
            - image: registry.redhat.io/openshift4/ose-cli:v4.4
              command:
                - /bin/bash
                - -c
                - |
                  # Exit non-zero if the patch fails so the hook job is reported as failed.
                  set -e
                  echo "Resuming the MCP $MCP."
                  sleep "$SLEEP"
                  oc patch --type=merge --patch='{"spec":{"paused":false}}' "machineconfigpool/$MCP"
                  echo "DONE"
              imagePullPolicy: Always
              name: mcp-worker-resume-job
              env:
              - name: SLEEP
                value: "5"
              - name: MCP
                value: worker
          dnsPolicy: ClusterFirst
          restartPolicy: OnFailure
          serviceAccountName: sync-job-sa
          terminationGracePeriodSeconds: 30

    Add permissions for sync hooks

    In order for these jobs to execute successfully, they need permission to modify machine config pool resources in the cluster. Grant these permissions with a ServiceAccount and an appropriate ClusterRole and ClusterRoleBinding.

    To add the ServiceAccount, ClusterRole, and ClusterRoleBinding, create a file named sync-job-cluster-rbac.yaml and add it to the same directory as the application. The content is:

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      annotations: {}
      name: sync-job-sa
      namespace: openshift-gitops
    ---
    
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: sync-job-sa-role
    rules:
      - apiGroups:
          - apiextensions.k8s.io
          - machineconfiguration.openshift.io
        resources:
          - machineconfigpools
        verbs:
          - get
          - list
          - patch
    ---
    
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: sync-job-sa-rolebinding
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: sync-job-sa-role
    subjects:
      - kind: ServiceAccount
        name: sync-job-sa
        namespace: openshift-gitops
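
    Once the ServiceAccount, ClusterRole, and ClusterRoleBinding exist in the cluster, it can help to confirm that the binding grants what the hook jobs need. The following is a minimal sketch rather than part of the original setup: the function name check_hook_rbac is made up for illustration, and it assumes you are logged in with oc as a user who is allowed to impersonate service accounts (for example, a cluster administrator).

```shell
# Sketch only: ask the API server whether the hook's service account may
# use each verb on machineconfigpools. Prints one "verb: answer" line per verb.
check_hook_rbac() {
  local sa="system:serviceaccount:openshift-gitops:sync-job-sa"
  local resource="machineconfigpools.machineconfiguration.openshift.io"
  local verb answer
  for verb in get list patch; do
    # `oc auth can-i` exits non-zero when the answer is "no", so keep going.
    answer="$(oc auth can-i "$verb" "$resource" --as="$sa" 2>/dev/null)" || true
    echo "$verb: ${answer:-unknown}"
  done
}
```

    If any verb reports no, the ClusterRoleBinding is probably missing or points to a different service account or namespace.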

    You can now apply the configuration to the cluster using the following command:

    $ kubectl apply -k config/argocd

    After you have applied the configuration, try manually syncing the application. You should see that the PreSync and PostSync jobs have paused and unpaused the MCP as shown in Figure 2.

    Figure 2. The OpenShift user interface shows the actions of the PreSync and PostSync hooks.

    You can also see that the MCP is paused by examining its details (Figure 3).

    Figure 3. Machine Config Pool details show that it is paused.

    Once the sync job finishes, the PostSync job unpauses the MCP and resumes all the updates to the nodes in the cluster. The MCP details show this change as well (Figure 4).

    Figure 4. Machine Config Pool details show that it is unpaused.
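
    If you prefer the command line to the console, you can also wait for the resumed pool to finish rolling out its changes. This is a rough sketch under a few assumptions: the function name wait_for_mcp_updated and the 30-minute default timeout are chosen for illustration, and the pool is assumed to report an Updated condition in its status.

```shell
# Sketch only: block until the pool reports the Updated condition,
# or until the timeout expires. The pool defaults to "worker".
wait_for_mcp_updated() {
  local pool="${1:-worker}" timeout="${2:-30m}"
  oc wait "machineconfigpool/$pool" --for=condition=Updated --timeout="$timeout"
}
```

    Right after the pool is resumed, the condition may briefly still reflect the previous state, so treat an immediate success as a hint to check again rather than as proof that the rollout has finished.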

    If the sync fails for any reason, the MCP will stay paused and won't update the nodes. To resume MCP updates, you have to manually update the MCP and set the flag .spec.paused to false. You can set the flag using the following command:

    $ oc patch --type=merge --patch='{"spec":{"paused":false}}' machineconfigpool/worker
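
    The manual recovery step above can be wrapped in a small helper that first checks whether the pool is still paused. This is only a sketch: the function name resume_mcp_if_paused is hypothetical, and it assumes oc is logged in with permission to get and patch machine config pools.

```shell
# Sketch only: resume a machine config pool, but only if it is still paused.
# The pool defaults to "worker", matching the hook jobs.
resume_mcp_if_paused() {
  local pool="${1:-worker}" paused
  # Read the current value of .spec.paused (empty if the field is unset).
  paused="$(oc get machineconfigpool "$pool" -o jsonpath='{.spec.paused}')" || return
  if [ "$paused" = "true" ]; then
    oc patch --type=merge --patch='{"spec":{"paused":false}}' "machineconfigpool/$pool"
  else
    echo "machineconfigpool/$pool is not paused"
  fi
}
```

    Calling resume_mcp_if_paused with no arguments checks the worker pool. Pass another pool name as the first argument if your hooks pause a different pool.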

    Conclusion

    Updates to machine configs can lead to uncontrolled node reboots, termination of the sync process, and unanticipated issues in the application. The workaround in this article helps to prevent nodes from rebooting while the critical Argo CD sync operations are in progress.

    Last updated: September 20, 2023
