October 13, 2024

GitOps Infrastructure Workflow with FluxCD and Terraform


GitOps is gaining popularity as a best practice for Kubernetes management, and FluxCD is a lightweight, powerful tool to implement it. This blog post introduces the GitOps philosophy and offers a hands-on guide for setting up a fully automated, scalable Kubernetes deployment pipeline on Azure.

This article combines theory with practical implementation, providing readers with valuable skills to manage AKS clusters in a real-world GitOps workflow using FluxCD.

Introduction to GitOps

In the last few years, cloud-native development and infrastructure management have rapidly evolved and matured. Yet IT professionals and developers continue to look for ways to streamline processes, ensure reliability, and improve collaboration. This is where GitOps comes in: a methodology that combines DevOps best practices with Git’s version control capabilities to manage infrastructure and applications, offering a practical, scalable way to manage complex cloud environments.

What is GitOps?

At its core, GitOps is a methodology for managing infrastructure and applications declaratively, using Git as the single source of truth. Essentially, you define the desired state of your system—whether it’s infrastructure resources or application configurations—in code (usually YAML files), store it in a Git repository, and then use automation tools to continuously ensure that your environment matches this desired state.


When a change is needed (e.g., scaling up an app or deploying a new feature), the process is simple:

  1. You update the code in your Git repository.
  2. Your GitOps tool automatically detects the change and ensures that your Kubernetes cluster reflects the updated state.

This process is entirely declarative, meaning you describe what you want, and the system takes care of how to achieve it. You don’t manually execute commands to change infrastructure or deploy applications. The automation tools handle it for you, based on the code stored in Git.
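To make the declarative idea concrete, here is a toy sketch of a reconcile loop in plain shell. It is not how Flux is implemented. The file names desired.state and actual.state and the reconcile function are made up for illustration, with a temporary directory standing in for both Git and the cluster.

```shell
#!/usr/bin/env bash
# Toy reconcile loop: "desired.state" plays the role of Git,
# "actual.state" plays the role of the cluster. Both are hypothetical.
workdir="$(mktemp -d)"
desired="$workdir/desired.state"
actual="$workdir/actual.state"

echo "replicas=2" > "$desired"
echo "replicas=2" > "$actual"

# Compare desired and actual state; converge only when they differ.
reconcile() {
  if cmp -s "$desired" "$actual"; then
    echo "in sync"
  else
    cp "$desired" "$actual"
    echo "drift corrected"
  fi
}

reconcile                        # nothing to do yet
echo "replicas=5" > "$desired"   # a "commit" changes the desired state
reconcile                        # the loop converges the actual state
reconcile                        # running it again changes nothing
```

The point of the sketch is that you only ever edit the desired state. The loop decides whether anything needs to happen, and running it repeatedly is harmless.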

The Benefits of GitOps

GitOps brings a wealth of benefits to both infrastructure management and application deployment:

  • Version Control and Transparency: Since all configurations are stored in Git, every change is tracked. You have complete visibility into who made what changes, when they made them, and why. This level of version control ensures that teams can collaborate more effectively and reduces the risk of miscommunication.
  • Auditability: Because all changes are committed to Git, you have a full audit trail of every modification to your infrastructure and applications. This is essential for organizations with strict compliance or security requirements. Auditing your infrastructure becomes as easy as reviewing your Git commit history.
  • Easy Rollbacks: Since Git inherently tracks changes and versions, rolling back to a previous state is as simple as reverting a Git commit. This means that if something goes wrong in production, you can easily restore a previous version of your environment or application, reducing downtime and mitigating risk.
  • Automation and Consistency: One of the major advantages of GitOps is the level of automation it brings. With GitOps, you no longer need to manually intervene in deployments or infrastructure changes. Automation ensures that your environments are always consistent with the desired state defined in Git. Whether you’re managing one cluster or dozens, GitOps tools like FluxCD or ArgoCD make sure everything is synchronized, reducing human error and ensuring consistency across all environments.
  • Improved Collaboration: With Git as the core of your operations, every team member, whether an IT admin or a developer, can contribute to infrastructure or application changes using the same tools and workflows they use for code. This fosters better collaboration and allows teams to move faster without waiting on approvals or manual interventions.
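As a small illustration of the rollback benefit listed above, the sketch below reverts a change in a throwaway repository. The file app.yaml, the image tags, and the commit messages are placeholders for this example only.

```shell
#!/usr/bin/env bash
# Throwaway repository to demonstrate a rollback; all names are illustrative.
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "image: demo-app:1.0.0" > app.yaml
git add app.yaml
git commit -q -m "Deploy demo-app 1.0.0"

echo "image: demo-app:1.1.0" > app.yaml
git commit -q -am "Upgrade demo-app to 1.1.0"

# Roll back by adding a new commit that undoes the last one.
# History is preserved, so the audit trail stays intact.
git revert --no-edit HEAD > /dev/null

cat app.yaml
```

In a GitOps setup, pushing that revert commit is the rollback: the GitOps tool sees the previous desired state again and applies it.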

FluxCD and ArgoCD: Leading GitOps Tools

Two of the leading GitOps tools in the Kubernetes ecosystem are FluxCD and ArgoCD. Both are designed to bridge the gap between Git repositories and Kubernetes clusters, making them powerful allies in automating cloud-native operations.

FluxCD


FluxCD is a continuous delivery solution for Kubernetes that integrates GitOps principles into your cluster. With Flux, you can automate the synchronization between your Git repository and your Kubernetes environment. Here’s how it works:

  • Flux watches your Git repository for changes in configuration or application manifests.
  • When a change is detected, Flux pulls the updated configuration and applies it to the Kubernetes cluster, ensuring the desired state is always in sync with your Git repository.
  • Flux also supports automated image updates, allowing it to monitor container registries for new versions of images and deploy them to the cluster when they become available.

FluxCD is lightweight, easy to set up, and integrates smoothly with popular CI/CD pipelines like GitHub Actions and Jenkins. It’s an excellent choice for teams looking for a flexible, yet powerful way to adopt GitOps in their Kubernetes workflows.

ArgoCD


ArgoCD is another popular GitOps tool tailored for Kubernetes. Like FluxCD, ArgoCD pulls application and infrastructure configurations from a Git repository and ensures your cluster matches the desired state. However, ArgoCD also offers a more visual, user-friendly interface, allowing you to monitor and manage deployments through a dashboard. This makes it an appealing option for teams that prefer more visibility into their cluster’s state.

ArgoCD is highly extensible and offers robust features like multi-cluster support, syncing strategies, and rollback capabilities. It also integrates with tools like Helm, Kustomize, and Jsonnet, making it a versatile choice for Kubernetes environments.

Why Use FluxCD with Azure Kubernetes Service (AKS)?

When it comes to building cloud-native applications, Kubernetes has emerged as the go-to platform for orchestrating containers at scale. However, managing Kubernetes clusters can quickly become complex as applications grow, requiring a structured, automated approach to deployments. This is where GitOps shines, and combining it with Azure Kubernetes Service (AKS) unlocks even more potential for scalability, security, and ease of use.

Why AKS is Perfect for GitOps Workflows

Azure Kubernetes Service (AKS) is Microsoft Azure’s managed Kubernetes service, and it’s a perfect fit for GitOps workflows for several key reasons:

  1. Scalability: AKS provides built-in scalability for both infrastructure and applications. With just a few commands or an automated configuration, you can scale your clusters up or down based on demand. In a GitOps model, this scalability is even easier to manage—just update your desired state in Git, and your environment will automatically adjust. For example, you can modify a Kubernetes Deployment to increase the number of replicas for a service in your Git repository. FluxCD will detect this change and update your AKS cluster accordingly, scaling out the application without manual intervention. This dynamic scaling is crucial for businesses that need to adapt quickly to changing workloads.
  2. Security and Integration with Microsoft Entra ID: One of the standout features of AKS is its tight integration with Microsoft Entra ID for role-based access control (RBAC). In a GitOps setup, security is paramount, and AKS provides enterprise-grade security features right out of the box. With Entra ID, you can control access to AKS clusters based on user roles, ensuring that only authorized individuals can make changes to the infrastructure or deployments. It also allows you to manage permissions for different GitOps workflows, ensuring that only certain teams can modify production configurations while allowing others to manage development environments. This integration simplifies security management and complements the auditability and version control provided by GitOps, giving you a robust, secure platform for running mission-critical applications.
  3. Seamless Integration with CI/CD Tools: AKS integrates seamlessly with Continuous Integration/Continuous Deployment (CI/CD) tools like Azure DevOps, GitHub Actions, and Jenkins, making it easier to automate the full lifecycle of application deployments. In a GitOps workflow using FluxCD, these tools work in tandem to ensure that your infrastructure is automatically updated whenever new code is pushed to Git. For example, you might use GitHub Actions to build and push a Docker image to Azure Container Registry (ACR). From there, FluxCD can automatically detect new image versions and update the corresponding deployment in AKS. This seamless integration between CI/CD pipelines and GitOps tools minimizes manual effort and accelerates the deployment process.
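The scaling example from the first point can be sketched as follows. The Deployment name demo-app, the image reference, and the set_replicas helper are assumptions made for this illustration. In a real workflow the edited manifest would be committed and pushed, after which Flux applies it on its next reconciliation.

```shell
#!/usr/bin/env bash
# Write a minimal Deployment manifest into a scratch directory.
# The app name and image are placeholders, not taken from a real project.
workdir="$(mktemp -d)"
manifest="$workdir/deployment.yaml"

cat > "$manifest" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
        - name: demo-app
          image: <your-registry>.azurecr.io/demo-app:1.0.0
EOF

# Change the desired replica count in the manifest (portable sed, no -i).
set_replicas() {
  local file="$1" count="$2"
  sed "s/^\(  replicas:\) .*/\1 $count/" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

set_replicas "$manifest" 5
grep 'replicas:' "$manifest"
```

Nothing here talks to the cluster directly. The only action is a change to a file that lives in Git.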

How GitOps Works with FluxCD

FluxCD is designed specifically for Kubernetes and follows GitOps principles, ensuring that your cluster’s state is always synchronized with your Git repository. Here’s a high-level overview of how FluxCD operates:

  • Syncing with Git: After installing FluxCD into your AKS cluster, you configure it to watch a specific branch of your Git repository. This is where your Kubernetes manifests (or Helm charts, or Kustomize overlays) are stored.
  • Continuous Synchronization: FluxCD continuously monitors your Git repository for changes. When a change is pushed to the repository—whether it’s a new application version or a modification to the infrastructure configuration—FluxCD automatically pulls the update.
  • Applying Changes to AKS: Once Flux detects a change, it automatically applies the updated manifests to your AKS cluster. This ensures that the cluster’s state always matches what’s declared in Git. If there’s a failure in applying changes, Flux provides status updates and logs to help you troubleshoot.
  • Image Automation: One of FluxCD’s standout features is its ability to automate image updates. Flux can monitor container image registries (like ACR) for new versions of your application and update Kubernetes deployments accordingly. This enables zero-touch updates, reducing the need for manual intervention in application deployments.
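For the image automation point above, a minimal sketch of the two objects Flux uses to scan a registry and select a tag could look like this. Treat it as an illustration under assumptions: the name demo-app, the semver range, and the v1beta2 API version (which matches Flux releases from around the time of writing) may differ in your setup. The image controllers are also not part of a default flux install and need to be added with --components-extra=image-reflector-controller,image-automation-controller. Registry authentication for ACR and the ImageUpdateAutomation object that writes the new tag back to Git are left out.

```shell
#!/usr/bin/env bash
# Sketch of an ImageRepository (what to scan) and an ImagePolicy
# (which tag to select). "demo-app" and the range are placeholders.
workdir="$(mktemp -d)"

cat > "$workdir/image-automation.yml" <<'EOF'
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageRepository
metadata:
  name: demo-app
  namespace: flux-system
spec:
  image: <your-registry>.azurecr.io/demo-app
  interval: 5m
---
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
  name: demo-app
  namespace: flux-system
spec:
  imageRepositoryRef:
    name: demo-app
  policy:
    semver:
      range: ">=1.0.0"
EOF

echo "wrote $workdir/image-automation.yml"
```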


Why FluxCD Over ArgoCD?

Both FluxCD and ArgoCD are excellent GitOps tools for Kubernetes, but there are subtle differences that might make one a better fit depending on your use case.

FluxCD is known for its simplicity and lightweight nature. It integrates natively with Kubernetes, making it a great choice for teams looking for a quick and minimalistic setup. Its image automation capabilities also make it a natural fit for use cases that require frequent and automated updates to container images.

ArgoCD, on the other hand, provides a more feature-rich dashboard and has strong support for multi-cluster management and advanced syncing strategies. It also offers robust visualization of application states, which can be beneficial for teams that want more visibility into their deployments.

For this article, FluxCD is chosen because of its straightforward integration with AKS, its minimal overhead, and its image automation features, which are particularly useful in dynamic cloud environments where containerized applications are frequently updated.

Setting up a GitOps Workflow with FluxCD on AKS

Now that we’ve covered the basics of GitOps, why AKS is an ideal platform, and why FluxCD is our tool of choice, it’s time to get hands-on!

In this section, we’ll go through a detailed, step-by-step guide for setting up GitOps using FluxCD v2 on Azure Kubernetes Service (AKS), where the infrastructure deployment will be managed using Terraform and FluxCD’s Terraform Controller. This will help you to enable a full GitOps experience.

⚠️ Note: This guide is for demonstration purposes and the covered configuration choices do not adhere to best practices for production scenarios.

Prerequisites

Before diving into the next steps, ensure you have the following prerequisites in place:

  • Azure Subscription with Owner Permissions: You’ll need an active Azure subscription with Owner permissions to create and manage resources in Azure.
  • Azure CLI: Install Azure CLI for interacting with Azure services from the command line. You can install Azure CLI by following the instructions at Microsoft’s official documentation.
  • kubectl and kubelogin: Ensure kubectl is installed on your system, along with kubelogin to log in to Azure Kubernetes Service (AKS) using Entra ID credentials. You can install kubectl and kubelogin by following the instructions in this documentation.
  • Helm: Make sure you have Helm installed on your system. You can install Helm by following the instructions at Helm’s official website.
  • GitHub account and GitHub CLI: Make sure you have an account at GitHub and have the GitHub CLI installed to interact with GitHub.
  • Flux CLI: Install Flux CLI for installing and managing Flux components in Kubernetes.
  • Visual Studio Code (optional): Have Visual Studio Code installed on your system for editing files and managing your project. You can download Visual Studio Code from here.

Create a Storage Account for Terraform State

Step 1: Start by logging into your Azure account using Azure CLI, by running the az login command in your terminal of choice. This command will open a browser window for authentication or prompt you in the terminal.

Step 2: Terraform needs a backend to store its state file, which we will set up in Azure Blob Storage. Run the commands below to set up the necessary resource group and storage container for Terraform’s state file:

az group create \
  --name <your-resource-group-name> \
  --location <your-location>

az storage account create \
  --name <your-storage-account-name> \
  --resource-group <your-resource-group-name> \
  --location <your-location> \
  --sku Standard_LRS

az storage container create \
  --name tfstate \
  --account-name <your-storage-account-name>
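Storage account names are a common stumbling block: they must be globally unique, 3 to 24 characters long, and consist of lowercase letters and digits only. The helper below is a small sketch for checking the format before calling az. The function name is made up for this example, and it cannot check global uniqueness.

```shell
#!/usr/bin/env bash
# Azure storage account names must be 3-24 characters long and contain
# only lowercase letters and digits. Check a candidate before calling az.
is_valid_storage_account_name() {
  printf '%s' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]{3,24}$'
}

for candidate in "tfstate12345" "TfState" "tf-state" "ab"; do
  if is_valid_storage_account_name "$candidate"; then
    echo "$candidate: ok"
  else
    echo "$candidate: invalid"
  fi
done
```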

Create an Azure Kubernetes Service Cluster

Step 3: Next, create another resource group and the AKS cluster. We’re creating an AKS cluster with 2 nodes and a managed identity, by running the commands below:

az group create \
  --name <your-resource-group-name> \
  --location <your-location>

az aks create \
  --name <your-aks-cluster-name> \
  --resource-group <your-resource-group-name> \
  --location <your-location> \
  --dns-name-prefix <your-dns-prefix> \
  --node-count 2 \
  --node-vm-size "Standard_D2s_v3" \
  --enable-managed-identity \
  --tags Environment=Demonstration \
  --generate-ssh-keys

Create a Service Principal for Terraform

Step 4: We’ll use a Service Principal for Terraform to interact with Azure. Create it by running the command below:

az ad sp create-for-rbac \
  --name <your-service-principal-name> \
  --role="Contributor" \
  --scopes="/subscriptions/<your-subscription-id>"

Write down the appId (Client ID), password (Client Secret), and tenant values for the next steps.

Set up GitHub Repository and Secrets

Step 5: Now, set up your GitHub repository to store your Terraform configuration and Flux resources, by running the GitHub CLI commands below:

gh auth login
gh repo create <your-repository-name> --private 
gh repo clone <your-repository-name>
cd <your-repository-name>

Write your Terraform files

Step 6: Create a new folder in your repository, called terraform, and write out your Terraform configuration files. For demonstration purposes, I use the provider.tf and lvm.tf files below:

# terraform\provider.tf

terraform {
  backend "azurerm" {
    storage_account_name = "<your-storage-account-name-from-step-2>"
    container_name       = "tfstate"
    key                  = "terraform.tfstate"
    resource_group_name  = "<your-resource-group-name-from-step-2>"
  }
}

# Demo only: these values end up in Git in plaintext.
provider "azurerm" {
  features {}
  client_id       = "<your-appid-from-step-4>"
  client_secret   = "<your-password-from-step-4>"
  tenant_id       = "<your-tenant-id-from-step-4>"
  subscription_id = "<your-subscription-id>"
}

# terraform\lvm.tf

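# Resource group for the virtual machine; give it a different name than the
# virtual network resource group further down.
resource "azurerm_resource_group" "lvm" {
  name     = "<your-vm-resource-group-name>"
  location = "<your-location>"
}

resource "azurerm_network_interface" "lvm" {
  name                = "<your-network-interface-name>"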
  location            = azurerm_resource_group.lvm.location
  resource_group_name = azurerm_resource_group.lvm.name

  ip_configuration {
    name                          = "internal"
    subnet_id                     = azurerm_subnet.internal.id
    private_ip_address_allocation = "Dynamic"
  }
}

resource "azurerm_linux_virtual_machine" "lvm" {
  name                = "<your-vm-name>"
  resource_group_name = azurerm_resource_group.lvm.name
  location            = azurerm_resource_group.lvm.location
  size                = "Standard_F2"
  admin_username      = "<your-admin-username>"
  admin_password      = "<your-admin-password>"
  disable_password_authentication = false
  network_interface_ids = [
    azurerm_network_interface.lvm.id,
  ]

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }

  source_image_reference {
    publisher = "Canonical"
    offer     = "0001-com-ubuntu-server-jammy"
    sku       = "22_04-lts"
    version   = "latest"
  }
}

resource "azurerm_resource_group" "vnet" {
  name     = "<your-vnet-resource-group-name>"
  location = "<your-location>"
}

resource "azurerm_virtual_network" "vnet" {
  name                = "<your-virtual-network-name>"
  address_space       = ["10.0.0.0/16"]
  location            = azurerm_resource_group.vnet.location
  resource_group_name = azurerm_resource_group.vnet.name
}

resource "azurerm_subnet" "internal" {
  name                 = "internal"
  resource_group_name  = azurerm_resource_group.vnet.name
  virtual_network_name = azurerm_virtual_network.vnet.name
  address_prefixes     = ["10.0.2.0/24"]
}
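Because the Terraform controller applies whatever lands in the repository, it can be worth checking that no placeholder from this guide is left in the files before committing them. The helper below is one possible sketch. The function name check_placeholders and the scratch files are made up for this example.

```shell
#!/usr/bin/env bash
# Fail if any file under the given directory still contains a
# "<your-...>" placeholder from this guide.
check_placeholders() {
  local dir="$1"
  if grep -rn '<your-[a-z0-9-]*>' "$dir"; then
    echo "placeholders found in $dir" >&2
    return 1
  fi
  echo "no placeholders left in $dir"
}

# Demo on a scratch directory with one finished and one unfinished file.
workdir="$(mktemp -d)"
mkdir -p "$workdir/done" "$workdir/todo"
echo 'location = "westeurope"' > "$workdir/done/main.tf"
echo 'location = "<your-location>"' > "$workdir/todo/main.tf"

check_placeholders "$workdir/done"
check_placeholders "$workdir/todo" || echo "fix the files above before pushing"
```

In your own repository you would call check_placeholders terraform from the repository root.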

Install FluxCD and TF Controller

Step 7: Using the Azure CLI, retrieve the AKS credentials so you can manage the cluster with kubectl, by running the command below:

az aks get-credentials \
  --resource-group <your-resource-group-name-from-step-3> \
  --name <your-aks-cluster-name-from-step-3>

Step 8: Validate that your cluster nodes are running properly, by running the kubectl get nodes command in your terminal.

Step 9: Install FluxCD and the Terraform controller, by running the commands below in your terminal:

flux install
helm repo add tf-controller https://flux-iac.github.io/tofu-controller
helm upgrade -i tf-controller tf-controller/tf-controller --namespace flux-system

Verify that the Flux system and Terraform controller pods are running, by using the kubectl get pods -n flux-system command. Every pod in the namespace should report a Running status.

Step 10: Flux needs GitHub credentials to pull your repository. First, create a Personal Access Token on GitHub, following [GitHub’s official documentation](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens). I gave mine full repo access. Once you’re ready, create a Kubernetes secret with your GitHub credentials, using the command below:

kubectl create secret generic git-credentials \
  --namespace=flux-system \
  --from-literal=username=<your-github-username> \
  --from-literal=password=<your-personal-access-token>

Create and Apply FluxCD Configuration Files

Now that FluxCD is installed on the AKS cluster, it’s time to set up the necessary configuration files to enable Flux to manage your infrastructure and applications.

Step 11: First, create a new folder in your repository, called flux, and create a repo.yml file. In this file we’ll define the Git repository that Flux will monitor for changes. It should look something like the example below:

# flux\repo.yml

apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: repo
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/<your-username>/<your-repository-name>.git
  ref:
    branch: main
  secretRef:
    name: git-credentials

Step 12: Next, create a terraform.yml file in this same folder. This file sets up the Terraform resource that Flux will manage, enabling the integration of Terraform with Flux.

# flux\terraform.yml

apiVersion: infra.contrib.fluxcd.io/v1alpha2
kind: Terraform
metadata:
  name: terraform
  namespace: flux-system
spec:
  interval: 1m
  approvePlan: auto
  path: ./terraform
  sourceRef:
    kind: GitRepository
    name: repo
    namespace: flux-system

Step 13: Lastly, create a kustomization.yml file in the same folder. This file defines how Flux should apply Kubernetes manifests and Kustomize overlays.

# flux\kustomization.yml

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: kustomization
  namespace: flux-system
spec:
  interval: 1m
  path: "./"
  prune: true
  sourceRef:
    kind: GitRepository
    name: repo
    namespace: flux-system

Step 14: Now, apply the configuration files to your cluster, by running the commands below in your terminal:

kubectl apply -f flux/repo.yml
kubectl apply -f flux/terraform.yml
kubectl apply -f flux/kustomization.yml

Step 15: Check that Flux is successfully syncing your repository, by running the commands below:

kubectl get gitrepositories -n flux-system
kubectl get kustomizations -n flux-system
kubectl get terraform -n flux-system

This should give you the message that the Git repository is empty, and no source artifacts were found.

Commit and Push Changes to GitHub

To resolve this, we need to commit and push our Terraform configuration files to our GitHub repository.

Step 16: Commit your changes to the repository and push to GitHub, by running the commands below:

git add .
git commit -m "Initial setup for Flux and Terraform"
git push origin main

Step 17: After the push, you can check the status of your FluxCD and Terraform resources again. Do so by running the commands below:

git log -n 1

This command lists the last commit, including the commit ID.

kubectl get gitrepositories -n flux-system

Flux will detect the updates. The GitRepository should now report as ready, with the fetched revision matching the commit ID above.

kubectl get kustomizations -n flux-system

This command will show you that it applied the revision of the same commit ID shown earlier.

kubectl get terraform -n flux-system

This command will show you that the reconciliation is in progress, according to the changes pushed above.

kubectl describe terraform terraform -n flux-system

This command will show you the actual progress of the Terraform Controller itself.

Step 18: Log in to the Azure portal and validate that the resources in your Terraform configuration are applied and provisioned. In my case, the Linux Virtual Machine and Virtual Network are successfully provisioned.

⚠️ Note: This guide is for demonstration purposes and the covered configuration choices do not adhere to best practices for production scenarios. Remember that the Azure service principal details are stored in plaintext in your Terraform provider file (provider.tf), which is now committed to your GitHub repository.

💡 Tip: After following this guide, it might be a good idea to delete all the provisioned resources to save costs. This includes the Terraform-managed resources, as well as the AKS cluster and the Storage Account used for the Terraform state. Lastly, don’t forget to clean up the Entra ID app registration and the GitHub repository if you want a complete clean-up!

Best Practices for Managing GitOps Workflows

Implementing GitOps can significantly enhance the operational efficiency of cloud environments, but to fully benefit from this approach, it’s essential to adhere to best practices. Below are some critical guidelines to follow for managing GitOps workflows effectively, particularly when using tools like FluxCD.

Security: Protecting your Infrastructure

Security is paramount in a GitOps workflow, especially since the entire infrastructure and application configurations are stored in a Git repository. Consider the following practices to safeguard your environment:

  • Ensure that only authorized users and systems have access to your Git repository. Use SSH keys or OAuth tokens for authentication. Consider enforcing MFA (Multi-Factor Authentication) for Git platform access.
  • Limit access to the AKS (Azure Kubernetes Service) cluster using RBAC (Role-Based Access Control). Only grant the necessary permissions for specific roles, and always use Azure Managed Identities or service principals instead of embedding credentials in scripts or manifests.
  • Use a secrets management solution such as Kubernetes secrets encrypted by KMS (Key Management Service) or Azure Key Vault. Additionally, tools like Sealed Secrets or SOPS can encrypt secrets at rest in your Git repository, ensuring they are not exposed in plaintext.
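One way to act on these points in the demo setup is to move the service principal values out of provider.tf and hand them to the Terraform runner as environment variables. The azurerm provider and backend read ARM_CLIENT_ID, ARM_CLIENT_SECRET, ARM_TENANT_ID, and ARM_SUBSCRIPTION_ID from the environment. The sketch below assumes the Terraform controller's runnerPodTemplate field accepts envFrom as described in its documentation, so verify this against the version you run. The secret name azure-credentials and the helper function are placeholders chosen for this example.

```shell
#!/usr/bin/env bash
# Sketch: pass service principal values to the Terraform runner as ARM_*
# environment variables from a Kubernetes Secret instead of provider.tf.
# The secret name "azure-credentials" is a placeholder for this example.

# Defined but not called here; run it once against your own cluster.
# Arguments: client id, client secret, tenant id, subscription id.
create_azure_credentials_secret() {
  kubectl create secret generic azure-credentials \
    --namespace=flux-system \
    --from-literal=ARM_CLIENT_ID="$1" \
    --from-literal=ARM_CLIENT_SECRET="$2" \
    --from-literal=ARM_TENANT_ID="$3" \
    --from-literal=ARM_SUBSCRIPTION_ID="$4"
}

workdir="$(mktemp -d)"

# Same Terraform object as in this guide, plus a runnerPodTemplate that
# loads the secret's keys into the runner pod's environment.
cat > "$workdir/terraform.yml" <<'EOF'
apiVersion: infra.contrib.fluxcd.io/v1alpha2
kind: Terraform
metadata:
  name: terraform
  namespace: flux-system
spec:
  interval: 1m
  approvePlan: auto
  path: ./terraform
  sourceRef:
    kind: GitRepository
    name: repo
    namespace: flux-system
  runnerPodTemplate:
    spec:
      envFrom:
        - secretRef:
            name: azure-credentials
EOF

echo "wrote $workdir/terraform.yml"
```

With this in place, the client_id, client_secret, tenant_id, and subscription_id lines could be dropped from the provider block, so no credentials need to be committed to Git.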

Branching Strategy: Managing Multiple Environments

A well-defined branching strategy is essential for handling different environments, such as development, staging, and production. Implementing an effective strategy enables smooth deployment workflows and isolates different stages of the development lifecycle.

  • Create dedicated branches for each environment (main for production, dev for development, staging for staging). This structure allows you to test changes in isolation before merging into production.
  • For each change or feature, open a pull request (PR) that can be reviewed by the team before being merged into a target branch. This practice enhances code quality and ensures all changes are intentional.
  • Set up branch protection rules in GitHub to ensure that only approved pull requests can be merged into production. Automated tests, such as those run in CI/CD pipelines, should be part of the approval process to catch any issues before they affect the live environment.
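A branch-per-environment layout maps onto Flux as one GitRepository object per branch. The helper below is a sketch that prints such a manifest. Object names like repo-dev are illustrative, while the URL placeholder and the git-credentials secret follow the repo.yml from Step 11.

```shell
#!/usr/bin/env bash
# Print a GitRepository manifest that tracks one environment branch.
# Arguments: environment name, branch name. Names are illustrative.
render_git_repository() {
  local env="$1" branch="$2"
  cat <<EOF
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: repo-${env}
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/<your-username>/<your-repository-name>.git
  ref:
    branch: ${branch}
  secretRef:
    name: git-credentials
EOF
}

render_git_repository dev dev
echo "---"
render_git_repository production main
```

Each environment's cluster would then apply only the GitRepository that points at its own branch.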

Scalability: Managing Large-Scale Environments

As your infrastructure grows, your GitOps workflow needs to scale with it. Managing multiple clusters and repositories can be complex, but adopting scalable best practices can help streamline operations.

  • For larger environments, break down configurations into separate Git repositories per cluster or environment. This modular approach helps in isolating changes and reduces the risk of affecting multiple clusters at once.
  • FluxCD natively supports multi-cluster setups, allowing you to manage multiple Kubernetes clusters from a single GitOps controller. Define separate GitRepository and Kustomization resources for each cluster to ensure isolated control.
  • Group Kubernetes manifests into logical directories based on services, components, or environments. Use Kustomize overlays to maintain consistency across environments while allowing for environment-specific configurations.
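The directory-plus-overlay idea from the last point can be sketched like this. The apps directory, the demo-app Deployment, and the replica counts are placeholders. The base is assumed to contain a deployment.yaml for demo-app, which the sketch does not create.

```shell
#!/usr/bin/env bash
# Build a minimal base/overlay layout in a scratch directory.
# Directory and app names are placeholders for illustration.
root="$(mktemp -d)"
mkdir -p "$root/apps/base" "$root/apps/overlays/dev" "$root/apps/overlays/production"

# The base lists the shared manifests (deployment.yaml is assumed to exist).
cat > "$root/apps/base/kustomization.yaml" <<'EOF'
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
EOF

# Each overlay reuses the base and patches only what differs.
# Arguments: environment name, replica count.
write_overlay() {
  local env="$1" replicas="$2"
  cat > "$root/apps/overlays/$env/kustomization.yaml" <<EOF
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
patches:
  - target:
      kind: Deployment
      name: demo-app
    patch: |-
      - op: replace
        path: /spec/replicas
        value: ${replicas}
EOF
}

write_overlay dev 1
write_overlay production 3
find "$root/apps" -name kustomization.yaml | sort
```

A Flux Kustomization for a given environment would then point its path at the matching overlay directory.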

Monitoring and Auditing: Ensure Transparency and Troubleshooting

Monitoring and auditing are critical for ensuring the health of your GitOps setup and troubleshooting issues as they arise. GitOps provides built-in transparency because all changes are tracked in Git, but you still need to implement monitoring strategies for a real-time view of the environment.

  • Regularly review commit history and PR activity in Git to track changes in your infrastructure. Additionally, FluxCD provides detailed logs on synchronization activities, errors, and changes applied to the cluster. Tools like Loki or Elasticsearch can be integrated with FluxCD to centralize and analyze logs.
  • Configure alerts for critical events, such as failed synchronizations, Terraform apply errors, or drift detection. Use monitoring platforms like Prometheus and Grafana, which integrate with FluxCD to visualize cluster metrics and generate real-time alerts.
  • Many organizations have strict compliance requirements. By leveraging Git’s inherent version control, you get an audit trail of all infrastructure changes. This can be paired with Kubernetes auditing features to log every API request made in the cluster, helping track who made changes and when.
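For the alerting point above, Flux's notification controller can forward error events to a chat channel. The sketch below writes a Provider and an Alert for Slack. The object names, the channel placeholder, and the v1beta3 API version are assumptions to check against your Flux release, and the slack-webhook secret is expected to hold the webhook URL under an address key.

```shell
#!/usr/bin/env bash
# Sketch of a Slack Provider and an Alert that forwards error events
# from all GitRepository and Kustomization objects in the namespace.
workdir="$(mktemp -d)"

cat > "$workdir/alerts.yml" <<'EOF'
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: slack
  namespace: flux-system
spec:
  type: slack
  channel: <your-channel>
  secretRef:
    name: slack-webhook
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: sync-failures
  namespace: flux-system
spec:
  providerRef:
    name: slack
  eventSeverity: error
  eventSources:
    - kind: GitRepository
      name: '*'
    - kind: Kustomization
      name: '*'
EOF

echo "wrote $workdir/alerts.yml"
```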

Closing Words

We’ve successfully set up a complete GitOps workflow using FluxCD and the Terraform Controller. Your infrastructure is now provisioned, continuously monitored, and will automatically be updated when the code changes. This setup ensures a reliable, automated, and scalable cloud-native environment, making infrastructure management easier and more efficient.

By adopting a GitOps approach using tools like FluxCD or ArgoCD, you’re embracing a more automated, reliable, and transparent way to manage Kubernetes clusters and cloud-native applications. It enhances team collaboration, reduces manual errors, and ensures that your infrastructure is always aligned with the latest changes committed to Git—making it a valuable methodology for any IT professional or developer.

When combining FluxCD with the Terraform Controller, you create a powerful GitOps workflow that automates cloud infrastructure provisioning, simplifies infrastructure management, and improves security. FluxCD ensures that your infrastructure is always in sync with your Git repository, providing a reliable, secure, and efficient solution for cloud-native operations.

To learn more and continue your journey with FluxCD, the official FluxCD and Terraform Controller documentation are good places to start.

Thank you for taking the time to go through this post and making it to the end. Stay tuned, because we’ll keep publishing more content on topics like this in the future.

Author: Rolf Schutten

Posted on: October 13, 2024