Managing Multi-Cluster AKS with Azure Kubernetes Fleet Manager

Why Multi-Cluster Management Matters

As organizations mature in their Kubernetes adoption, the single-cluster approach quickly shows its limits. Teams need multiple clusters for geographic distribution to serve users closer to their location, for workload isolation to separate production from staging and development, and for scaling beyond the resource limits of a single cluster. Some industries require clusters in specific regions for data residency compliance, while others need separate clusters for different business units or teams.

Managing these clusters individually becomes an operational burden. Each cluster needs its own upgrade cycle, its own monitoring configuration, and its own set of policies. Without a unified control plane, configuration drift is inevitable, and a routine update can become a multi-day coordination effort across teams.

This is exactly the problem Azure Kubernetes Fleet Manager was built to solve.

What Is Azure Kubernetes Fleet Manager

Azure Kubernetes Fleet Manager is a multi-cluster management service that provides a single control plane for orchestrating operations across multiple AKS clusters. Rather than treating each cluster as an independent entity, Fleet Manager introduces a fleet-level abstraction that lets you manage groups of clusters as a cohesive unit.

The architecture centers on three key concepts. The hub cluster is the central control plane that Fleet Manager provisions and manages. It stores fleet-level resources and coordinates operations across all member clusters. Member clusters are your existing AKS clusters that you join to the fleet. They continue to run your workloads as before, but now receive coordinated updates and policies from the hub. Fleet-level resources are Kubernetes objects that you define once on the hub and have Fleet Manager propagate to the appropriate member clusters.

You can create a fleet with or without a hub cluster. A hubless fleet is sufficient if you only need update orchestration. A fleet with a hub cluster unlocks workload propagation and multi-cluster networking capabilities.

Update Orchestration: Controlled Rollouts Across Clusters

One of Fleet Manager's most valuable capabilities is coordinated update management. Instead of manually upgrading each cluster and hoping nothing breaks, you define update strategies that control how updates roll out across your fleet.

An update strategy consists of stages, each containing one or more groups of clusters. Fleet Manager processes stages sequentially, runs the groups within a stage in parallel, and updates the clusters inside a group one at a time. After each stage, you can configure a wait period that gives you time to validate the stage before the next one starts.

For example, you might define a three-stage strategy: first update your development clusters, wait for validation, then update staging clusters, wait again, and finally update production clusters. If an issue is detected at any stage, you can halt the rollout before it reaches production.
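The three-stage example above can be sketched as data. The snippet below is a minimal illustration, not an official client: the stage names, the one-group-per-stage layout, and the one-hour wait are hypothetical. The field names (stages, groups, afterStageWaitInSeconds) are assumed to match the stages file that the Azure CLI fleet extension accepts, so verify them against the current documentation before relying on them.

```python
import json


def build_strategy(stage_names, wait_seconds):
    """Build a staged strategy with one stage per environment.

    Each stage gets a single group named after it (an illustrative choice).
    Every stage except the last is followed by a validation wait.
    """
    stages = []
    for index, name in enumerate(stage_names):
        stage = {"name": name, "groups": [{"name": f"{name}-group"}]}
        is_last_stage = index == len(stage_names) - 1
        if not is_last_stage:
            # Assumed field name for the pause after a stage completes
            stage["afterStageWaitInSeconds"] = wait_seconds
        stages.append(stage)
    return {"stages": stages}


# Hypothetical environments and a one-hour validation window
strategy = build_strategy(
    ["development", "staging", "production"], wait_seconds=3600
)
print(json.dumps(strategy, indent=2))
```

Keeping the strategy as plain data makes it easy to review in a pull request and to reuse across update runs.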

Update runs are the actual execution of an update strategy. Each run targets a specific Kubernetes version or node image version and applies it according to the strategy you defined. You can monitor the progress of each run and see which clusters have been updated, which are in progress, and which are pending.
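To make the monitoring idea concrete, here is a small sketch that summarizes a snapshot of an update run. The cluster names and the state labels (Completed, Running, NotStarted) are made up for illustration and are not guaranteed to match what the service reports. In practice you would populate the dictionary from whatever status output you collect.

```python
from collections import Counter

# Hypothetical snapshot of one update run: cluster name -> state label
run_status = {
    "dev-eastus": "Completed",
    "dev-westeurope": "Completed",
    "staging-eastus": "Running",
    "prod-eastus": "NotStarted",
    "prod-westeurope": "NotStarted",
}


def summarize(status_by_cluster):
    """Count how many clusters are in each state."""
    return dict(Counter(status_by_cluster.values()))


def clusters_in_state(status_by_cluster, state):
    """Return the sorted names of clusters currently in the given state."""
    return sorted(
        name for name, current in status_by_cluster.items() if current == state
    )


print(summarize(run_status))
print("Still pending:", clusters_in_state(run_status, "NotStarted"))
```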

This staged approach limits the blast radius of a problematic update, because a failure surfaces in an early stage before production clusters are touched. It also brings the progressive delivery principles that teams already use for application deployments to infrastructure management.

Multi-Cluster Load Balancing with Service Export

For organizations that need traffic distribution across clusters in different regions, Fleet Manager provides multi-cluster L4 load balancing through service export. This feature allows you to expose a Kubernetes service from multiple member clusters behind a single Azure Load Balancer endpoint.

When a service is exported from multiple clusters, Fleet Manager automatically manages the traffic routing. If one cluster becomes unhealthy, traffic is redirected to healthy clusters without any manual intervention. This creates a resilient, multi-region service architecture without requiring external load balancing infrastructure or complex DNS failover configurations.
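The failover behavior described above can be pictured with a toy model. This is a conceptual sketch only: it does not reflect how Fleet Manager or Azure Load Balancer implement health probing or traffic distribution. The Endpoint class, the addresses, and the round-robin selection are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    """One member cluster that exports the service (hypothetical model)."""
    cluster: str
    address: str
    healthy: bool


def routable(endpoints):
    """Keep only the endpoints whose cluster is currently healthy."""
    return [endpoint for endpoint in endpoints if endpoint.healthy]


def pick(endpoints, request_number):
    """Choose a healthy endpoint for a request using simple round-robin."""
    candidates = routable(endpoints)
    if not candidates:
        raise RuntimeError("no healthy cluster is exporting the service")
    return candidates[request_number % len(candidates)]


endpoints = [
    Endpoint("aks-eastus", "10.0.1.10", True),
    Endpoint("aks-westeurope", "10.1.1.10", True),
    Endpoint("aks-southeastasia", "10.2.1.10", True),
]

# Simulate one cluster becoming unhealthy; it drops out of rotation
endpoints[1].healthy = False
targets = [pick(endpoints, n).cluster for n in range(4)]
print(targets)
```

The point of the model is that callers keep using one entry point while the set of backing clusters shrinks and grows behind it.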

The service export mechanism uses standard Kubernetes resource types extended with fleet-level semantics, keeping the operational model familiar to teams already working with Kubernetes.

Property-Based Scheduling for Workload Placement

A significant capability currently in preview is property-based scheduling, which allows you to define placement policies for workloads based on cluster properties. Instead of manually deciding which workloads run on which clusters, you express your intent through constraints and preferences, and Fleet Manager handles the placement.

For example, you might specify that a workload should only run on clusters in EU regions for data residency compliance, or that it should prefer clusters with GPU node pools for machine learning workloads. You can combine cost optimization with availability requirements by spreading workloads across the minimum number of clusters needed to meet your redundancy targets.
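The split between hard constraints and soft preferences can be sketched in a few lines. The code below is a simplified stand-in for the idea, not Fleet Manager's scheduler: the Cluster class, the region list, and the tie-breaking by name are assumptions chosen to keep the example small.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical view of a member cluster and two of its properties."""
    name: str
    region: str
    has_gpu: bool = False


def place(clusters, required, preferred, count):
    """Pick up to `count` clusters.

    `required` is a hard constraint: clusters failing it are never chosen.
    `preferred` is a soft preference: matching clusters are ranked first.
    Ties are broken by cluster name so the result is deterministic.
    """
    eligible = [cluster for cluster in clusters if required(cluster)]
    ranked = sorted(eligible, key=lambda c: (not preferred(c), c.name))
    return ranked[:count]


# Illustrative subset of EU regions for the data residency constraint
EU_REGIONS = {"westeurope", "northeurope", "francecentral"}

fleet = [
    Cluster("aks-weu", "westeurope"),
    Cluster("aks-neu", "northeurope", has_gpu=True),
    Cluster("aks-eus", "eastus", has_gpu=True),
    Cluster("aks-fra", "francecentral"),
]

chosen = place(
    fleet,
    required=lambda c: c.region in EU_REGIONS,  # must be in the EU
    preferred=lambda c: c.has_gpu,              # GPU clusters rank first
    count=2,                                    # redundancy target
)
print([cluster.name for cluster in chosen])
```

Note that the GPU cluster in eastus is never selected: a preference cannot override a constraint, which is the property that makes residency rules safe to express this way.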

This declarative approach to workload placement removes the manual coordination overhead and ensures that placement decisions are consistently applied as clusters join or leave the fleet.

When to Use Fleet Manager vs Managing Individual Clusters

Fleet Manager is not necessary for every AKS deployment. If you operate a single cluster or a small number of clusters that rarely change, the overhead of setting up a fleet may not be justified. The direct AKS management tools and Infrastructure as Code approaches work well for these scenarios.

Fleet Manager becomes valuable when you have five or more clusters that need coordinated updates, when you operate clusters across multiple regions and need traffic management between them, or when your organization has multiple teams managing their own clusters but requiring consistent policies and upgrade cadences.

The service is also particularly useful for organizations adopting a cluster-per-environment or cluster-per-team model, where the number of clusters grows proportionally with the organization.

Getting Started

Fleet Manager is generally available, and the fleet resource itself carries no additional charge. You pay only for the underlying AKS clusters and, if you create one, the hub cluster's resources.

To explore the full capabilities and set up your first fleet, see the official Azure Kubernetes Fleet Manager documentation. For details on the latest generally available workload orchestration features, check the Azure updates announcement.

For organizations already running multiple AKS clusters, Fleet Manager offers a meaningful reduction in operational complexity. The investment in setting up a fleet pays for itself the first time a coordinated update runs smoothly across all your clusters while you focus on other work.

Daniel Moquist
Cloud Architect & DevOps Expert

May 20, 2025