Building a Resilient Multicloud EKS Infrastructure: AWS + Azure Hybrid Nodes
- Darren Lavery

- Jul 7
- 5 min read
Updated: Jul 7
In today’s cloud-native world, resilience and flexibility aren’t just nice-to-haves: they’re essentials. What if you could combine the best of Amazon EKS with the regional reach or compliance capabilities of Azure, or another cloud entirely? That’s exactly what hybrid nodes let you do.
By extending your EKS worker node pool to include Azure-based VMs, you can run Kubernetes workloads across both AWS and Azure — without spinning up a second control plane or managing two clusters. In this article, we’ll break down the architecture, infrastructure code (with Terraform), and practical tips to get you started with a production-ready multicloud EKS deployment.
Why Go Hybrid?
Here’s why hybrid cloud Kubernetes setups are worth considering:
Business continuity: Workloads on Azure nodes remain online even if AWS compute hits issues, although control-plane outages will pause scheduling and cluster-level operations.
Cost optimization: Keep critical workloads on AWS and push less-sensitive tasks to lower-cost Azure resources.
Regional flexibility: Reach users in locations where AWS doesn't operate or meet compliance needs that Azure better serves.
Unified management: Run all workloads under a single Amazon EKS control plane, even though your worker nodes span multiple clouds.
Let’s dive in.
At the heart of this setup is a fully managed EKS control plane running in AWS. What makes it unique? The worker nodes live in Azure. By securely linking the two clouds — using VPN or Transit Gateway — we turn Azure VMs into first-class EKS nodes using nodeadm and AWS Systems Manager. The result: a seamless hybrid Kubernetes cluster that spans clouds without doubling your control plane.
Multicloud EKS Infrastructure Architecture Overview

As stated above, the core of this setup is a fully managed Amazon EKS control plane running inside an AWS VPC. However, instead of hosting all the worker nodes within AWS, this hybrid configuration extends the EKS cluster to Azure, using Linux virtual machines (such as Ubuntu 24.04) as remote Kubernetes worker nodes.
These Azure VMs are seamlessly integrated with the EKS cluster over a secure VPN connection or AWS Transit Gateway, allowing them to function as first-class EKS nodes. This is made possible through the use of nodeadm, a specialized tool that handles registration and lifecycle management of external nodes in EKS.
To ensure secure and automated management, AWS Systems Manager (SSM) is used to bootstrap, configure, and manage these hybrid worker nodes, even though they reside outside AWS. This enables centralized control and eliminates the need for SSH access or manual intervention.
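To make the nodeadm registration step concrete, here is a minimal sketch of the configuration file nodeadm consumes on a hybrid node. The field names follow the node.eks.aws NodeConfig schema used by EKS hybrid nodes, but the cluster name, region, and activation values are placeholders; verify the schema against the nodeadm version you install.

```yaml
# Sketch of a nodeadm NodeConfig for a hybrid node (all values are placeholders).
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
  cluster:
    name: hybrid-eks-cluster        # your EKS cluster name
    region: af-south-1              # region hosting the EKS control plane
  hybrid:
    ssm:
      activationId: "<ssm-activation-id>"      # from the SSM hybrid activation
      activationCode: "<ssm-activation-code>"
```

The activation ID and code come from the SSM hybrid activation created on the AWS side, which is what lets a VM outside AWS authenticate to the cluster.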
For cross-cloud pod networking, Cilium is deployed as the CNI (Container Network Interface). Cilium provides robust features such as BGP peering for routing, encryption of inter-node traffic, and deep visibility and control over network policies and flow, making it a powerful choice for hybrid and multi-cloud Kubernetes networking.
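As a rough illustration, a Helm values file for such a Cilium deployment might enable the BGP control plane and WireGuard encryption. The keys below reflect common options in the Cilium Helm chart, but they should be checked against the chart version you deploy:

```yaml
# Sketch of Cilium Helm values for hybrid networking (verify keys against your chart version).
bgpControlPlane:
  enabled: true            # manage BGP peering via Cilium CRDs
encryption:
  enabled: true
  type: wireguard          # encrypt inter-node pod traffic crossing the VPN
ipam:
  mode: cluster-pool       # pod CIDRs allocated by Cilium rather than the cloud provider
```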
This architecture enables organizations to run a unified Kubernetes environment that spans multiple clouds, combining the management simplicity of EKS with the flexibility of Azure-hosted workloads.
Key Networking Tips for Hybrid EKS-Azure Deployments:
Networking is often the most complex aspect of hybrid Kubernetes setups — especially when connecting AWS and Azure. Here are some best practices to ensure a smooth, secure deployment:
Use AWS Site-to-Site VPN for the quick path. For dedicated private connectivity, pair AWS Direct Connect with Azure ExpressRoute through a provider (e.g., Megaport) or a co-location cross-connect.
Deploy virtual network gateways on both cloud platforms with non-overlapping CIDR blocks to prevent routing conflicts.
Configure firewall rules and Azure NSGs to allow required traffic flows — including access to the Kubernetes API server, pod CIDRs, and other critical components.
Cilium plays a vital role in bridging the clouds: it enables encrypted pod-to-pod communication, BGP-powered routing, and high-performance CNI functionality — all essential for cross-cloud networking.
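To make the first tip concrete, the AWS side of a Site-to-Site VPN toward Azure can be sketched in Terraform as below. The resource names, gateway IP, and Transit Gateway reference are illustrative; 65515 is the default ASN of an Azure VPN gateway:

```hcl
# Sketch: AWS side of a Site-to-Site VPN to an Azure VPN gateway (illustrative values).
resource "aws_customer_gateway" "azure" {
  bgp_asn    = 65515            # default ASN of an Azure VPN gateway
  ip_address = "20.0.0.10"      # public IP of the Azure virtual network gateway
  type       = "ipsec.1"
}

resource "aws_vpn_connection" "to_azure" {
  transit_gateway_id  = aws_ec2_transit_gateway.main.id
  customer_gateway_id = aws_customer_gateway.azure.id
  type                = "ipsec.1"
}
```

The matching Azure side (a virtual network gateway plus local network gateway pointing at the AWS tunnel endpoints) is configured with azurerm resources.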
Terraform: The Automation Backbone for Your Multicloud EKS Infrastructure
All infrastructure components are defined using Terraform, enabling reproducibility, modularity, and consistent deployment across both AWS and Azure. Our modular approach makes it easy to manage and extend the hybrid setup across environments.
What the Code Builds:
On AWS:
Core Infrastructure:
aws_eks_cluster to run the managed Kubernetes control plane
aws_iam_role, aws_vpc, subnets, and route tables for secure, scalable networking
aws_ec2_transit_gateway for cross-cloud or multi-VPC routing
Hybrid Node Management:
aws_ssm_activation and related IAM roles allow AWS Systems Manager (SSM) to securely control Azure-based worker nodes
On Azure:
Hybrid Worker Nodes:
azurerm_linux_virtual_machine (Ubuntu 24.04 recommended)
Networking components: azurerm_virtual_network_gateway, NICs, IPs, and NSGs
Node Bootstrapping:
Cloud-init or provisioning scripts install nodeadm and register the VM with the EKS cluster
Kubernetes + Cilium:
Cilium is installed using Helm to power advanced CNI features, with:
BGP peering for hybrid pod networking
Load balancer IP pools and encryption for secure, cross-cloud traffic
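The hybrid node management piece above can be sketched as follows. The role name and registration limit are assumptions; the key detail is that the IAM role must trust ssm.amazonaws.com so registered VMs can assume it:

```hcl
# Sketch: SSM hybrid activation used to enroll Azure VMs (illustrative names).
resource "aws_iam_role" "hybrid_nodes" {
  name = "eks-hybrid-nodes"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "ssm.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

resource "aws_ssm_activation" "hybrid" {
  name               = "eks-hybrid-activation"
  iam_role           = aws_iam_role.hybrid_nodes.id
  registration_limit = 5   # how many VMs may register with this activation
}
```

The activation's ID and code are then passed to each Azure VM at bootstrap time.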
Required Terraform Providers:
terraform {
  required_providers {
    aws        = ">= 5.47.0"
    azurerm    = "~> 3.0"
    azapi      = "~> 1.5"
    helm       = "~> 2.0"
    kubernetes = "~> 2.0"
  }
}
Our team has developed this hybrid EKS deployment using production-grade Terraform modules and cloud-init scripts. While this article focuses on the architecture and approach, we’re happy to share the code for the right use case — just reach out.
It is important to note that only certain operating systems are supported for hybrid nodes. These include:
| OS | Notes |
| --- | --- |
| Amazon Linux 2023 | Use in virtualized environments |
| Ubuntu 20.04–24.04 | Fully supported for hybrid usage |
| RHEL 8 & 9 | Enterprise-ready with hybrid capabilities |
Billing is also a key consideration when running hybrid node clusters. Keep the following in mind:
AWS charges for the EKS control plane and associated services (e.g., SSM, VPN).
Azure VMs are billed separately under your Azure subscription.
Hybrid node billing starts once a VM joins the EKS cluster and stops when it is de-registered.
Be proactive with resource cleanup to avoid unnecessary costs.
Tips & Advanced Considerations for Building a Hybrid Cluster
Once your basic hybrid architecture is in place, there are several advanced considerations and optimizations worth exploring. These can improve performance, security, and operational efficiency when connecting Azure VMs to your AWS-hosted EKS cluster.
Advanced: Cilium & BGP Peering for Cross-Cloud Networking
Cilium is an essential part of the hybrid networking layer, enabling high-performance networking and transparent security policies.
· Install Cilium in EKS using Helm with a templated BGP config file:
helm install cilium cilium/cilium --values cilium-bgp.yaml
· Define BGP peering rules via YAML templates (cilium-bgp.yaml.tpl) to control traffic between cloud zones and VMs.
· Validate hybrid pod-to-pod networking by testing connectivity between workloads on AWS and Azure nodes.
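A rendered cilium-bgp.yaml might look roughly like the CiliumBGPPeeringPolicy below. The ASNs, peer address, and node label are placeholders, and the exact CRD shape (cilium.io/v2alpha1 here) depends on your Cilium version:

```yaml
# Sketch of a Cilium BGP peering policy (placeholder ASNs and addresses).
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeeringPolicy
metadata:
  name: hybrid-bgp
spec:
  nodeSelector:
    matchLabels:
      bgp: enabled                 # label your hybrid nodes accordingly
  virtualRouters:
    - localASN: 64512              # placeholder ASN for the cluster side
      exportPodCIDR: true          # advertise pod CIDRs to the peer
      neighbors:
        - peerAddress: "10.30.0.1/32"   # placeholder router/VPN endpoint
          peerASN: 64513
```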
For detailed guidance, refer to AWS’ official hybrid nodes documentation, which outlines the lifecycle of hybrid nodes and control plane interactions.
Example Terraform Snippets to Get You Started
To give your team a head start, here are some simplified Terraform examples used in this project:
Create the AWS VPC using a module:
module "vpc" {
  source = "terraform-aws-modules/vpc/aws"

  name = "hybrid_eks_vpc"
  cidr = "10.20.0.0/16"

  azs             = ["af-south-1a", "af-south-1b"]
  private_subnets = ["10.20.1.0/24", "10.20.2.0/24"]
  public_subnets  = ["10.20.101.0/24", "10.20.102.0/24"]

  enable_nat_gateway = true
  enable_vpn_gateway = true
}
Provision an Azure Linux VM for use as a hybrid worker node:
resource "azurerm_linux_virtual_machine" "hybrid-node-vm" {
  name                = "azure-hybrid-node"
  resource_group_name = azurerm_resource_group.rg.name
  location            = "South Africa North"
  size                = "Standard_DS2_v2"
  admin_username      = "ubuntu"
  ...
}
You’ll want to bootstrap these VMs using cloud-init or provisioning scripts to pass in the SSM activation code and install nodeadm, enabling the VM to register with your EKS cluster securely.
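That bootstrap step might look roughly like the cloud-init sketch below. The nodeadm download URL, Kubernetes version, flags, and file paths are assumptions to adapt from AWS’ hybrid nodes documentation; the `${...}` variables are assumed to be injected by Terraform’s templatefile():

```yaml
#cloud-config
# Sketch of a hybrid node bootstrap (URL, version, and paths are illustrative).
write_files:
  - path: /etc/nodeadm/nodeConfig.yaml
    content: |
      apiVersion: node.eks.aws/v1alpha1
      kind: NodeConfig
      spec:
        cluster:
          name: hybrid-eks-cluster
          region: af-south-1
        hybrid:
          ssm:
            activationId: "${activation_id}"
            activationCode: "${activation_code}"
runcmd:
  - curl -Lo /usr/local/bin/nodeadm "https://hybrid-assets.eks.amazonaws.com/latest/bin/linux/amd64/nodeadm"
  - chmod +x /usr/local/bin/nodeadm
  - /usr/local/bin/nodeadm install 1.31 --credential-provider ssm
  - /usr/local/bin/nodeadm init -c file:///etc/nodeadm/nodeConfig.yaml
```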
Final Thoughts
Hybrid Kubernetes clusters are no longer niche. Whether for regulatory compliance, cloud independence, or disaster resilience, this architecture offers real operational value.
By combining:
· Amazon EKS for a fully-managed control plane
· Azure VMs for elastic compute resources
· Cilium for secure, cross-cloud networking
· Terraform for unified infrastructure as code
...you can build a multicloud-native Kubernetes environment that feels like a single, cohesive cluster — with no duplicate control planes or manual sync headaches.
If you’re ready to go hybrid, chat to us at cloudandthings.io.



