I hear time and again from fellow DevOps folks about how hard it is to configure and manage the Istio service mesh. Most of them are just starting out with Istio, and much of the complexity they face comes from its sidecar architecture.
Istio maintainers have been working on a solution to make Istio easier to implement and manage, both for existing users and newcomers. And last year, Istio introduced ambient mesh.
Ambient mesh is a sidecar-less data plane mode that is set to replace the default sidecar Istio implementation. It is designed to be faster and lighter than the sidecar model.
Let us look at the Istio sidecar architecture, its drawbacks, and the new ambient mesh. Let us see how the architecture differs and what ambient mesh has to offer for DevOps folks and architects.
Istio sidecar architecture
In this architecture, a sidecar proxy intercepts the traffic to and from the application container and provides service mesh features (networking, security, and observability) on top of it (see Fig. A).
Fig. A – Sidecar proxy intercepts service-to-service traffic and provides security, observability, and networking features of Istio service mesh
Each pod in the mesh is accompanied by a sidecar Envoy proxy. The proxy handles both L4 and L7 processing: routing, security (mTLS), and telemetry.
The collection of Envoy proxies forms the data plane component of Istio. The control plane component, called Istiod, controls and configures the behavior of the data plane (refer to Fig. B).
Fig. B – Istio sidecar architecture and components
Challenges with the Istio sidecar pattern
Although the sidecar implementation is the default way to deploy Istio, it is not ideal, for the following reasons.
High memory and CPU utilization
In the sidecar model, each pod in the mesh needs to have the sidecar container running. That means if there are 1000 pods in the cluster, DevOps folks have to allocate compute and memory resources for not only 1000 application containers but also for the same number of Envoy proxies.
Besides, DevOps teams need to factor in worst-case usage when allocating resources. With a proxy container in every pod, each allocation ends up higher than the average pod actually requires, which risks under-utilized resources across the cluster.
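To make the allocation problem concrete, here is a rough back-of-the-envelope sketch in Python. The per-proxy figures and the names `ProxyRequest` and `sidecar_reservation` are illustrative assumptions, not measurements from Istio; plug in numbers from your own cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProxyRequest:
    """Resources tied to one sidecar proxy (assumed values, not measurements)."""
    cpu_millicores: int
    memory_mib: int


def sidecar_reservation(pod_count: int, per_proxy: ProxyRequest) -> ProxyRequest:
    """Total proxy resources when every pod in the mesh carries a sidecar."""
    return ProxyRequest(
        cpu_millicores=pod_count * per_proxy.cpu_millicores,
        memory_mib=pod_count * per_proxy.memory_mib,
    )


# Assumption: each sidecar is sized for a worst case of 100m CPU / 128 MiB,
# but consumes only 20m CPU / 50 MiB on average.
worst_case = ProxyRequest(cpu_millicores=100, memory_mib=128)
average_use = ProxyRequest(cpu_millicores=20, memory_mib=50)

reserved = sidecar_reservation(1000, worst_case)
used = sidecar_reservation(1000, average_use)

idle_cpu = reserved.cpu_millicores - used.cpu_millicores
idle_memory = reserved.memory_mib - used.memory_mib

print(f"Reserved for proxies: {reserved.cpu_millicores / 1000:.0f} cores, "
      f"{reserved.memory_mib / 1024:.0f} GiB")
print(f"Reserved but idle:    {idle_cpu / 1000:.0f} cores, "
      f"{idle_memory / 1024:.1f} GiB")
```

The gap between the two lines is the capacity that stays reserved for proxies but does no work for most of the day, and it grows linearly with the number of pods.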
Fixed operational cost
Envoy proxy handles both the L4 and L7 capabilities of Istio in the sidecar model. Even if the DevOps folks do not need any L7 features but just encryption of data in transit (mTLS), they still need to maintain the sidecar. A sidecar is overkill in such scenarios, and it can introduce unnecessary complexity.
In the sidecar pattern, the proxy is tightly coupled to the application in the pod. So whenever the sidecar is changed or updated, DevOps teams need to restart the entire pod. This is not ideal, as it can lead to service disruption.
The problem here is that, although Istio can manipulate the traffic without changing the application code, there is no complete separation between the proxy and application container in the sidecar pattern.
There are other issues with the sidecars that often bug mesh operators, apart from the ones mentioned above.
Sometimes sidecars consume more resources than the application itself because of improper configuration, for example. Also, some sidecars do not get updated during their lifecycle management, causing version compatibility issues.
Istio maintainers have been considering all these problems and working towards making Istio better, in all aspects. The new Istio ambient mesh is a promising step towards it.
Introducing Istio ambient mesh
Istio ambient mesh is a sidecar-less implementation of Istio. It is comparatively faster and more lightweight, and it provides all the L4 and L7 functionality of Istio.
Istio ambient mesh architecture
Istio ambient mesh splits the sidecar functionality into two components: ztunnel and waypoint proxy.
Zero trust tunnel or Ztunnel
Ztunnel is a lightweight, Rust-based agent deployed as a per-node daemon (a Kubernetes DaemonSet) (see Fig. C). It handles mTLS, authentication, L4 authorization, TCP metrics, and logging for the pod traffic on the node.
Unlike sidecars, ztunnel is deployed per node rather than per pod, and it securely connects and authenticates workloads in the mesh.
The connection between ztunnels happens over an HTTP tunnel called HBONE (HTTP-Based Overlay Network Environment), where the traffic is encrypted with mTLS (see Fig. C).
Fig. C – Ztunnels connected to each other over HTTP (called HBONE or HTTP-Based Overlay Network Environment)
No HTTP/L7 processing happens at ztunnel. Ambient mesh has a dedicated waypoint proxy component for that.
Waypoint proxy is essentially Envoy deployed as a standalone pod, not as a sidecar container. It is deployed per namespace or service account (see Fig. D).
Waypoint proxy leverages the Istio VirtualService resource to handle advanced traffic management (circuit breaking, traffic splitting, retries, fault injection, rate limiting, etc.) and processes L7 authorization and L7 telemetry.
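As an illustration of the kind of L7 rule a waypoint proxy would act on, the sketch below builds a VirtualService manifest that splits traffic between two versions of a service. The service name `reviews`, the subsets `v1` and `v2`, the 90/10 split, and the helper name `traffic_split` are all hypothetical. The field layout follows the Istio VirtualService API as I understand it, so check it, along with the level of VirtualService support in ambient mode, against the reference for your Istio version.

```python
import json


def traffic_split(host: str, weights: dict[str, int]) -> dict:
    """Build a VirtualService manifest that splits HTTP traffic by subset weight."""
    if sum(weights.values()) != 100:
        raise ValueError("route weights must add up to 100")
    return {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "VirtualService",
        "metadata": {"name": f"{host}-split"},
        "spec": {
            "hosts": [host],
            "http": [
                {
                    # One route entry per subset, each carrying its share of traffic.
                    "route": [
                        {
                            "destination": {"host": host, "subset": subset},
                            "weight": weight,
                        }
                        for subset, weight in weights.items()
                    ]
                }
            ],
        },
    }


# Hypothetical example: send 90% of requests to v1 and 10% to v2.
manifest = traffic_split("reviews", {"v1": 90, "v2": 10})
print(json.dumps(manifest, indent=2))
```

The subsets referenced here would also need a matching DestinationRule, which is left out to keep the sketch short.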
With waypoint proxy, DevOps teams get the benefit of deploying Envoy only for the pods/services that require HTTP/L7 processing.
When a waypoint proxy is deployed, the traffic flows from the source ztunnel to the destination ztunnel through the proxy, before finally reaching the respective destination service (see Fig. D below).
Fig. D – Traffic flow in Istio ambient mesh with waypoint proxy
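The flow described above can be sketched as a small Python model. This is only a mental model of the hop order, not how Istio is implemented; the function name `request_path` and the service names are made up for the example.

```python
def request_path(source: str, destination: str, needs_l7: bool) -> list[str]:
    """Return the ordered hops a request takes in this simplified model."""
    hops = [source, "source ztunnel"]
    if needs_l7:
        # L7 routing, authorization, and telemetry happen at the waypoint proxy.
        hops.append("waypoint proxy")
    hops += ["destination ztunnel", destination]
    return hops


# Hypothetical services: "frontend" calling "payments".
l4_only = request_path("frontend", "payments", needs_l7=False)
with_waypoint = request_path("frontend", "payments", needs_l7=True)

print(" -> ".join(l4_only))
print(" -> ".join(with_waypoint))
```

The only difference between the two paths is the extra waypoint hop, which is why workloads that need just mTLS and L4 policy never pay for Envoy.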
Benefits of Istio ambient mesh
Istio ambient mesh provides all the features of Istio sidecars in a better, more efficient way. Below are some benefits of ambient mesh.
Reduced resource and cost overhead
Ambient mesh is built in a modular fashion. Two components split the L4 and L7 capabilities of Istio between them, unlike the sidecar model, where a single component handles both.
As a result, no sidecar containers run alongside the application in pods. DevOps teams no longer have to reserve proxy resources in every pod, which reduces under-utilization of resources cluster-wide.
Basic zero-trust networking (mTLS) and L4 functionality are handled at the node level by ztunnels, which are lightweight and far less resource-intensive than sidecars. More resources are needed only when waypoint proxies are introduced, and since these run as regular Kubernetes pods, they can be autoscaled based on real-time traffic, which saves resources and cost.
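A quick way to see the difference is to count proxy instances in each mode. The sketch below does only that; the cluster shape (1000 pods, 50 nodes, 5 waypoints with 2 replicas each) and the function names are hypothetical, and the count says nothing about how much each instance consumes.

```python
def sidecar_proxy_count(pods: int) -> int:
    """Sidecar mode: one Envoy proxy per pod."""
    return pods


def ambient_proxy_count(nodes: int, waypoints: int, replicas_per_waypoint: int = 1) -> int:
    """Ambient mode: one ztunnel per node, plus waypoint pods only where L7 is needed."""
    return nodes + waypoints * replicas_per_waypoint


# Hypothetical cluster: 1000 pods on 50 nodes, 5 waypoints running 2 replicas each.
sidecar_total = sidecar_proxy_count(1000)
ambient_total = ambient_proxy_count(nodes=50, waypoints=5, replicas_per_waypoint=2)

print(f"Sidecar mode: {sidecar_total} proxy instances")
print(f"Ambient mode: {ambient_total} proxy instances")
```

In sidecar mode the count tracks the number of pods, while in ambient mode it tracks the number of nodes and the number of workloads that actually need L7 processing.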
Zero downtime and better network security
Since there are no sidecar containers tightly integrated with applications in pods in ambient mesh, there is better isolation between applications and the mesh components. The isolation provides better network security overall, as ztunnels and waypoint proxies can still enforce strict authn/z policies on traffic even to a compromised workload.
The architecture also simplifies the lifecycle management of ztunnels and waypoint proxies, as upgrading them does not require restarting the workloads in the mesh. Upgrades can be done independently without disturbing the pods, ensuring zero downtime.
Increased performance and operational efficiency
With Istio ambient mesh, DevOps teams have fewer components to manage. It also helps to significantly reduce latency and increase the overall performance of the mesh. And it saves teams from tedious sidecar lifecycle management.
The ambient mesh architecture also gives great flexibility for DevOps and architects. They can implement Istio gradually into their environments, by using basic but critical functionalities like mTLS and network transport layer (L4) features first and then implementing advanced application layer features (L7) later.
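A gradual rollout of this kind might look like the following sketch, which generates a Namespace manifest in two stages: first with only the ambient label (L4 and mTLS through ztunnel), later with a waypoint attached for L7. The label keys are taken from the Istio ambient documentation as I understand it (`istio.io/use-waypoint` in particular exists only in newer releases), so verify them for your version. The namespace name `shop`, the waypoint name `waypoint`, and the helper `namespace_manifest` are hypothetical.

```python
import json
from typing import Optional

# Label keys assumed from the Istio ambient docs; verify for your Istio version.
AMBIENT_LABEL = "istio.io/dataplane-mode"
WAYPOINT_LABEL = "istio.io/use-waypoint"


def namespace_manifest(name: str, waypoint: Optional[str] = None) -> dict:
    """Build a Namespace manifest enrolled in ambient mode, optionally with a waypoint."""
    labels = {AMBIENT_LABEL: "ambient"}
    if waypoint is not None:
        # Stage 2: route the namespace's traffic through a waypoint for L7 features.
        labels[WAYPOINT_LABEL] = waypoint
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": name, "labels": labels},
    }


stage_one = namespace_manifest("shop")                       # mTLS and L4 only
stage_two = namespace_manifest("shop", waypoint="waypoint")  # adds L7 processing

print(json.dumps(stage_one, indent=2))
print(json.dumps(stage_two, indent=2))
```

The waypoint itself still has to be deployed separately; this sketch covers only the namespace side of the rollout.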
Tabular comparison: Istio ambient mesh vs Istio sidecar service mesh
The ambient mesh implementation is expected to boost the performance of the Istio service mesh. However, it is still in beta, so there is some time to go before the rubber hits the road.
Below is a comparison table between Istio ambient mesh and Istio sidecar mesh across the feature, operations, and performance dimensions.
How to implement Istio ambient mesh in Kubernetes
I urge DevOps teams and architects to test ambient mesh in staging as early as they can, so that they are better prepared when it is production-ready.
We have already published a detailed tutorial, How to Implement Istio Ambient Mesh in GKE or AKS. It shows how to install ambient mesh and enable L4 and L7 authorization for services running in the mesh. If you are looking to install Istio ambient mesh on AWS EKS, check this out: Implement Istio Ambient Mesh on EKS in 5 Steps.
IMESH helps enterprises adopt Istio in their production environments (check our managed Istio offerings). We also help architects and DevOps teams run dedicated POCs on Istio ambient mesh and evaluate it for their non-prod environments.