Introduction
Istio is the most popular service mesh, but DevOps engineers and SREs frequently complain about its performance.
Istio Ambient is a sidecar-less approach from the Istio community (driven largely by Solo.io) to improve performance. With Ambient mesh being widely promoted as production-ready, many of our prospects and enterprise customers are eager to try it or migrate to it.
Architecturally, Istio Ambient mesh is a sound design aimed at better performance. Whether it actually delivers lower latency in practice is still an open question.
We load tested Istio Ambient mesh repeatedly between January 2024 and July 2024, and we have yet to see any significant performance gains.
Below is the lab setup on which we ran our experiments.
Lab setup to load test Istio Ambient Mesh
- Load testing tool: Fortio
- Application configuration: Bookinfo Application
- Load profile: 1000 queries per second (QPS) over 10 connections for 30 seconds
- Cluster Configuration: Azure (AKS) clusters with 3 nodes
- Node Configuration: 2 vCPUs and 7 GB memory for each node
- CNIs used: Kube CNI and Cilium. (We did not use Flannel because it did not work well with AKS.)
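As a rough illustration of the load profile above, the following sketch assembles a `fortio load` command line for 1000 QPS, 10 connections, and 30 seconds. The target URL is a hypothetical placeholder, and the percentile list and JSON output file name are assumptions rather than the exact settings used in our runs.

```python
import shlex

# Hypothetical target: replace with the address that exposes the service under test.
TARGET_URL = "http://ratings.example.internal:9080/ratings/1"


def fortio_command(url, qps=1000, connections=10, duration="30s", json_out="run.json"):
    """Build a `fortio load` invocation matching the lab load profile."""
    return [
        "fortio", "load",
        "-qps", str(qps),        # target queries per second
        "-c", str(connections),  # number of parallel connections
        "-t", duration,          # how long to keep sending load
        "-p", "50,90,99",        # percentiles to report (assumed choice)
        "-json", json_out,       # write the full result as JSON (assumed file name)
        url,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(fortio_command(TARGET_URL)))
```

The script only prints the command; it can then be pasted into a shell on the node where Fortio runs.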
Note:
- The application and Fortio run on different nodes.
- We exposed the Rating microservice, NOT the Details service, to handle external traffic. The Details microservice is written in Ruby and does not cope well with higher QPS: without Istio, its p99 latency is around 6 ms at 100 QPS but rises to about 50 ms at 1000 QPS. Refer to the screenshots below.
Fig: Performance results of Details service at 1000 QPS with Kube CNI + Without Istio
Fig: Performance results of Details service at 1000 QPS with Kube CNI + with Istio sidecar
Performance test on Istio Ambient Mesh with Kube CNI and Cilium
We ran the load test for the following six cases:
- Kube CNI
- Kube CNI + Istio sidecar (mTLS enabled)
- Kube CNI + Istio Ambient mesh (mTLS enabled)
- Cilium CNI
- Cilium CNI + Istio sidecar (mTLS enabled)
- Cilium CNI + Istio Ambient mesh (mTLS enabled)
Although we tested each case multiple times, we have attached at most three screenshots per case to show the run-to-run variation in p99 latency.
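The median and spread of p99 latency across repeated runs can be computed along the following lines. This is a minimal sketch, assuming each run was saved as a Fortio JSON result and that the result exposes percentiles under `DurationHistogram.Percentiles` with values in seconds; the directory layout is hypothetical.

```python
import json
import statistics
from pathlib import Path


def p99_ms(result):
    """Return the p99 latency in milliseconds from one Fortio JSON result.

    Assumes entries of the form {"Percentile": 99, "Value": <seconds>}.
    """
    for entry in result["DurationHistogram"]["Percentiles"]:
        if entry["Percentile"] == 99:
            return entry["Value"] * 1000.0
    raise ValueError("no p99 entry in result")


def summarize(p99_values_ms):
    """Median and sample standard deviation of p99 across repeated runs."""
    values = list(p99_values_ms)
    spread = statistics.stdev(values) if len(values) > 1 else 0.0
    return {
        "runs": len(values),
        "median_ms": statistics.median(values),
        "stdev_ms": spread,
    }


def summarize_directory(directory):
    """Summarize every *.json Fortio result in a directory (hypothetical layout)."""
    paths = sorted(Path(directory).glob("*.json"))
    results = [json.loads(path.read_text()) for path in paths]
    return summarize(p99_ms(result) for result in results)
```

Reporting the median rather than the mean keeps a single slow run from dominating the figure quoted for each case.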
Load test results for Kube CNI without Istio
Observed (Median) P99 latency: 1.12 ms
Fig 1: Kube CNI + Without Istio
Fig 2: Kube CNI + Without Istio
Load test of Kube CNI and Istio sidecar (mTLS enabled)
Observed (Median) P99 latency: 4.72 ms
Fig 3: Kube CNI + With Istio Sidecar (mTLS enabled)
Fig 4: Kube CNI + With Istio Sidecar (mTLS enabled)
Fig 5: Kube CNI + With Istio Sidecar (mTLS enabled)
Load test of Kube CNI and Istio Ambient mesh (mTLS enabled)
Observed (Median) P99 latency: 3.6 ms
Fig 6: Kube CNI + With Istio Ambient (mTLS enabled)
Fig 7: Kube CNI + With Istio Ambient (mTLS enabled)
Fig 8: Kube CNI + With Istio Ambient (mTLS enabled)
Load test of Cilium CNI without Istio
Observed (Median) P99 latency: 4.5 ms
Fig 9: Cilium CNI + Without Istio
Fig 10: Cilium CNI + Without Istio
Fig 11: Cilium CNI + Without Istio
Load test of Cilium CNI and Istio sidecar (mTLS enabled)
Observed (Median) P99 latency: 8.8 ms
Fig 12: Cilium CNI + With Istio Sidecar
Fig 13: Cilium CNI + With Istio Sidecar
Fig 14: Cilium CNI + With Istio Sidecar
Load test of Cilium CNI and Istio Ambient mesh (mTLS enabled)
Observed (Median) P99 latency: 6.8 ms
Fig 15: Cilium CNI + With Istio Ambient
Fig 16: Cilium CNI + With Istio Ambient
Fig 17: Cilium CNI + With Istio Ambient
Final load test results and benchmarking of Rating service with and without Istio
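Here are the benchmarking results for the median p99 latency of the Rating service with and without Istio (sidecar and Ambient mesh), collected from the cases above:
- Kube CNI: 1.12 ms without Istio, 4.72 ms with Istio sidecar, 3.6 ms with Istio Ambient mesh
- Cilium CNI: 4.5 ms without Istio, 8.8 ms with Istio sidecar, 6.8 ms with Istio Ambient mesh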
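To make the comparison concrete, the relative differences can be worked out from the median p99 values reported in the sections above. The sketch below only does that arithmetic; the dictionary keys and function names are illustrative choices, not part of any tool we used.

```python
# Median p99 latency in milliseconds, as reported for each case above.
MEDIAN_P99_MS = {
    "Kube CNI": {"no_istio": 1.12, "sidecar": 4.72, "ambient": 3.6},
    "Cilium CNI": {"no_istio": 4.5, "sidecar": 8.8, "ambient": 6.8},
}


def percent_lower(new, old):
    """How much lower `new` is than `old`, as a percentage of `old`."""
    return (old - new) / old * 100.0


def compare(medians):
    """Latency added by each mesh mode and Ambient's gain over the sidecar."""
    rows = []
    for cni, m in medians.items():
        rows.append({
            "cni": cni,
            "sidecar_added_ms": m["sidecar"] - m["no_istio"],
            "ambient_added_ms": m["ambient"] - m["no_istio"],
            "ambient_vs_sidecar_pct": percent_lower(m["ambient"], m["sidecar"]),
        })
    return rows


if __name__ == "__main__":
    for row in compare(MEDIAN_P99_MS):
        print(
            f"{row['cni']}: sidecar adds {row['sidecar_added_ms']:.2f} ms, "
            f"Ambient adds {row['ambient_added_ms']:.2f} ms, "
            f"Ambient is {row['ambient_vs_sidecar_pct']:.1f}% lower than sidecar"
        )
```

With these medians, Ambient's p99 comes out about 23.7% lower than the sidecar's on Kube CNI and about 22.7% lower on Cilium, while still adding roughly 2.5 ms and 2.3 ms over the respective no-mesh baselines.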
Conclusion
We draw three conclusions from these experiments:
- Istio Ambient mesh does not deliver dramatic latency improvements, and it remains slower than plain Kube CNI without a mesh (3.6 ms versus 1.12 ms median p99 in our tests). Using ztunnel for encryption still involves extra network hops, which increase latency. It is, however, faster than the Istio sidecar architecture.
- Irrespective of the CNI used, the p99 latency of Istio Ambient mesh is roughly 20% lower than that of the Istio sidecar (3.6 ms versus 4.72 ms with Kube CNI, and 6.8 ms versus 8.8 ms with Cilium).
- Combining Cilium and Istio (sidecar or Ambient) produced the highest latencies in our tests (8.8 ms and 6.8 ms median p99, respectively). If you are looking for performance improvements, you should avoid this combination.