What is canary deployment in CI/CD

With the adoption of microservices, cloud, and Kubernetes, CI/CD processes have proved very useful for deploying features and changes into production faster. Faster time to market through software delivery (CI/CD) pipelines is the new normal. What sets a company apart is releasing better, high-quality features that meet consumer expectations and provide an excellent experience.

However, the fear of delivering a faulty software change to production haunts DevOps and SRE teams. That’s why canary deployment is becoming popular among architects and DevOps teams.

What is Canary Deployment?

Canary deployment is a strategy to release software into production gradually. The process involves allowing a fraction of users to use the newly deployed software. If all the criteria, such as performance and quality, are on par with the previous release, more users are allowed to use the new software. This iteration continues until the newly deployed software is rolled out completely to the production users.

Canary Deployment phases in CI/CD

While performing CI/CD using tools such as Spinnaker or Argo CD, DevOps engineers and developers often want to deploy using a canary strategy. Canary deployment is usually implemented in four phases (refer to Fig A).

  1. Deployment: The first phase of canary deployment is the deployment of the new release. In this phase, CD tools and GitOps tools are used to deploy the software.
  2. Release to a small portion of traffic: After the new version is deployed, a small portion of traffic is routed to it, while most of the traffic continues to go to the stable version.
  3. Analysis and validation: In this phase, the canary is tested to see whether it works well in terms of performance and quality. If the canary performs as well as the stable version, more traffic is routed to the canary.
  4. Rollback/Roll forward: Based on the analysis phase, the canary is either rolled back or rolled forward to serve 100% of the production traffic.

Fig A: Phases of Canary deployment in CI/CD

These phases are important for reducing the risks of the software release while maintaining the speed of deployment.
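
The four phases above can be sketched as a simple control loop. This is a minimal illustration rather than how Spinnaker or Argo CD implement it: the function name `run_canary_rollout`, the two callbacks, and the traffic steps (5, 15, 50, 100) are all assumptions made for the example.

```python
from typing import Callable, Sequence


def run_canary_rollout(
    set_canary_weight: Callable[[int], None],
    canary_is_healthy: Callable[[int], bool],
    steps: Sequence[int] = (5, 15, 50, 100),
) -> str:
    """Walk an already deployed canary through increasing traffic shares.

    set_canary_weight: hypothetical hook that tells the traffic manager
        what percentage of requests the canary should receive.
    canary_is_healthy: hypothetical hook that runs the analysis for the
        current percentage and reports whether the canary passed.
    """
    for weight in steps:
        set_canary_weight(weight)          # phase 2: shift traffic to the canary
        if not canary_is_healthy(weight):  # phase 3: analysis and validation
            set_canary_weight(0)           # phase 4: roll back to stable
            return "rolled_back"
    return "rolled_forward"                # phase 4: every step passed


if __name__ == "__main__":
    history = []
    # Pretend the canary starts failing once it receives 50% of traffic.
    outcome = run_canary_rollout(history.append, lambda weight: weight < 50)
    print(outcome, history)
```

In a real pipeline the two callbacks would be backed by the traffic management tool and the monitoring stack; here they are plain Python callables so the flow is easy to follow.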

At times, architects and DevOps teams use the terms canary deployment and canary analysis interchangeably. The next section explains the difference and shows how canary analysis is an integral part of canary deployment.

Canary Deployment vs Canary Analysis

Canary analysis is the part of the canary deployment process in which the Ops team validates the Canary at each increment of traffic percentage. Usually, during canary analysis, an SRE or Ops person collects the metrics and logs of the new release and validates its performance, quality, and security. If all the criteria are met, the traffic to the new version is increased further.

A sound Canary analysis can be a bit tricky because it involves statistical evaluation of the metrics and logs of the new version (Canary). Since the small load on the Canary is not statistically comparable with the much larger load on the baseline version, a third application instance running the baseline version is created and receives the same amount of traffic as the Canary (refer to Fig B). Let us understand the steps in detail.

The steps to perform canary analysis are:

  1. Two applications of the same stable version (baseline) must be created. Let us call them B1 and B2.
  2. A new version of the application is deployed in the same cluster. Let’s call it the canary version.
  3. If you want to route a small percentage of traffic, say 5%, to the Canary, then route another 5% to B1 and the remaining 90% to B2.
  4. After a certain amount of time, such as 4-5 minutes, the Canary is evaluated for performance and quality.
  5. Metrics such as CPU and memory utilization, latency, and throughput of the Canary are compared with those of the baseline B1. If any metric of the Canary falls outside an acceptable range relative to B1, the Canary is deemed unfit for further rollout. Similarly, the application logs, security logs, traffic logs, and API logs are collected to understand the behavior of the Canary. If the behavior is very dissimilar to that of B1, the Canary can be rolled back.
  6. If the Canary performs as well as B1, the traffic percentage can be increased to, say, 15% each for the Canary and B1. The remaining 70% can be directed to B2.

Fig B: Way to perform canary analysis in canary deployment
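
The traffic split and the metric comparison in the steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the function names, the 10% tolerance, and the sample latency values are hypothetical, and a plain comparison of means stands in for the statistical tests a real canary analysis tool would apply. It also assumes lower-is-better metrics (latency, CPU, memory); a higher-is-better metric such as throughput would need the check inverted.

```python
from statistics import mean
from typing import Dict, Sequence


def split_traffic(canary_percent: int) -> Dict[str, int]:
    """Give the Canary and B1 equal shares; B2 takes the remainder."""
    if not 0 <= canary_percent <= 50:
        raise ValueError("canary_percent must be between 0 and 50")
    return {
        "canary": canary_percent,
        "b1": canary_percent,
        "b2": 100 - 2 * canary_percent,
    }


def canary_passes(
    canary: Dict[str, Sequence[float]],
    baseline: Dict[str, Sequence[float]],
    tolerance: float = 0.10,
) -> bool:
    """Return True if no Canary metric mean exceeds B1's mean by more than `tolerance`.

    Assumes every metric is lower-is-better.
    """
    for name, baseline_samples in baseline.items():
        baseline_mean = mean(baseline_samples)
        canary_mean = mean(canary[name])
        if canary_mean > baseline_mean * (1 + tolerance):
            return False
    return True


if __name__ == "__main__":
    print(split_traffic(5))
    # Hypothetical latency samples (milliseconds) collected over the same window.
    b1_metrics = {"latency_ms": [100.0, 102.0, 98.0]}
    canary_metrics = {"latency_ms": [104.0, 101.0, 103.0]}
    print(canary_passes(canary_metrics, b1_metrics))
```

Comparing the Canary against B1 rather than B2 keeps the two sample sets similar in size, which is the reason the third application instance exists.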


Benefits of Canary Deployment

Using a canary deployment strategy in your software delivery process has three important benefits. 

  1. Less risk with one-click rollback: Since Canary involves testing with only a small portion of real-time traffic, it is less risky. In case of any issues, the Canary can be rolled back instantly, and 100% of traffic can be routed to the stable version.
  2. Understanding real-time customer behavior: With a Canary release, developers and business managers can understand customer behavior and response with respect to the new feature released in the Canary.
  3. Experiment culture: Canary helps sustain a culture of build, measure, and learn. Millennials and Gen Z want to experiment with a feature before subscribing to it, and canary testing allows developers to release a feature to the market and test whether the new changes are accepted.

Open source software required to implement canary in CI/CD 

To deploy applications using a canary strategy, two kinds of tools are needed:

  1. Traffic splitting tools: Since the canary deployment strategy involves routing a small portion of production traffic to the new version, the DevOps team requires an L7 traffic management tool. Traditional API gateways, load balancers, and Ingress controllers are enough for the traffic routing, but we recommend reimagining your landscape with service mesh software like Istio. Using the flexible and powerful Istio service mesh, you can implement Canary easily and perform granular analysis at each traffic percentage increment. (Learn the API gateway vs Istio service mesh)

Best tool for handling Canary: Istio service mesh 

  2. Automated deployment tools: To implement the canary strategy in the CI/CD process, one needs to use automated deployment tools such as Spinnaker CD, Argo CD, Argo Rollouts, Tekton, Flux CD, GitHub Actions, etc. The deployment tool acts as an agent that releases the software and instructs the traffic management tool, such as Istio, to update the traffic percentage after each validation.
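
To make the interaction between the two kinds of tools concrete, here is a hedged sketch of what a deployment tool might generate for Istio after each validation: a VirtualService manifest with weighted routes. The service name `checkout` and the subset names `stable` and `canary` are assumptions for illustration; the subsets would have to be defined in a matching DestinationRule, which is not shown.

```python
import json
from typing import Any, Dict


def virtual_service(host: str, canary_weight: int) -> Dict[str, Any]:
    """Build an Istio VirtualService that splits traffic by weight.

    The "stable" and "canary" subsets are hypothetical names and are
    assumed to exist in a separate DestinationRule for `host`.
    """
    if not 0 <= canary_weight <= 100:
        raise ValueError("canary_weight must be between 0 and 100")
    return {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "VirtualService",
        "metadata": {"name": f"{host}-canary"},
        "spec": {
            "hosts": [host],
            "http": [
                {
                    "route": [
                        {
                            "destination": {"host": host, "subset": "stable"},
                            "weight": 100 - canary_weight,
                        },
                        {
                            "destination": {"host": host, "subset": "canary"},
                            "weight": canary_weight,
                        },
                    ]
                }
            ],
        },
    }


if __name__ == "__main__":
    # Render the manifest for a 5% canary; a CD tool would apply it to the cluster.
    print(json.dumps(virtual_service("checkout", 5), indent=2))
```

Raising `canary_weight` step by step and re-applying the manifest is one way the deployment tool could drive the traffic percentage after each successful validation.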

Conclusion

Hope this blog provided a fair understanding of the canary deployment strategy in CI/CD. In the next blog, we will discuss how to implement canary deployment using Istio service mesh.
If you want to start your Istio service mesh journey or need enterprise Istio support, contact us. If you are evaluating Istio and need help with consultation or research on service mesh or Istio, talk to one of our Istio experts.

Debasree Panda


Debasree is the CEO of IMESH. He understands customer pain points in cloud and microservice architecture. Previously, he led product marketing and market research teams at Digitate and OpsMx, where he created a multi-million dollar sales pipeline. He has helped open-source solution providers (Tetrate, OtterTune, and Devtron) design GTM from scratch and achieve product-led growth. He firmly believes serendipity happens to diligent and righteous people.
