The Problem with `RollingUpdate`

Kubernetes Deployment objects are a massive improvement over manual releases, but their default RollingUpdate strategy is too blunt for mission-critical services. It replaces old pods with new ones at a pace governed only by pod readiness. You can pause it, but there is no built-in verification step, no way to control how much traffic the new version receives independently of pod counts, and no automatic rollback when metrics degrade. If something goes wrong, a portion of your users is already affected. This leads to release anxiety and encourages teams to ship less frequently.
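For reference, this is roughly what tuning a RollingUpdate looks like. It is a minimal sketch: the Deployment name, image, and numbers are placeholders rather than values from a real service.

```yaml
# Illustrative Deployment; name, image, and numbers are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 5
  minReadySeconds: 10      # A new pod must stay ready this long before it counts as available
  selector: { matchLabels: { app: my-app } }
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # At most 1 pod above the desired count during the update
      maxUnavailable: 0    # Never dip below the desired number of available pods
  template:
    metadata: { labels: { app: my-app } }
    spec:
      containers:
      - name: web
        image: my-registry/my-app:1.1.0
        ports: [{ containerPort: 8080 }]
```

The only levers are how many pods may be added or removed at a time and how long a pod must stay ready. Nothing in the spec refers to traffic share or metrics.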

To truly de-risk deployments, we must separate the deployment of new code from its release to users. This is the core principle of progressive delivery, and Argo Rollouts is the premier tool for implementing it on Kubernetes.

What is Argo Rollouts?

Argo Rollouts is a Kubernetes controller, together with a set of Custom Resource Definitions (CRDs), that provides a Rollout resource as a replacement for the standard Deployment object. A Rollout offers powerful, declarative strategies for managing releases, including:

Blue-Green Deployments: Run two versions side-by-side and cut traffic over instantly.
Canary Deployments: Gradually shift a small percentage of traffic to the new version while monitoring for errors.
Automated Analysis: Integrate with monitoring tools like Prometheus or Datadog to automatically verify a release's health before, during, and after promotion.

This guide will walk you through both strategies with detailed examples.

A Note on Diagrams: The diagrams in this post are written in Mermaid syntax.

Strategy 1: Blue-Green Deployment

Best for: Applications where you need to switch all traffic at once and cannot have two different versions of the API serving live traffic simultaneously.

This strategy involves running two full-fledged production environments: Blue (the current, stable version) and Green (the new, unreleased version). Traffic is only switched to the Green environment after it has been thoroughly tested and verified.

Blue-Green Workflow Diagram

graph TD
    subgraph Legend
        direction LR
        L1(Live Traffic) --- L2(Preview Traffic) --- L3(Analysis)
    end

    subgraph "1. Initial State"
        User(Live User Traffic) --> Ingress
        Ingress --> ActiveSvc(Active Service)
        ActiveSvc -- selector: v1 --> BlueRS(ReplicaSet v1)
    end

    subgraph "2. Deploying New Version (v2)"
        direction TB
        subgraph Live Path
            User --> Ingress --> ActiveSvc --> BlueRS
        end
        subgraph Verification Path
            PreviewSvc(Preview Service) -- selector: v2 --> GreenRS(ReplicaSet v2)
            Analysis(Automated AnalysisRun) -- Probes --> PreviewSvc
            Prometheus(Monitoring System) -- Metrics --> Analysis
        end
    end

    subgraph "3. Promotion"
        Analysis -- All tests pass --> Promote[kubectl argo rollouts promote]
        Promote -- Updates Service selector --> ActiveSvc
        ActiveSvc -- now selector: v2 --> GreenRS
        User --> Ingress --> ActiveSvc --> GreenRS
    end

    subgraph "4. Final State"
        User --> Ingress --> ActiveSvc --> GreenRS
        BlueRS -- "Kept for instant rollback, then scaled down" --> Idle
    end

Blue-Green YAML Manifests

To implement this, you need three key resources:

# 1. The Rollout Resource: Defines the strategy and pod template.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-app-bluegreen
spec:
  replicas: 5
  selector: { matchLabels: { app: my-app-bluegreen } }
  template:
    metadata: { labels: { app: my-app-bluegreen } }
    spec:
      containers:
      - name: web
        image: my-registry/my-app:1.1.0 # The new version to deploy
        ports: [{ containerPort: 8080 }]
  strategy:
    blueGreen:
      activeService: my-app-active # Service for live traffic
      previewService: my-app-preview # Service for internal testing
      autoPromotionEnabled: false # Pause the rollout for verification

---
# 2. The Active Service: Your Ingress should point to this.
apiVersion: v1
kind: Service
metadata:
  name: my-app-active
spec:
  ports: [{ port: 80, targetPort: 8080 }]
  selector:
    app: my-app-bluegreen
    # Argo Rollouts adds a rollouts-pod-template-hash selector at runtime to pin this Service to the correct ReplicaSet

---
# 3. The Preview Service: Used for internal testing of the Green version.
apiVersion: v1
kind: Service
metadata:
  name: my-app-preview
spec:
  ports: [{ port: 80, targetPort: 8080 }]
  selector:
    app: my-app-bluegreen
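The manifests above pause before promotion but leave verification to a human. As a sketch of how the AnalysisRun from the diagram could be wired in, the blueGreen strategy accepts prePromotionAnalysis and postPromotionAnalysis blocks, and scaleDownDelaySeconds controls how long the old ReplicaSet is kept after the switch. The template name smoke-test, its service-name argument, and the 300-second delay are assumptions for illustration, and both referenced AnalysisTemplates are assumed to already exist in the cluster.

```yaml
# Fragment of a Rollout spec: drop-in for the strategy block shown above.
strategy:
  blueGreen:
    activeService: my-app-active
    previewService: my-app-preview
    autoPromotionEnabled: false        # Still wait for a manual promote
    prePromotionAnalysis:              # Runs against Green before traffic is switched
      templates:
      - templateName: smoke-test       # Hypothetical AnalysisTemplate
      args:
      - name: service-name             # Hypothetical argument the template would declare
        value: my-app-preview
    postPromotionAnalysis:             # Runs after the active Service points at Green
      templates:
      - templateName: prometheus-error-rate
    scaleDownDelaySeconds: 300         # Keep Blue around for 5 minutes (the default is 30 seconds)
```

If the pre-promotion analysis fails, the active Service is never switched. If the post-promotion analysis fails, the Rollout switches the active Service back to the previous ReplicaSet, which is why keeping Blue alive for a while is useful.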

Strategy 2: Canary Deployment

Best for: Applications where you want to test the new version with a small subset of live production traffic before a full rollout.

This strategy is more nuanced. It allows you to incrementally shift traffic to the new version, observe its behavior, and automatically roll back if key metrics (like error rates or latency) degrade.

Canary Workflow Diagram

graph TD
    subgraph "1. Initial State"
        User(100% Traffic) --> Ingress --> Svc(Service) --> StableRS(ReplicaSet v1)
    end

    subgraph "2. Canary Release (Step 1)"
        User -- 90% --> Svc --> StableRS
        User -- 10% --> Svc --> CanaryRS(ReplicaSet v2)
        Analysis(AnalysisRun) -- Monitors --> CanaryRS
    end

    subgraph "3. Canary Release (Step 2)"
        User -- 75% --> Svc --> StableRS
        User -- 25% --> Svc --> CanaryRS
        Analysis -- Continues Monitoring --> CanaryRS
    end

    subgraph "4. Full Promotion"
        Analysis -- All steps pass --> Promote[kubectl argo rollouts promote]
        User -- 100% --> Svc --> CanaryRS
        StableRS -- "Scaled Down" --> Idle
    end

Canary YAML Manifest

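For a basic canary, you typically only need one Service. Without a traffic router, Argo Rollouts approximates the traffic percentage by adjusting the pod counts of the stable and canary ReplicaSets, so with 10 replicas a weight of 10 means one canary pod behind the same Service. Exact percentages require a traffic router, such as an ingress controller or a service mesh.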
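When the replica ratio is too coarse, the canary strategy can hand the split to a traffic router instead. The fragment below is a sketch using the NGINX ingress integration; the two Service names and the Ingress name are hypothetical and would need to exist in your cluster.

```yaml
# Fragment of a Rollout spec: canary with an explicit traffic router.
strategy:
  canary:
    canaryService: my-app-canary-svc    # Hypothetical Service that selects only canary pods
    stableService: my-app-stable-svc    # Hypothetical Service that selects only stable pods
    trafficRouting:
      nginx:
        stableIngress: my-app-ingress   # Hypothetical Ingress that routes to the stable Service
    steps:
    - setWeight: 10                     # The router sends 10% of requests to the canary
    - pause: { duration: 5m }
```

The manifest that follows sticks to the simpler single-Service, replica-based approach.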

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-app-canary
spec:
  replicas: 10
  selector: { matchLabels: { app: my-app-canary } }
  template:
    metadata: { labels: { app: my-app-canary } }
    spec:
      containers:
      - name: web
        image: my-registry/my-app:1.2.0 # New version
        ports: [{ containerPort: 8080 }]
  strategy:
    canary:
      steps:
      - setWeight: 10 # Shift roughly 10% of traffic (1 of 10 pods) to the new version
      - pause: { duration: 5m } # Pause for 5 minutes to observe
      - analysis:
          templates:
          - templateName: prometheus-error-rate
      - setWeight: 25 # If analysis passes, increase traffic to 25%
      - pause: { duration: 10m }
      - analysis:
          templates:
          - templateName: prometheus-error-rate
      # Final step is implicit full promotion
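The manifest above refers to an AnalysisTemplate named prometheus-error-rate but does not define it. A minimal sketch of what such a template might look like follows. The Prometheus address, the http_requests_total metric, its app and status labels, and the 5% threshold are all assumptions; substitute whatever your services actually export.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: prometheus-error-rate
spec:
  metrics:
  - name: error-rate
    interval: 1m                         # Take one measurement per minute
    count: 5                             # Stop after five measurements
    failureLimit: 1                      # Tolerate at most one failed measurement
    successCondition: result[0] < 0.05   # Assumed threshold: under 5% of requests fail
    provider:
      prometheus:
        # Assumed in-cluster Prometheus address
        address: http://prometheus.monitoring.svc.cluster.local:9090
        # Assumed metric and label names: ratio of 5xx responses to all responses
        query: |
          sum(rate(http_requests_total{app="my-app-canary",status=~"5.."}[5m]))
          /
          sum(rate(http_requests_total{app="my-app-canary"}[5m]))
```

If measurements fail more often than failureLimit allows, the AnalysisRun fails, and the Rollout aborts and shifts traffic back to the stable ReplicaSet.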

Blue-Green vs. Canary: Which to Choose?

Feature	Blue-Green Deployment	Canary Deployment
Concept	Deploy a full new environment, then switch traffic.	Gradually shift traffic to the new version.
Cost	Higher (requires double the resources during deploy).	Lower (only need a few extra pods for the canary).
Risk	Low before cutover (no users see the new version until it is tested), but all users move at once.	Low throughout (small blast radius), but some real users reach the canary early.
Speed	Slower (must wait for full environment provisioning).	Faster (can start testing with a small percentage quickly).
Use Case	Breaking API changes; major infrastructure changes.	Iterative feature releases; performance testing.

Interacting with Rollouts via CLI

The kubectl argo rollouts plugin is essential for managing deployments:

# Get a real-time, color-coded view of a rollout's progress
kubectl argo rollouts get rollout my-app-canary --watch

# Manually promote a paused rollout to the next step
kubectl argo rollouts promote my-app-canary

# Abort a rollout and immediately roll back to the stable version
kubectl argo rollouts abort my-app-canary

The Payoff: Deploy with True Confidence

By adopting Argo Rollouts, you transform deployments from a source of fear into a controlled, data-driven, and ultimately boring process. You gain:

Zero-Downtime Releases: Users are never interrupted.
Instant, Safe Rollbacks: Reverting is an atomic, one-command operation.
Data-Driven Promotions: Releases are approved by hard metrics, not gut feelings.

This allows your teams to ship smaller changes more frequently, accelerating innovation while dramatically improving stability.

The Ultimate Guide to Zero-Downtime Kubernetes Deployments with Argo Rollouts

The Problem with `RollingUpdate`

What is Argo Rollouts?

Strategy 1: Blue-Green Deployment

Blue-Green Workflow Diagram

Blue-Green YAML Manifests

Strategy 2: Canary Deployment

Canary Workflow Diagram

Canary YAML Manifest

Blue-Green vs. Canary: Which to Choose?

Interacting with Rollouts via CLI

The Payoff: Deploy with True Confidence
