Docker, Kubernetes & OpenShift

đŸ—‚ïž Index

| Section | Topic | Details |
| --- | --- | --- |
| 🚀 Docker Essentials | Docker | Introduction to Docker and containerization. |
| | Add a Healthcheck to the Dockerfile | Ensure your containers are running healthy with healthchecks. |
| | Set a Non-Root User | Learn how to configure Docker containers securely with non-root users. |
| | IBM Cloud | Dive into IBM Cloud for containerized app hosting. |
| | Labs (Docker & IBM Cloud) | Practical hands-on labs using Docker and IBM Cloud. |
| | IBM Cloud Container Registry Namespaces | Explore IBM Cloud's container registry namespaces for better organization. |
| ☁️ Kubernetes Fundamentals | Container Orchestration | Learn about container orchestration for managing workloads. |
| | Kubernetes Ecosystem | A broad overview of Kubernetes and its ecosystem. |
| | Kubernetes Objects (High to Low Level) | Understand Kubernetes objects from a high-level perspective down to the details. |
| | Stateful & Stateless Services/Applications | Differentiating between stateful and stateless apps in Kubernetes. |
| | Rollout and Rolling Updates (Deployments) | How to manage rolling updates for applications in Kubernetes. |
| | Why Services? | The importance of services in Kubernetes. |
| | ClusterIP & NodePort Explained | Understanding ClusterIP and NodePort services in Kubernetes. |
| | External Load Balancer & External Name & Ingress Explained | Deep dive into external load balancers, external names, and ingress objects. |
| | Other Objects | Explore other useful Kubernetes objects and resources. |
| | Kubectl | Mastering the Kubernetes command line tool kubectl. |
| | Ingress Object Needs Explicit Ingress Controller Setup | Learn why the Ingress object needs an Ingress Controller to work. |
| | Kubernetes Antipatterns | Common mistakes and antipatterns to avoid in Kubernetes. |
| | Labs (K8s & IBM Cloud) | Hands-on labs using Kubernetes and IBM Cloud for real-world experience. |
| ⚙️ Advanced Kubernetes Techniques | Automated Bin Packing | Learn how automated bin packing optimizes container placement. |
| | Autoscaling | How Kubernetes can autoscale applications based on demand. |
| | Deployment Strategies | Explore different deployment strategies like rolling updates, blue-green, etc. |
| | Rolling Update (In details) | Implementing the Rolling Update Strategy (.yaml) |
| | Practical Demos of Rollout / Rolling Updates | One-at-a-time & All-at-once Rollout/Rollback |
| | ConfigMaps and Secrets | Manage configuration and sensitive data in Kubernetes. |
| | Service Binding | Connect applications to external services with automatic credentials management. |
| 🔮 Red Hat OpenShift 🐳 | Introduction to Red Hat OpenShift | Explain what OpenShift is and list its features; describe the OpenShift CLI, architecture, and components. |
| | OpenShift vs. K8s | Compare OpenShift with Kubernetes. |
| | Hybrid Cloud, Multi-Cloud & Edge Deployments | OpenShift supports hybrid cloud, multi-cloud & edge deployments. |
| | Builds | Learn about Builds, ImageStreams, Build Triggers, the BuildConfig process, build strategies (S2I, Docker, Custom) & build automation. |
| | Operators | Automate repetitive tasks. |
| | Operators, CRDs & Custom Controllers all put together | Discover the Operator Pattern. |
| | Istio | Safe connection & traffic management (Istio Operator via OpenShift). |
| | Labs (Red Hat OpenShift Web Console & IBM Cloud) | Use the OpenShift Web Console, the oc CLI & deploy an application. |

Docker


  • Docker uses namespaces to provide an isolated workspace called a container

  • Docker is written in Go

  • Docker uses Linux kernel features

  • The Docker methodology has inspired further innovation: complementary tools such as the Docker CLI, Docker Compose, and Prometheus; various plugins, including storage plugins; orchestration technologies such as Docker Swarm and Kubernetes; and development methodologies such as microservices and serverless.


Add a Healthcheck to the Dockerfile

To ensure the container is running correctly, define a health check using the HEALTHCHECK instruction. This command checks the application’s health by sending a request to the specified port.

HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 CMD curl -fs http://localhost:$PORT || exit 1

  • HEALTHCHECK: Configures a health check to ensure the container is running correctly.

    • --interval=30s: Specifies the interval between health checks.
    • --timeout=10s: Sets the timeout for each health check.
    • --start-period=5s: Defines the start period during which the container must initialize before health checks begin.
    • --retries=3: Sets the number of retries before considering the container unhealthy.
  • CMD curl -fs http://localhost:$PORT || exit 1: Specifies the command to run for health checks. It checks if a request to http://localhost:$PORT succeeds; otherwise, it exits with code 1, indicating failure.

Set a Non-Root User

For security purposes, set a non-root user with the USER instruction. Here, the user is set to node.

USER node

  • USER: Sets the user that will run the subsequent instructions in the Dockerfile.
  • node: Specifies the user named node to run the commands for security purposes.
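
Putting the two instructions together, here is a minimal Dockerfile sketch for a Node.js app (the base image, port, file layout, and start command are illustrative assumptions, not taken from the labs):

# Illustrative Node.js app image
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .

# Assumed port for the example app
ENV PORT=3000
EXPOSE 3000

# Health check: probe the app over HTTP (curl must be installed in alpine-based images)
RUN apk add --no-cache curl
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 CMD curl -fs http://localhost:$PORT || exit 1

# Drop privileges: the official Node.js images ship with a 'node' user
USER node
CMD ["node", "app.js"]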

IBM Cloud

  • IBM Cloud and Docker:

    • IBM Cloud provides services for deploying and managing Docker containers.
    • Key services include:
      • IBM Cloud Container Registry: Stores and manages Docker images.
      • IBM Cloud Kubernetes Service: Orchestrates Docker containers at scale using Kubernetes.
      • IBM Cloud Foundry: Deploys Docker containers alongside traditional apps.
      • IBM Cloud Functions (Serverless): Runs Docker containers in a serverless architecture.
      • IBM Cloud DevOps Tools: Supports CI/CD for Docker container deployment.
  • IBM Cloud vs Local Docker Use:

    • IBM Cloud is for cloud-based production deployments at scale, similar to AWS or Google Cloud.
    • Local Docker use is for development, while IBM Cloud enables scalability and management in the cloud.
  • IBM Cloud Pricing:

    • Offers free and paid tiers.
    • Free tier includes limited resources for testing and development.
    • Paid plans offer additional resources and features for production deployments.
  • Serverless Computing:

    • Serverless means you don’t manage servers; the cloud provider handles provisioning and scaling.
    • Event-driven: Applications respond to specific events.
    • Pay-per-use: You pay for execution time or requests only.
  • IBM Serverless Services:

    • IBM Cloud Functions is IBM’s serverless service.
    • Built on Apache OpenWhisk, it allows running code in response to events without server management.
    • Pay-per-use pricing based on execution time.
  • Popular serverless services include AWS Lambda, Google Cloud Functions, and Azure Functions.

Labs (Docker & IBM Cloud)

Lab M101 - Docker & IBM Cloud Container Registry

You must obtain an IBM Cloud feature code and activate a trial account.

Creating an IBM Cloud Container Registry Namespace

  • This lab is about setting up a namespace in the IBM Cloud Container Registry. A namespace is essentially a collection of repositories that store container images:

    1. Organization: Namespaces help organize and manage your Docker images within IBM Cloud. This makes it easier to find and manage images.
    2. Access Control: You can set fine-grained access controls for users within your IBM Cloud account, ensuring that only authorized users can access specific images.
    3. Security: IBM Cloud Container Registry provides automatic scanning of images for vulnerabilities, helping you make informed decisions about your deployments.
    4. Integration: The registry is pre-integrated with IBM Cloud Kubernetes Service and Red Hat OpenShift, which accelerates the deployment of your applications.

IBM Cloud Container Registry Namespaces

In the context of IBM Cloud® Container Registry, a namespace is essentially a way to organize and manage your container images. Each namespace can be thought of as a set of related groups of images.

  • Isolation: Each namespace provides a level of isolation, meaning that images within one namespace are separate from those in another. This helps in organizing images based on projects, teams, or environments (e.g., development, staging, production).
  • Tagging and Pushing: When you tag and push an image, the namespace becomes part of the image name, helping to clearly identify and manage images. For example, us.icr.io/<my_namespace>/<my_repo>:<my_tag>.
  • Access Control: You can control access to each namespace, ensuring that only authorized users can push or pull images from it.

Manage Docker container images in a fully managed private registry. Push private images into this registry to run them in IBM Cloud Kubernetes Service and other runtime environments. Images are checked for security issues, so that you can make informed decisions about your deployments.
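
As a rough sketch of the tag-and-push flow (the namespace, image name, and tag are placeholders), assuming the IBM Cloud CLI with the Container Registry plugin is installed:

# One-time: create the namespace and log the local Docker client in to the registry
ibmcloud cr namespace-add my_namespace
ibmcloud cr login

# Tag the local image as <region registry>/<namespace>/<repository>:<tag>, then push it
docker tag my-app:1.0 us.icr.io/my_namespace/my-app:1.0
docker push us.icr.io/my_namespace/my-app:1.0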

Container Orchestration


Kubernetes Ecosystem


Kubernetes objects (From High Level to Low Level)


Labels & Selectors


Namespaces


Pods


Deployment

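A minimal sketch tying these objects together (the names, namespace, and image are illustrative): a Deployment, living in a namespace, uses a label selector to manage the Pods created from its template.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: dev              # Namespaces group and isolate related objects
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app             # Selector: which Pods this Deployment manages
  template:
    metadata:
      labels:
        app: my-app           # Labels attached to every Pod created from this template
    spec:
      containers:
        - name: my-app
          image: nginx:1.25
          ports:
            - containerPort: 80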

Stateful & Stateless Services/Applications

1. Stateless Applications (Deployment)

A stateless application does not store any data or state locally that needs to be persisted across different instances or pod restarts. These applications rely on external storage systems (like databases, object storage, etc.) to store their data, and each instance of the application can operate independently of the others.

  • Deployment: In Kubernetes, a Deployment is used for stateless applications. It manages a set of identical, interchangeable pods, ensuring that a specified number of replicas are running at all times. If one pod crashes, Kubernetes will automatically replace it with another. Since the application does not rely on any internal state, pods can be replaced or scaled without worrying about data consistency or loss.

    Example use case: A web server that does not store any user data locally. The web server may rely on an external database for state, but each instance is independent of the others.

2. Stateful Applications (StatefulSet)

A stateful application requires persistent storage that must be retained even if the pods are destroyed or recreated. These applications rely on maintaining some form of state across restarts, meaning each pod must have a stable identity and persistent storage.

  • StatefulSet: In Kubernetes, a StatefulSet is used for stateful applications. It ensures that the pods are created with unique, stable identities and persistent storage. When a pod is rescheduled (e.g., due to a failure), it retains its identity and data, ensuring that the application can continue functioning without losing its state. StatefulSets also manage persistent volume claims (PVCs) that are tied to specific pods, ensuring the storage is consistent and persistent across pod restarts.

    Example use case: A database like MySQL or Redis, where each instance of the database must store data on a persistent disk, and it needs to be reliably identified (e.g., mysql-0, mysql-1, etc.).

Key Differences:

  • Stateless (Deployment):

    • No need for persistent storage.
    • Pods are interchangeable.
    • Pods can be easily scaled or replaced.
    • Good for applications that don’t need to maintain data across instances.
  • Stateful (StatefulSet):

    • Requires persistent storage (e.g., Persistent Volumes).
    • Pods have stable, unique identities (e.g., pod-0, pod-1).
    • Pods are not interchangeable; they have unique data and configurations.
    • Good for applications that need to maintain state, like databases or message queues.
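
To make the contrast concrete, here is a minimal StatefulSet sketch (names, image, and storage size are illustrative, and details such as database credentials are omitted); note the stable identities (mysql-0, mysql-1) and the per-pod PersistentVolumeClaims:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql               # Headless service that gives each pod a stable DNS name
  replicas: 2
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: mysql:8.0
          ports:
            - containerPort: 3306
          volumeMounts:
            - name: data
              mountPath: /var/lib/mysql
  volumeClaimTemplates:            # One PVC per pod, retained across restarts and rescheduling
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi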

Rollout and Rolling Updates (Deployments)

  • The rolling update process allows you to update your application while maintaining availability.

    • Rolling Update: Gradually replaces old pods with new ones during an update to avoid downtime.
    • Deployment: Manages updates to ensure high availability by controlling how and when pods are updated.
    • Key Parameters:
      • maxSurge: Maximum number of extra pods that can be created above the desired replica count.
      • maxUnavailable: Maximum number of pods that can be unavailable during the update.
    • Health Checks: Kubernetes ensures new pods are healthy and ready before replacing old ones.
    • Update Process:
      1. New pods are created with the updated version.
      2. Old pods are terminated once the new pods are ready.
      3. Kubernetes gradually replaces the old pods, maintaining the desired number of available replicas.
    • Rollback: If a failure occurs, Kubernetes automatically rolls back to the previous version to ensure stability.
    • Benefits: Zero downtime, controlled updates, and safe rollbacks.

Why Services


Native K8s Applications: Containerized, cloud-native, and optimized for Kubernetes.

Non-Native K8s Applications: Legacy or non-containerized apps that may require refactoring or additional work to run effectively in Kubernetes.


ClusterIP & NodePort Explained

ClusterIP

  • Definition: ClusterIP is the default type of service in Kubernetes. It exposes the service only within the Kubernetes cluster.
  • How it Works:
    • It creates an internal IP address that can only be accessed by other services inside the Kubernetes cluster.
    • It cannot be accessed from outside the cluster.
  • Use Case: Ideal for internal communication between services within the Kubernetes cluster (e.g., one microservice talking to another).
  • Access:
    • Accessible by other pods or services inside the cluster.
    • Not accessible from outside the cluster.
  • Example: If you have a database service running with a ClusterIP, only other services inside the cluster can connect to it, but users outside the cluster can't directly access it.

NodePort

  • Definition: NodePort is a service type in Kubernetes that exposes a service on a specific port on each Node (physical or virtual machine) in the cluster.
  • How it Works:
    • It opens a static port on every node in the cluster, which can be used to access the service from outside the cluster.
    • It forwards traffic from the port on the node to the service inside the cluster.
    • NodePort maps to a ClusterIP service internally (meaning it still gets traffic routed via a ClusterIP), but it makes it accessible externally via the node's IP and port.
  • Use Case: Ideal for external access to the service from outside the Kubernetes cluster, but it doesn't have load balancing.
  • Access:
    • You can access the service from outside the cluster by using the <NodeIP>:<NodePort>.
    • Useful for exposing a service to the internet, but not recommended for production because it doesn't scale as well as other types like LoadBalancer.
  • Example: If you expose a web application with NodePort, you can access it by using http://<NodeIP>:<NodePort>, where <NodeIP> is the IP address of any node in the cluster, and <NodePort> is the port you configured.

Key Differences:

  • Scope:

    • ClusterIP: Exposes the service only within the cluster.
    • NodePort: Exposes the service both within the cluster and externally (from outside the cluster).
  • Accessibility:

    • ClusterIP: Can be accessed only by other services within the cluster.
    • NodePort: Can be accessed from outside the cluster using the <NodeIP>:<NodePort>, but only if you know the external IP of the node.
  • Usage:

    • ClusterIP: Suitable for internal communication between services inside the cluster.
    • NodePort: Suitable for exposing services to external traffic for testing or development (but not ideal for production in most cases).

Example Use Case Scenario:

  • ClusterIP: You have an internal database service in your cluster, and you want only your backend applications (within the cluster) to talk to it. You would expose it as a ClusterIP service.

  • NodePort: You have a web server that needs to be accessed from outside the Kubernetes cluster (for example, by users on the internet or from a different network). You would expose this service as a NodePort so it can be accessed using <NodeIP>:<NodePort>.

Visual Example:

| Service Type | Internal Access | External Access | Use Case |
| --- | --- | --- | --- |
| ClusterIP | Yes | No | Internal services communication |
| NodePort | Yes | Yes (<NodeIP>:<NodePort>) | Exposing services externally (usually for development/testing) |

In summary:

  • ClusterIP: Internal service communication within the cluster.
  • NodePort: Exposes a service to external traffic on a specific port of each node in the cluster.
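
As a sketch (service names, labels, and ports are illustrative), the two types differ mainly in the type field and the optional nodePort:

apiVersion: v1
kind: Service
metadata:
  name: my-db                  # ClusterIP (the default): reachable only inside the cluster
spec:
  selector:
    app: my-db
  ports:
    - port: 5432
      targetPort: 5432
---
apiVersion: v1
kind: Service
metadata:
  name: my-web                 # NodePort: reachable from outside at <NodeIP>:30080
spec:
  type: NodePort
  selector:
    app: my-web
  ports:
    - port: 80
      targetPort: 8080
      nodePort: 30080          # Must be within the cluster's NodePort range (30000-32767 by default)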

External Load Balancer & External Name & Ingress Explained

What Experts Typically Do:

  1. Internal Service Communication (Inside the Cluster)

    • Use ClusterIP Service (default):
      • This is the most common approach for internal communication between services within the Kubernetes cluster.
      • Kubernetes automatically load balances traffic between the pods that belong to a service.
      • No external access is involved — it’s just internal traffic between services.

    Example:

    • You have multiple services in your cluster, like api-service, frontend-service.
    • These communicate with each other internally via ClusterIP services, which route traffic inside the cluster.
  2. Expose Services Externally

    • Use LoadBalancer Service (when you need external exposure):
      • For external traffic, experts will often use the LoadBalancer type service, which works with cloud providers' load balancers (like AWS ELB, GCP GLB).
      • When you create a service of type LoadBalancer, Kubernetes automatically requests an external load balancer from your cloud provider.
      • The external load balancer (ELB) exposes the service to the outside world (via a public IP or DNS).
      • Kubernetes internally load balances between the pods of the service.

    Why?:

    • Simple: Using the LoadBalancer service is easy and automatically provisions an external load balancer for you. It’s great for exposing web apps, APIs, etc., to the internet.
    • Cloud Integration: This is the standard way to expose services externally if you’re on a cloud provider (AWS, GCP, Azure, etc.).
  3. Advanced Traffic Management (Path-Based Routing)

    • Use Ingress for complex routing:
      • If you need advanced routing (e.g., different paths or hostnames routing to different services), experts use Ingress.
      • Ingress routes traffic based on path rules (like /api or /frontend) and hostnames (like api.myapp.com or web.myapp.com).
      • It works with Ingress controllers (like NGINX, Traefik, etc.) and can be combined with an external load balancer (ELB).
      • Ingress allows multiple services to be exposed under one external IP and use different rules for routing.

    Why?:

    • More flexibility: If you have multiple services exposed under one domain or IP, Ingress simplifies traffic routing.
    • SSL termination, authentication, and path-based routing can all be managed via Ingress.
  4. When to Use ELB Directly (Without Ingress):

    • Use ELB for simple, single-service exposure where complex routing is not needed.
    • ELB works with the LoadBalancer service and handles traffic distribution to one service (but across all its pod replicas).
    • If you just want to expose one service (like a simple frontend or API service) to the outside world, experts will directly use ELB via the LoadBalancer service.

Expert Workflow (Common Setup):

  1. Internal Communication:
    • Use ClusterIP for internal services.
  2. External Exposure (Simple):
    • Use LoadBalancer service if you want a public IP/DNS to access a service.
  3. Advanced Traffic Routing:
    • Use Ingress if you need path-based or hostname-based routing to multiple services under the same external IP.
    • Combine Ingress with ELB when needed.

TL;DR for Experts:

  • Internal services: Use ClusterIP (Kubernetes handles internal load balancing).
  • Expose single service: Use LoadBalancer for external access (ELB is provisioned automatically).
  • Expose multiple services with path-based routing: Use Ingress for complex routing logic and combine it with LoadBalancer for external access.

Real-World Example:

  1. You have a frontend service and API service.
  2. Use Ingress to route requests to /api to the API service and /frontend to the frontend service (all under one public IP).
  3. Use LoadBalancer (and thus ELB in AWS, for example) to expose the Ingress controller externally.
  4. Kubernetes will load balance traffic to the relevant pods (both frontend and API) internally.
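
A hedged sketch of the external-exposure piece (names and ports are illustrative): a LoadBalancer Service placed in front of the Ingress controller or a single application; on a cloud provider this automatically provisions an external load balancer.

apiVersion: v1
kind: Service
metadata:
  name: frontend-lb
spec:
  type: LoadBalancer           # The cloud provider provisions an external load balancer (e.g., an ELB on AWS)
  selector:
    app: frontend
  ports:
    - port: 80                 # Port exposed by the external load balancer
      targetPort: 8080         # Container port the traffic is forwarded to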

Other Objects


Kubectl

Kubectl Overview

  • Kubectl is the Kubernetes command-line interface (CLI) used to interact with and manage Kubernetes clusters.
  • Command Structure: kubectl [command] [type] [name] [flags]
    • Command: Operation (e.g., create, get, apply, delete)
    • Type: Resource type (e.g., pod, deployment, replica set)
    • Name: Resource name (if applicable)
    • Flags: Special options/modifiers to override defaults

Kubectl Command Types

  1. Imperative Commands

    • Directly create, update, or delete live objects.
    • Advantages:
      • Easy to learn and use.
      • Quick execution for simple tasks (e.g., creating pods).
    • Disadvantages:
      • No audit trail.
      • Limited flexibility (no templates).
      • Not ideal for reproducible or collaborative deployments.
      • Best for development or test environments.
  2. Imperative Object Configuration

    • Use a configuration file (YAML/JSON) to define resources.
    • Advantages:
      • Configuration files can be version-controlled (e.g., using Git).
      • Ideal for environments where consistency is needed.
      • Supports audit trails and change tracking.
    • Disadvantages:
      • Requires understanding object schemas.
      • If updates are missed, discrepancies may occur.
      • Developers need to maintain and update config files manually.
  3. Declarative Object Configuration

    • Preferred for production: Kubernetes automatically manages resource state based on configuration files.
    • Advantages:
      • Automatically ensures that the actual state matches the desired state.
      • No need for manual operation specification.
      • Best for consistency and automation in production systems.
    • Disadvantages:
      • Needs well-maintained templates and configuration files for all involved teams.

Commonly Used Kubectl Commands

  1. Get: Lists resources (e.g., kubectl get pods, kubectl get deployments).
  2. Apply: Creates/updates resources from configuration files (YAML/JSON).
  3. Delete: Deletes resources (e.g., kubectl delete pod <pod_name>).
  4. Scale: Scales the number of replicas for a resource (e.g., kubectl scale deployment nginx --replicas=3).
  5. Autoscale: Automatically scales resources based on demand (e.g., kubectl autoscale deployment nginx --min=1 --max=10 --cpu-percent=50).
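
To make the three approaches concrete, here is a rough sketch using an nginx deployment (the deployment name and file name are illustrative):

# Imperative commands: operate directly on live objects
kubectl create deployment nginx --image=nginx
kubectl scale deployment nginx --replicas=3

# Imperative object configuration: tell kubectl exactly what to do with a file
kubectl create -f nginx-deployment.yaml
kubectl replace -f nginx-deployment.yaml

# Declarative object configuration: kubectl reconciles the live state with the file
kubectl apply -f nginx-deployment.yaml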

Summary:

  • Kubectl is essential for managing Kubernetes clusters.

  • It supports three main command types:

    1. Imperative Commands: Direct operations, great for development but lacks audit trails.
    2. Imperative Object Configuration: Uses configuration files but requires manual updates.
    3. Declarative Object Configuration: Automates operations, ideal for production environments.
  • Common Commands: get, apply, delete, scale, and autoscale are fundamental for managing Kubernetes resources.

Ingress Object needs explicit Ingress Controller setup

1. Ingress Objects

An Ingress resource in Kubernetes defines HTTP or HTTPS routes to services in your cluster. It provides a way for external users to access your services through a single, consolidated endpoint, such as a URL. Ingress objects do not directly manage ports or non-HTTP traffic but instead define rules on how to route HTTP/HTTPS traffic to specific services.

Key Features of Ingress:

  • Routes external HTTP(S) traffic to internal services.
  • Handles SSL/TLS termination (secure traffic handling).
  • Allows name-based virtual hosting (e.g., app.example.com vs api.example.com).
  • Does not manage arbitrary ports or protocols like TCP or UDP. For those, you would typically use other service types such as NodePort or LoadBalancer.

Example of an Ingress Object:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
  namespace: default
spec:
  rules:
  - host: example.com
    http:
      paths:
      - path: /service1
        pathType: Prefix
        backend:
          service:
            name: service1
            port:
              number: 80
      - path: /service2
        pathType: Prefix
        backend:
          service:
            name: service2
            port:
              number: 80
  tls:
  - hosts:
    - example.com
    secretName: example-tls-secret

Explanation:

  • Host (example.com): Defines the external hostname.
  • Paths: Defines the routing paths, where traffic to example.com/service1 goes to service1, and example.com/service2 goes to service2.
  • TLS: The Ingress object can also be used to handle HTTPS traffic by using TLS termination, pointing to a Kubernetes Secret (example-tls-secret) containing SSL certificates.

2. Ingress Controllers

An Ingress controller is the actual resource responsible for fulfilling the Ingress resource’s rules. It reads the Ingress object and configures a load balancer, reverse proxy, or another frontend to manage incoming traffic and route it to the correct service. The Ingress controller needs to be explicitly installed and configured in your cluster.

Without an Ingress controller, Ingress resources do nothing. Popular Ingress controllers include NGINX Ingress Controller, Traefik, and HAProxy Ingress Controller.

Example of deploying an NGINX Ingress Controller:

  1. Install the NGINX Ingress Controller using Helm:

    helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
    helm install ingress-nginx ingress-nginx/ingress-nginx
  2. The Ingress controller starts running: it now listens for Ingress resources and applies the rules they specify (such as traffic routing) when Ingress objects are created in the cluster.

Basic Example of the Role of Ingress Controller:

  • If the Ingress object points to an HTTP route on a service, the NGINX Ingress Controller will configure NGINX to route traffic to the service.
  • It will manage the load balancing and can handle SSL termination if configured.

Kubernetes Antipatterns

  1. Avoid Baking Configuration in Container Images

    • Don’t embed environment-specific configurations in images.
    • Use generic images for consistency across environments.
  2. Separate Application and Infrastructure Deployment

    • Use separate pipelines for infrastructure and application to optimize resource usage.
  3. Eliminate Specific Order in Deployment

    • Avoid relying on fixed startup orders for pods.
    • Prepare for failures and enable simultaneous component initialization.
  4. Set Memory and CPU Limits for Pods

    • Define resource limits to prevent a single app from consuming all cluster resources (a minimal example follows this list).
  5. Avoid Pulling the Latest Tag in Production

    • Use specific image tags for better versioning and easier troubleshooting.
  6. Segregate Production and Non-Production Workloads

    • Use separate clusters for production and non-production environments to reduce complexity and security risks (namespacing).
  7. Refrain from Ad-Hoc Deployments with kubectl Edit/Patch

    • Use Git-based deployments to track changes and avoid configuration drift.
  8. Implement Health Checks with Liveness and Readiness Probes

    • Set up health probes to monitor container health and prevent service disruptions.
  9. Prioritize Secret Handling and Use Vault

    • Avoid embedding secrets in containers; use a consistent secret management strategy like HashiCorp Vault.
  10. Use Controllers and Avoid Running Multiple Processes per Container

    • Use controllers like Deployment, StatefulSet, or Job for managing pods and avoid running multiple processes in a single container.
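
A minimal sketch for antipattern 4 (the values are illustrative, not recommendations): declare both requests and limits on every container.

apiVersion: v1
kind: Pod
metadata:
  name: resource-limited-pod
spec:
  containers:
    - name: app
      image: nginx:1.25
      resources:
        requests:              # What the scheduler reserves for the container
          cpu: "250m"
          memory: "128Mi"
        limits:                # Hard ceiling enforced at runtime
          cpu: "500m"
          memory: "256Mi"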

Labs (K8s & IBM Cloud)

Automated bin packing

  • Automated bin packing in Kubernetes refers to the process of automatically placing containers onto nodes in a cluster based on their resource requirements (such as CPU and memory) and the available resources of the nodes. Kubernetes ensures that containers are placed in such a way that it optimizes resource usage while maintaining high availability and reliability.
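
The scheduler works off the CPU and memory requests declared on each container, placing the pod only on a node with enough unreserved capacity; a rough sketch (names and values are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: batch-worker
spec:
  containers:
    - name: worker
      image: busybox:1.36
      command: ["sleep", "3600"]
      resources:
        requests:
          cpu: "500m"          # The pod is only scheduled onto a node with 0.5 CPU unreserved
          memory: "256Mi"      # ... and 256Mi of memory unreserved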

Autoscaling

Create Autoscaling using kubectl


  • Autoscaling Types

    • HPA (Horizontal Pod Autoscaler): scales the number of pod replicas based on observed load.
    • You can also create an HPA from scratch in YAML (a sketch follows this list).
    • VPA (Vertical Pod Autoscaler): adjusts the CPU and memory requests of existing pods.
    • CA (Cluster Autoscaler): adds or removes worker nodes to match pending workloads.
  • A combination of all three autoscaler types often provides the most optimized solution.
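
A minimal HPA manifest sketch (names and thresholds are illustrative), equivalent to a kubectl autoscale command:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50    # Scale out when average CPU utilization exceeds 50%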

Deployment Strategies

Rolling Update (In details)



Rolling Update Strategy: Parameters, Probes, and Monitoring

To apply a rolling update strategy in your Kubernetes deployment YAML, configure the readiness and liveness probes, and manage the deployment parameters (maxSurge, maxUnavailable, minReadySeconds, and progressDeadlineSeconds).


Key Rolling Update Strategy Parameters in Kubernetes

  1. maxSurge:

    • Definition: Controls the maximum number of pods that can be created above the desired number of pods during a rolling update.
    • Usage: It helps Kubernetes scale up temporarily to handle the update without affecting service availability.
    • Value Type: Can be an absolute number or a percentage.
    • Example: If you have 3 replicas, maxSurge: 1 means Kubernetes can scale up to 4 pods while updating.
    maxSurge: 1
  2. maxUnavailable:

    • Definition: Specifies the maximum number of pods that can be unavailable during the update.
    • Usage: Controls how many pods can be temporarily taken offline during the update, ensuring the rest are available.
    • Value Type: Can be an absolute number or a percentage.
    • Example: If you have 3 replicas, maxUnavailable: 1 means one pod can be unavailable during the update.
    maxUnavailable: 1
  3. minReadySeconds:

    • Definition: Specifies the minimum number of seconds a pod must be in the "Ready" state before it is considered available to serve traffic.
    • Usage: Ensures that pods are stable and running before considering them as available for traffic. This can prevent prematurely routing traffic to pods that might still be starting up.
    • Example: If minReadySeconds: 5, the pod will be considered ready only after being stable for 5 seconds.
    minReadySeconds: 5
  4. progressDeadlineSeconds:

    • Definition: This specifies the maximum time allowed for a deployment's rolling update to complete. If the update doesn't progress within this time frame, Kubernetes will mark the update as failed and trigger a rollback.
    • Usage: It is useful to prevent stuck updates. If the update process takes longer than the specified duration, Kubernetes will automatically stop waiting and initiate a rollback.
    • Value Type: Time in seconds.
    • Example: If progressDeadlineSeconds: 600, Kubernetes will wait up to 10 minutes for the rolling update to complete. If the update takes longer, it will trigger a rollback.
    progressDeadlineSeconds: 600
  5. Readiness Probe:

    • Definition: Indicates if a pod is ready to accept traffic. If the readiness probe fails, Kubernetes will not send traffic to that pod.
    • Usage: This is used to determine when a pod is ready for traffic. It ensures that the application is fully initialized and can handle requests.
    • Example: A simple HTTP probe to check if the app is serving on /healthz.
    readinessProbe:
      httpGet:
        path: /healthz
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 5
      failureThreshold: 3
  6. Liveness Probe:

    • Definition: Indicates if a pod is still running properly. If the liveness probe fails, Kubernetes will restart the pod.
    • Usage: This ensures that a pod is restarted if it becomes unresponsive or crashes.
    • Example: A simple HTTP probe to check if the app is still running and responsive.
    livenessProbe:
      httpGet:
        path: /healthz
        port: 80
      initialDelaySeconds: 15
      periodSeconds: 5
      failureThreshold: 3

Example Scenario for 3 Pods with Rolling Update

Let’s use a 3-pod scenario for better clarity. Here’s how the rolling update would behave in a typical deployment setup with maxSurge, maxUnavailable, minReadySeconds, and progressDeadlineSeconds:

YAML Example for Rolling Update:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3  # Number of pods you want
  progressDeadlineSeconds: 600  # Maximum 10 minutes for the rolling update to complete
  minReadySeconds: 5
  selector:
    matchLabels:
      app: my-app  # Must match the Pod template labels
  strategy:
    type: RollingUpdate  # Defines the strategy to use for updates
    rollingUpdate:
      maxSurge: 1  # Kubernetes can scale up to 4 pods (3 + 1) during the update
      maxUnavailable: 1  # At most, 1 pod can be unavailable at a time
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app-container
        image: my-app-image:latest
        ports:
        - containerPort: 80
        readinessProbe:
          httpGet:
            path: /healthz
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
          failureThreshold: 3
        livenessProbe:
          httpGet:
            path: /healthz
            port: 80
          initialDelaySeconds: 15
          periodSeconds: 5
          failureThreshold: 3

Deployment Update Process:

  1. Initial State:

    • Your deployment has 3 replicas (pods), and they are all running and healthy.
  2. Rolling Update Process:

    • Kubernetes creates one extra pod (maxSurge: 1), so it temporarily scales to 4 pods.
    • The first pod (say pod-0) is replaced first: a new pod running the updated version is started, and once it passes its readiness probe the old pod is terminated.
    • Kubernetes ensures that at least 2 pods are still running, thanks to maxUnavailable: 1.
    • Once pod-0's replacement is healthy, pod-1 is replaced, followed by pod-2.
    • After all pods are updated, Kubernetes scales the number of pods back down to 3.
  3. progressDeadlineSeconds ensures that if the update doesn’t progress within 600 seconds (10 minutes), Kubernetes will roll back the update automatically.


Activating and Monitoring the Rolling Update:

  1. Apply the Deployment YAML: To apply your changes to Kubernetes, use the kubectl command:

    kubectl apply -f my-deployment.yaml
  2. Monitor the Rollout: You can monitor the status of the rollout to track whether the update is progressing correctly:

    kubectl rollout status deployment/my-app

    This will show you the progress of the update, including how many pods are updated and if the rollout is successful or facing any issues.

  3. Check Deployment History: If you need to view the rollout history or need to rollback to a previous version, use the following command:

    kubectl rollout history deployment/my-app
  4. Inspect Pod Status: To check the health of the pods during the update, use:

    kubectl get pods

    This will show whether each pod is in the Running, Pending, or CrashLoopBackOff state.

  5. Rolling Back the Update: If the update fails or causes issues, you can roll back to the previous stable version:

    kubectl rollout undo deployment/my-app

Summary of Parameters and Probes in Rolling Update:

  • Rolling Update Parameters:

    • maxSurge: Max additional pods that can be created during the update (e.g., 1).
    • maxUnavailable: Max number of pods that can be unavailable during the update (e.g., 1).
    • minReadySeconds: Minimum time a pod must be ready before considered healthy (e.g., 5).
    • progressDeadlineSeconds: Maximum time the update process is allowed to take before a rollback is triggered (e.g., 600 for 10 minutes).
  • Probes:

    • Readiness Probe: Ensures the pod is ready to serve traffic (e.g., checks /healthz).
    • Liveness Probe: Ensures the pod is running and alive, otherwise restarts it.

Another Example Scenario (Visual)


Practical Demos of Rollout / Rolling Updates


1. All-at-Once Rollout:

  • Strategy: Recreate
  • Parameters: No additional parameters for surge or availability needed.

Example:

strategy:
  type: Recreate

2. One-at-a-Time Rollout:

  • Strategy: RollingUpdate
  • Parameters:
    • maxSurge: 0 (no extra pods created)
    • maxUnavailable: 1 (only one pod is unavailable at a time)

Example:

strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 0
    maxUnavailable: 1

3. All-at-Once Rollback:

  • Strategy: Recreate
  • Parameters: No additional parameters; the rollback happens immediately.

Command:

kubectl rollout undo deployment/my-app

4. One-at-a-Time Rollback:

  • Strategy: RollingUpdate
  • Parameters:
    • maxSurge: 0 (no extra pods created)
    • maxUnavailable: 1 (one pod rolled back at a time)

Command:

kubectl rollout undo deployment/my-app

Example:

strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 0  # no extra pods created during the rollback
    maxUnavailable: 1

ConfigMaps and Secrets

ConfigMaps


Problem


Solution

  • From a string literal

  • From a properties file

  • From a YAML file

  • You can reference the config data in TS/JS using process.env (a sketch follows this list).
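
A rough sketch of the creation methods and of consuming the values as environment variables (the ConfigMap name, keys, and image are illustrative):

# From string literals
kubectl create configmap app-config --from-literal=LOG_LEVEL=debug --from-literal=GREETING=hello

# From a properties file, or from a YAML manifest
kubectl create configmap app-config --from-file=app.properties
kubectl apply -f app-config.yaml

Referencing the ConfigMap in the container section of a Deployment so the values show up on process.env:

      containers:
        - name: my-app
          image: my-app:1.0
          envFrom:
            - configMapRef:
                name: app-config   # Every key becomes an environment variable (e.g., process.env.LOG_LEVEL)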

Secrets

  • Use with string literals, e.g. kubectl create secret generic api-creds --from-literal=API_CREDS=<value>

  • The stored values are base64-encoded (encoding, not encryption).
  • Reference the secret in the deployment descriptor and read it in TS/JS via process.env.API_CREDS.
  • It can also be referenced in a volume mount (a sketch of both styles follows this list).
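
A hedged sketch of both reference styles (the secret name api-creds and key API_CREDS are illustrative): as an environment variable via secretKeyRef, and as files through a volume mount.

      containers:
        - name: my-app
          image: my-app:1.0
          env:
            - name: API_CREDS
              valueFrom:
                secretKeyRef:
                  name: api-creds        # Name of the Secret
                  key: API_CREDS         # Key inside the Secret; readable as process.env.API_CREDS
          volumeMounts:
            - name: creds
              mountPath: /etc/secrets    # Each key appears as a file under this path
              readOnly: true
      volumes:
        - name: creds
          secret:
            secretName: api-creds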

More ways to create a Secret in k8s

The term generic in the context of Kubernetes refers to a type of secret that can be created to store various kinds of sensitive information. The kubectl create secret generic command creates an Opaque secret, and you can store data in it in various ways.

More ways to create secrets apart from using the --from-literal flag:

1. Using Files (--from-file)

You can create a secret from a file. The file will be read and the content will be stored as the secret.

kubectl create secret generic api-creds --from-file=key=/path/to/your/file

If you have multiple files, you can pass multiple --from-file flags:

kubectl create secret generic api-creds --from-file=key1=/path/to/file1 --from-file=key2=/path/to/file2

2. From Literal Key-Value Pairs (--from-literal)

You can create secrets by providing key-value pairs directly in the command:

kubectl create secret generic api-creds --from-literal=key1=value1 --from-literal=key2=value2

This approach is useful for small secrets, such as API keys or passwords, that you want to hard-code into the command.

3. From Directory (--from-file)

You can create secrets from all the files within a directory. Kubernetes will treat each file as a key and its contents as the secret value.

kubectl create secret generic api-creds --from-file=/path/to/directory

This will create a secret where each file in the directory is treated as a separate key-value pair. For example, if a directory has two files key1.txt and key2.txt, the secret will contain two keys (key1 and key2) with the respective file contents.

4. From Environment Variables (--from-env-file)

You can create secrets from an environment file (a .env file). Each line in the .env file should be a key-value pair.

Example .env file:

key1=value1
key2=value2

Then use this command to create a secret:

kubectl create secret generic api-creds --from-env-file=/path/to/envfile

This will create a secret with each environment variable as a key, and its value will be the corresponding value in the .env file.

5. Creating a Secret from a Secret Manifest File

You can manually create a YAML manifest for your secret and apply it using kubectl apply.

apiVersion: v1
kind: Secret
metadata:
  name: api-creds
type: Opaque # The default Secret type: it holds arbitrary, user-defined data (as opposed to specific types such as Docker credentials or TLS certificates).
data:
  key1: c2VjcmV0dmFsdWUx  # Base64 encoded 'secretvalue1'
  key2: c2VjcmV0dmFsdWUy  # Base64 encoded 'secretvalue2'

Note that the values for key1 and key2 are base64 encoded. You can encode data using base64:

echo -n 'secretvalue1' | base64

Once your YAML is ready, apply it:

kubectl apply -f secret.yaml

  • Kubernetes provides several other types of secrets besides Opaque, which have specific use cases, such as:

    • kubernetes.io/dockerconfigjson: Used for storing Docker registry credentials.
    • kubernetes.io/tls: Used for storing TLS certificates and private keys.
    • kubernetes.io/basic-auth: Used for storing basic authentication credentials.
    • kubernetes.io/service-account-token: Used for storing service account tokens.

Service Binding

  • Service Binding: Connects your app to external services (e.g., databases, APIs).
  • Function: Acts as a bridge for communication between your app and external services.
  • Security: Keeps sensitive information (e.g., passwords) safe and secure.
  • Automation: Manages the configuration and credentials needed to access the services automatically.
  • Benefit: Allows developers to focus on building applications without handling service details.


IBM Service Binding

Steps Overview:

  1. Provision the IBM Cloud Service Instance: Create an instance of the IBM Cloud service (in this case, the Tone Analyzer service).
  2. Bind the Service to the Kubernetes Cluster: Use the IBM Cloud CLI to bind the service to your Kubernetes cluster, creating service credentials stored in a Kubernetes secret.
  3. Store the Service Credentials in a Kubernetes Secret: Automatically create a Kubernetes secret to hold the base64-encoded credentials for the service instance.
  4. Configure the Application to Access the Credentials: Either mount the secret as a volume or use environment variables to access the credentials in your application.

Detailed Explanation and Code

Step 1: Provisioning the IBM Cloud Service Instance

First, you need to create an instance of the service using the IBM Cloud CLI (or the IBM Cloud console).

Using IBM Cloud CLI:

ibmcloud login
ibmcloud target --cf
ibmcloud resource service-instance-create tone-analyzer-instance lite "tone-analyzer" -p <your-resource-group>
  • tone-analyzer-instance: This is the name of the service instance you're creating.
  • lite: The pricing plan for the service.
  • tone-analyzer: The name of the service type.
  • <your-resource-group>: Your resource group on IBM Cloud.

Alternatively, you can provision the service directly through the IBM Cloud catalog by visiting the IBM Cloud console.

Step 2: Binding the Service to Your Cluster

After provisioning the service instance, you need to bind the service to your Kubernetes cluster. This action will automatically create the credentials in a Kubernetes secret.

Bind the service instance to the Kubernetes cluster:

ibmcloud ks service bind --cluster <cluster-name> --service-name tone-analyzer-instance
  • This will create a Kubernetes secret with the service credentials, including the API key and other connection details, in base64 encoded JSON format.

Step 3: Verifying the Secret Object

To verify the secret that has been created, you can use the following command to check for all secrets in your Kubernetes cluster:

kubectl get secrets

You can also access the secrets through the Kubernetes Dashboard UI or through the IBM Cloud Kubernetes service interface.

Step 4: Accessing the Secret in Your Application

There are two primary ways to access the secret data within your application:

  1. Mount the Secret as a Volume:

    • This will create a file in your Pod with the credentials.

    Kubernetes Deployment YAML (mount the secret as a volume):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: tone-analyzer-app
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: tone-analyzer-app
      template:
        metadata:
          labels:
            app: tone-analyzer-app
        spec:
          containers:
            - name: tone-analyzer-container
              image: node:14
              volumeMounts:
                - mountPath: /app/secrets
                  name: service-credentials
          volumes:
            - name: service-credentials
              secret:
                secretName: tone-analyzer-instance-credentials

    In this example:

    • The secret tone-analyzer-instance-credentials is mounted at /app/secrets inside the container.
    • This means that the binding credentials will be accessible in a file inside the container at that path.
  2. Reference the Secret in Environment Variables:

    • If you'd prefer to use environment variables, you can reference the credentials directly in your app.

    Kubernetes Deployment YAML (use environment variables):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: tone-analyzer-app
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: tone-analyzer-app
      template:
        metadata:
          labels:
            app: tone-analyzer-app
        spec:
          containers:
            - name: tone-analyzer-container
              image: node:14
              envFrom:
                - secretRef:
                    name: tone-analyzer-instance-credentials

    This will set environment variables in your container based on the keys stored in the secret.

    The environment variables for the credentials will typically look like:

    • BINDING_API_KEY
    • BINDING_USERNAME
    • BINDING_PASSWORD

    These environment variables will be automatically available in the container and can be used in your application code.

Example: Node.js Application Using the Credentials

A simple example of how to use the environment variables for the IBM Watson Tone Analyzer service in a Node.js application using the Express.js framework.

// Import required modules
const express = require('express');
const axios = require('axios');

// Initialize the Express app
const app = express();
const port = 3000;

// Parse JSON request bodies so req.body.text is available
app.use(express.json());

// Retrieve credentials from environment variables
const apiKey = process.env.BINDING_API_KEY;  // Watson Tone Analyzer API key
const username = process.env.BINDING_USERNAME;  // Watson Tone Analyzer username
const password = process.env.BINDING_PASSWORD;  // Watson Tone Analyzer password

// Setup route to analyze text tone
app.post('/analyze', async (req, res) => {
  const text = req.body.text;

  try {
    // Call the Watson Tone Analyzer API
    const response = await axios.post('https://api.us-south.tone-analyzer.watson.cloud.ibm.com/instances/{instance_id}/v3/tone',
    { text: text },
    {
      auth: {
        username: username,
        password: password
      }
    });

    // Return the tone analysis results
    res.json(response.data);
  } catch (error) {
    res.status(500).send('Error analyzing tone: ' + error.message);
  }
});

// Start the app
app.listen(port, () => {
  console.log(`Tone Analyzer app listening at http://localhost:${port}`);
});

Explanation of the Code:

  • Express.js: A lightweight web framework for Node.js.
  • Axios: A promise-based HTTP client for making API calls.
  • The application reads BINDING_API_KEY, BINDING_USERNAME, and BINDING_PASSWORD environment variables set by Kubernetes.
  • The /analyze endpoint accepts text, sends it to the Watson Tone Analyzer API, and returns the analysis result.

Final Steps: Deploying to Kubernetes

To deploy the Node.js application to Kubernetes:

  1. Build your Docker image:

    • Create a Dockerfile for your Node.js app.
    # Use official Node.js image as base
    FROM node:14

    # Set working directory
    WORKDIR /app

    # Copy app files
    COPY . .

    # Install dependencies
    RUN npm install

    # Expose port for the app
    EXPOSE 3000

    # Start the app
    CMD ["node", "app.js"]
  2. Build and push the Docker image:

    docker build -t <your-image-name> .
    docker push <your-image-name>
  3. Deploy the application to Kubernetes:

    kubectl apply -f deployment.yaml
    kubectl apply -f service.yaml
    • deployment.yaml is the Kubernetes deployment configuration that defines your app.
    • service.yaml exposes your app so it can be accessed externally.

This process involves:

  • Provisioning and Binding an IBM Cloud service.
  • Storing credentials in Kubernetes secrets.
  • Accessing those credentials either through volume mounts or environment variables.
  • Using the credentials in your application to interact with the IBM Cloud service (e.g., Watson Tone Analyzer).

Labs (Scaling, ConfigMaps & Secrets using ICR)

Introduction to Red Hat OpenShift

  • OpenShift is an enterprise Kubernetes platform developed by Red Hat. It is built on top of Kubernetes and adds additional features and tools that make it easier to deploy, manage, and scale applications in a containerized environment.

OpenShift Features


OpenShift Platform Architecture

  • OpenShift runs on top of a Kubernetes cluster, with object data stored in the etcd key-value store. It has a microservices-based architecture.

OpenShift Components


OpenShift CLI

  • OpenShift's CLI tool, oc, wraps kubectl and adds OpenShift-specific functionality on top of it.
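
A few representative oc commands (the cluster URL, project name, and repository are illustrative); oc also accepts the familiar kubectl-style verbs:

oc login https://api.my-cluster.example.com:6443        # Authenticate against the cluster
oc new-project demo                                      # Create a project (a namespace with extra annotations)
oc new-app nodejs~https://github.com/example/my-app.git  # Build and deploy directly from source (S2I)
oc get routes                                            # Work with OpenShift-specific resources such as Routes
oc status                                                # Summarize what is deployed in the current project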

OpenShift vs. K8s


Hybrid Cloud, Multi-Cloud & Edge Deployments

Hybrid Cloud in OpenShift

When we talk about Hybrid Cloud in OpenShift, we're referring to the ability to run workloads both in on-premises data centers and in the public cloud. OpenShift allows you to create a unified infrastructure where you can manage and orchestrate your containers regardless of whether they’re running locally or in the cloud. This is important because it gives businesses the flexibility to leverage existing on-prem resources while also tapping into the benefits of cloud scalability and elasticity. With OpenShift, you can seamlessly move workloads between on-prem and cloud environments, depending on factors like cost, compliance, or performance needs. This flexibility helps organizations balance between the security and control of on-premises infrastructure and the innovation and scalability of the cloud. Moreover, OpenShift’s integrated management tools allow you to centrally manage both environments with the same control plane, simplifying operations and improving visibility across your hybrid infrastructure.

Multi-Cloud in OpenShift

Multi-Cloud environments take this flexibility one step further by allowing OpenShift to run on multiple public cloud platforms, such as AWS, Azure, Google Cloud, or others. The key benefit of a multi-cloud setup is the ability to avoid vendor lock-in. Organizations can take advantage of specific features or pricing from different cloud providers without being tied to one. For example, you might run a particular workload in AWS due to its storage options, while running another in Azure for better integration with other services. OpenShift enables you to manage multiple clusters across different clouds using a single control plane, which simplifies orchestration. You don’t have to worry about managing different sets of tools for each cloud; OpenShift handles the complexity of distributing workloads across different cloud environments, optimizing resource allocation, and ensuring consistent application deployment. This multi-cloud approach increases redundancy and availability, ensuring that your applications can run smoothly across multiple regions or providers.

Edge Deployments in OpenShift

Edge Computing is an area where OpenShift also shines. Edge deployments refer to running applications and processing data closer to where it is generated, such as in remote locations, industrial facilities, or even on IoT devices. This is particularly useful in situations where low latency or real-time processing is required, and where relying on centralized data centers or cloud resources is not feasible due to connectivity limitations. OpenShift supports running lightweight clusters at the edge, designed to operate even with limited resources, which is a key feature when deploying in such environments. Even though these edge clusters can operate autonomously, OpenShift still provides the ability to centrally manage and monitor them, ensuring that all clusters—whether at the edge, in the cloud, or on-premises—are controlled from a single interface. This makes it easier to ensure consistency, security, and updates across your infrastructure. Edge deployments are especially useful in industries like manufacturing, retail, and telecommunications, where immediate decision-making and local processing are crucial for operational efficiency and responsiveness.

Builds

A build is the process of transforming inputs into a resultant object, for example, transforming source code to a container image. A build requires a build configuration file or build config, which defines the build strategy and input sources. Commonly used build strategies are source-to-image, or S2I, Docker, and custom.
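
A hedged BuildConfig sketch using the S2I strategy (the repository, builder image, and ImageStream names are illustrative):

apiVersion: build.openshift.io/v1
kind: BuildConfig
metadata:
  name: my-app
spec:
  source:
    type: Git
    git:
      uri: https://github.com/example/my-app.git    # Input: application source code
  strategy:
    type: Source                                    # S2I build strategy
    sourceStrategy:
      from:
        kind: ImageStreamTag
        name: nodejs:18                             # Builder image used to assemble the application
  output:
    to:
      kind: ImageStreamTag
      name: my-app:latest                           # The resulting image is pushed to this ImageStream tag
  triggers:
    - type: ConfigChange                            # Rebuild when this BuildConfig changes
    - type: ImageChange                             # Rebuild when the builder image is updated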

ImageStreams

  • In OpenShift:

    • ImageStreams are a powerful concept that help you manage and track container images in a more flexible, versioned, and consistent way. They are one of OpenShift's unique features that provides benefits beyond what Docker and Kubernetes offer for handling container images.

    • An ImageStream is a set of tags that are associated with container images. It allows OpenShift to track images over time, manage their updates, and create a higher-level abstraction than just managing images by their tags. An ImageStream can reference images from external container registries (such as Docker Hub, Quay.io, or a private registry), as well as local images within your OpenShift cluster.

    • ImageStreams provide a way to version and control your images in a more structured and manageable way, enabling you to easily track image changes and trigger automatic deployments.

Key Concepts of ImageStreams

  1. ImageStream Tags:

    • A tag in an ImageStream is a named reference to a specific image (like v1, latest, or prod).
    • You can define different tags for the same ImageStream, pointing to different versions of a container image (for example, v1, v2, or latest).
    • ImageStream tags help you track and manage different versions of your container images.
  2. ImageStream Repositories:

    • An ImageStream repository refers to the collection of all tagged versions of images within a specific ImageStream.
    • It could include references to external images (from Docker Hub, private registries) as well as internal OpenShift-generated images.
  3. ImageStream Import:

    • OpenShift allows you to import images into an ImageStream. This can be done manually or automatically.
    • When you import an image, OpenShift fetches the image from an external registry and stores it locally, making it part of the ImageStream. This means you can use that image within your OpenShift cluster and refer to it by its ImageStream tag.
  4. ImageStream Annotations:

    • ImageStreams support annotations, which are additional metadata that can be used to attach extra information to the image stream, such as build details, tags, etc.
  5. ImageStream Imports and Triggers:

    • OpenShift can be configured to automatically import a new image from an external registry when the image is updated. This is helpful in automating updates to your applications, so they are always running the latest version of a container image.
    • You can also use ImageStreams to trigger builds in OpenShift. For instance, when a new image is pushed to an ImageStream, it could trigger a deployment or another automated process, like a Continuous Deployment (CD) pipeline.

Why Use ImageStreams in OpenShift?

  • Versioning and Tagging: ImageStreams allow you to version your container images. When you tag a specific version of an image (e.g., v1, v2), OpenShift can track these versions and use them to manage deployments, ensuring that your application always runs the intended version.

  • Centralized Image Management: Instead of tracking container images by their image digest or tag, ImageStreams give you a centralized view of images in your OpenShift cluster. You can easily manage different versions of the same application and create a more structured deployment pipeline.

  • Integration with OpenShift Builds: ImageStreams are tightly integrated with OpenShift’s build system. You can configure OpenShift to automatically pull the latest image from an ImageStream whenever a new version is available, allowing for continuous integration and continuous deployment (CI/CD) pipelines.

  • Auto-Triggering: When an ImageStream tag is updated (e.g., a new image is pushed to it), OpenShift can automatically trigger various actions such as a deployment or build. This makes it easier to automatically deploy new application versions when the base image changes.

  • Flexibility with External Registries: ImageStreams are not just for images stored within OpenShift's internal registry. You can link ImageStreams to external container registries (like Docker Hub, AWS ECR, etc.), which means you can still use external image sources but benefit from OpenShift’s management and automation.

How to Use ImageStreams in OpenShift: Practical Examples

  1. Creating an ImageStream

You can create an ImageStream using either the OpenShift CLI (oc) or the OpenShift web console.

Using the CLI:

    oc create imagestream my-app

This creates an empty ImageStream named my-app. You can then import images into it, or create tags that point to different versions of images.
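
For reference, the declarative equivalent is a minimal ImageStream manifest (a sketch; apply it with oc apply -f):

```yaml
# Minimal ImageStream; equivalent to `oc create imagestream my-app`
apiVersion: image.openshift.io/v1
kind: ImageStream
metadata:
  name: my-app
```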

  2. Importing an External Image

You can import an image from an external registry (like Docker Hub) into your ImageStream:

    oc import-image my-app --from=docker.io/namespace/my-app:latest --confirm

This command pulls the image my-app:latest from Docker Hub and imports it into the my-app ImageStream in OpenShift.
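
The same result can be expressed declaratively by adding a tag to the ImageStream spec (a sketch reusing the docker.io/namespace/my-app image from the command above; importPolicy.scheduled asks OpenShift to re-import the tag periodically):

```yaml
apiVersion: image.openshift.io/v1
kind: ImageStream
metadata:
  name: my-app
spec:
  tags:
    - name: latest
      from:
        kind: DockerImage
        name: docker.io/namespace/my-app:latest
      importPolicy:
        scheduled: true      # periodically re-import to pick up upstream updates
      referencePolicy:
        type: Source         # reference the external image directly (Local pulls it through the internal registry)
```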

  3. Viewing an ImageStream

You can view the tags associated with an ImageStream and the images it tracks:

    oc get is my-app

This command will show the ImageStream's tags and the images associated with them.

  4. Triggering a Deployment from an ImageStream Update

To trigger an automatic deployment when an image in an ImageStream is updated, you can link the ImageStream to a DeploymentConfig and set up triggers.

DeploymentConfig is a resource that exists only in OpenShift, not in Kubernetes.

An example of a deployment configuration with an ImageStream trigger:

    apiVersion: apps.openshift.io/v1   # DeploymentConfig is an OpenShift API, not apps/v1
    kind: DeploymentConfig
    metadata:
      name: my-app
    spec:
      replicas: 1
      selector:
        app: my-app
      triggers:
        - type: ImageChange
          imageChangeParams:
            automatic: true
            containerNames:
              - my-app                 # container whose image the trigger updates
            from:
              kind: ImageStreamTag
              name: my-app:latest
      template:
        metadata:
          labels:
            app: my-app
        spec:
          containers:
            - name: my-app
              image: "my-app:latest"
              ports:
                - containerPort: 8080

This configuration ensures that whenever the my-app:latest image in the ImageStream is updated, the deployment will automatically trigger and update the running application.
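
On recent OpenShift versions, a comparable image trigger can also be attached to a standard Kubernetes Deployment via the image.openshift.io/triggers annotation (normally added for you by oc set triggers). The sketch below is hedged: verify the exact annotation format against your cluster's documentation, and note that the container image value is a placeholder the trigger overwrites.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  annotations:
    # JSON list of triggers; fieldPath selects the container image field to patch
    image.openshift.io/triggers: |-
      [{"from":{"kind":"ImageStreamTag","name":"my-app:latest"},"fieldPath":"spec.template.spec.containers[?(@.name==\"my-app\")].image"}]
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-app:latest   # placeholder; the trigger rewrites this with the resolved image
          ports:
            - containerPort: 8080
```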

  5. Working with ImageStream Tags

You can manage tags on your ImageStream to reference different versions of an image. For example, you might have a v1, v2, and latest tag in the same ImageStream.

To update a tag manually:

    oc tag my-app:v1 my-app:latest

This would point the latest tag to the same image as v1.

ImageStream as a Trigger for Builds: OpenShift can trigger builds whenever a new image is imported or pushed into an ImageStream. For instance, you might set up a pipeline so that whenever a new base image is pushed to an ImageStream, it triggers a new build of your application based on that image.

ImageStream Imports and External Sources: ImageStreams can track images from external sources like Docker Hub or private registries. For example, an ImageStream can monitor a specific tag from Docker Hub, and when that tag is updated, it can trigger an internal deployment in your OpenShift environment.

In OpenShift, ImageStreams provide a high-level abstraction over container images, offering versioning, automated workflows, and better management capabilities than working with raw images and tags directly. They give you a centralized and streamlined way to track images, automatically update deployments, and create robust CI/CD pipelines. Whether you're importing external images or managing internal ones, ImageStreams are a critical tool for any OpenShift-based application lifecycle management.

Build Triggers

alt text

See more about --> WebHooks
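
As an illustration of webhook-driven builds (a hedged excerpt, not a full manifest; the secret names are placeholders, and older clusters use an inline secret field instead of secretReference), the spec.triggers portion of a BuildConfig might look like this:

```yaml
triggers:
  - type: GitHub                      # fired when GitHub calls the build's webhook URL
    github:
      secretReference:
        name: my-github-webhook-secret
  - type: Generic                     # generic webhook, callable from any CI system
    generic:
      secretReference:
        name: my-generic-webhook-secret
```

Once the BuildConfig exists, the webhook URLs themselves can be read from oc describe bc/<name>.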

BuildConfig: Step-by-Step process

Key Points:

  1. Triggers for Builds: A configuration change trigger is one of the ways to automatically trigger a new build in OpenShift when a BuildConfig resource is created or changed. This can be configured to run when certain actions, like a change in the configuration, occur.

  2. BuildConfig Resource:

    • A BuildConfig in OpenShift defines the build process for creating container images. It specifies the input source (like a Git repository or Dockerfile) and the build strategy (like Source-to-Image (S2I) or Docker).
  3. Build Strategies: OpenShift offers several build strategies:

    • Source-to-Image (S2I): OpenShift provides pre-built images that can take source code and automatically produce a ready-to-run image without requiring a Dockerfile. For example, building a Ruby application using a pre-built Ruby image (like ruby-20-centos7). alt text
    • Docker: Uses a Dockerfile to build an image using Docker commands.
    • Custom: You define your own build strategy and the necessary tools, useful for specific custom workflows like CI/CD pipelines.
  4. Triggers: Triggers are used to start the build automatically when a certain event occurs. The common triggers in OpenShift are:

    • Image Change Trigger: Starts a new build when an image referenced in the ImageStream is updated.
    • Configuration Change Trigger: Starts a new build when the BuildConfig resource itself is updated (i.e., a change in the BuildConfig definition).
    • GitHub Webhook: A webhook can trigger a build when code changes are pushed to a repository.
  5. Post-Commit Hook: An optional section in the BuildConfig where you can define actions (like deploying to another environment) that happen after the build finishes.

Sample BuildConfig File Explanation:

  • An example BuildConfig YAML configuration for a Ruby application with a Source-to-Image (S2I) strategy:
    apiVersion: build.openshift.io/v1
    kind: BuildConfig
    metadata:
      name: ruby-sample-build
    spec:
      runPolicy: Serial # Options: 'Serial' (default), 'SerialLatestOnly', 'Parallel'
      triggers:
        - type: ConfigChange # Triggers the build when the BuildConfig changes
        - type: ImageChange  # Triggers the build when the builder image is updated
          imageChange: {}    # Empty: watch the strategy's builder image (ruby-20-centos7:latest)
      source:
        git:
          uri: 'https://github.com/example/ruby-sample.git'
          ref: master # Git branch to build from
        type: Git # Defines the type of source (Git, Binary, Dockerfile, etc.)
      strategy:
        type: Source
        sourceStrategy:
          from:
            kind: ImageStreamTag
            name: ruby-20-centos7:latest # Base builder image
      output:
        to:
          kind: ImageStreamTag
          name: ruby-sample-app:latest
      postCommit:
        script: "echo 'Post commit action'"

Explanation of the Configuration:

  1. runPolicy:

    ‱ Serial (the default): Builds run one at a time; new builds wait in a queue until the current build finishes.
    ‱ SerialLatestOnly: Builds run one at a time, but queued intermediate builds are skipped so only the latest runs.
    ‱ Parallel: Allows multiple builds to run simultaneously.
  2. Triggers:

    • ConfigChange: The build is triggered whenever the BuildConfig itself changes. For example, if you modify the BuildConfig YAML file and apply it, OpenShift will trigger a new build based on the updated configuration.
    • ImageChange: The build is triggered whenever a change in the source image (like ruby-20-centos7) occurs. For example, if OpenShift detects that a new version of ruby-20-centos7 is available in the image stream, it will trigger a new build using that updated image.
  3. Source:

    • The source section defines where the build gets its input. In this case, it's pulling source code from a Git repository (https://github.com/example/ruby-sample.git) and the master branch.
    • The source type is Git, meaning the build process will pull the latest code from the repository and build it.
  4. Strategy:

    • This example uses the Source Strategy with Source-to-Image (S2I). The ruby-20-centos7 image is a pre-configured S2I builder image for Ruby, which takes the source code and creates a container image that runs the Ruby application.
    • This is different from a Docker strategy, where you would define a custom Dockerfile to build your image.
  5. Output:

    • The built image is pushed to an ImageStream named ruby-sample-app:latest. This ImageStream will store the newly built container image and be available for use by other services or deployments in OpenShift.
  6. PostCommit:

    • This section allows you to define actions that occur after the build is completed. In this case, it simply prints "Post commit action" as a placeholder.
    ‱ You can add more complex actions here, such as deploying the built image to a development or staging environment (alternative postCommit forms are sketched after this list).
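
As referenced above, the hook is not limited to an inline script; it can also run an explicit command in a temporary container created from the freshly built image (a sketch; the test command is a placeholder):

```yaml
# Alternative postCommit forms within a BuildConfig spec
postCommit:
  command: ["bundle", "exec", "rake", "test"]   # runs in a throwaway container from the built image
# or append arguments to the image's default entrypoint instead:
# postCommit:
#   args: ["--verbose"]
```

If the hook fails, the build is marked as failed and the resulting image is not pushed to the output ImageStream.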

Practical Example of How It Works:

  1. Creating the BuildConfig: You create the BuildConfig by applying the YAML configuration:

    oc apply -f ruby-sample-build-config.yaml
  2. Triggering a Build:

    ‱ Configuration Change: If you make any changes to the BuildConfig (e.g., change the Git repository URL or the build strategy), OpenShift will automatically trigger a new build due to the ConfigChange trigger. You can check for this by running:
      oc get builds
    ‱ Image Change: If OpenShift detects a new version of the base image ruby-20-centos7 in the image stream, it will trigger a new build. For example, when a new version of ruby-20-centos7 is pushed to the registry, it automatically triggers the build as defined by the ImageChange trigger. (Builds can also be started manually; see the example after this list.)
  3. Build Execution:

    • OpenShift will pull the source code from the Git repository, use the ruby-20-centos7 builder image to build the Ruby application, and create a container image with that code.
    • The image will be stored in the ImageStream (ruby-sample-app:latest), ready to be used.
  4. Deploying the Image:

    • Once the build is complete, you can create a DeploymentConfig or other resources to deploy the new image to a pod or a service in OpenShift.

    Example:

    oc new-app ruby-sample-app:latest
  5. Post-Commit Actions:

    • Any actions specified in the postCommit section (like scripts or commands) will be executed after the build process completes.
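
As referenced earlier, builds can also be started by hand rather than waiting for a trigger, for example using the BuildConfig defined above:

```bash
# Start a build from the BuildConfig and stream its logs
oc start-build ruby-sample-build --follow

# List all builds and their current status
oc get builds
```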

Conclusion:

  • Triggers in OpenShift, like Configuration Change and Image Change, allow you to automate builds based on changes in the BuildConfig resource or changes in the underlying images referenced in ImageStreams.
  • BuildConfig defines how to build the application, what source to use (Git, Dockerfile, etc.), and which strategy to apply (Source-to-Image, Docker, or Custom).
  • The S2I strategy in OpenShift simplifies the process of building a container image directly from source code without needing to write a Dockerfile.
  • After the build is completed, the container image is pushed to the internal OpenShift registry, and you can deploy it to your environment.

Operators (The coolest feature so far)

alt text

Why Operators?

alt text

Service Brokers vs. Operators

alt text

Custom Controllers

alt text

Operator Framework

alt text

Operators, CRDs & Custom Controllers all put together

1. What is an Operator?

An Operator is a method of packaging, deploying, and managing Kubernetes applications. It automates tasks that would otherwise need human intervention, such as installation, scaling, upgrades, and monitoring of an application. Operators run as custom controllers within Kubernetes and use Custom Resource Definitions (CRDs) to manage application-specific resources.

  • Purpose: Operators replicate the behavior of a human operator by automating the management of applications.
  • Key Role: Operators monitor the cluster’s state and automatically make adjustments to maintain the desired state, such as deploying resources, scaling, updating, or handling failures.

2. What are CRDs (Custom Resource Definitions)?

A CRD is a way to extend Kubernetes beyond its built-in resources (like Pods, Deployments, and Services). With a CRD, you can define your own custom resource types that Kubernetes doesn’t have by default. CRDs allow you to store and retrieve application-specific objects, essentially extending the Kubernetes API.

  • Use Case: If you have an application like a database that needs specific settings (e.g., backup schedules, cluster sizes), you can define a Custom Resource for that application using a CRD. This resource becomes part of Kubernetes and is available for management with kubectl like any built-in resource.
  • Example: If you want a MySQL database to be managed in Kubernetes, you could define a MySQLCluster CRD that describes a MySQL instance and its configurations (like replication, backups, etc.).

3. What are Custom Controllers?

A Custom Controller in Kubernetes is responsible for managing the lifecycle of your custom resources (those defined by CRDs). It reconciles the actual state of the cluster with the desired state as described in the CRD.

  • Role: The custom controller watches for changes in the CRD and automatically takes actions to ensure the cluster matches the configuration specified. If the state of the application or infrastructure diverges from the desired state, the controller automatically adjusts it (e.g., scaling the number of pods or creating new resources).

  • Example: A custom controller for a MySQLCluster CRD would:

    1. Watch for new MySQLCluster resources being created.
    2. Create the necessary Pods, Services, and ConfigMaps as described in the custom resource.
    3. Ensure the cluster's state (e.g., number of replicas, database readiness) matches the desired state defined in the CRD.

4. How Operators, CRDs, and Custom Controllers Work Together:

  • Operator Pattern: The Operator is essentially a combination of CRDs and Custom Controllers that automate tasks for complex, stateful applications. The operator pattern allows you to describe the desired state of your application (via a CRD), and the custom controller makes sure the actual state of the system is aligned with that desired state.

Flow:

  1. Define CRD: A CRD defines the structure for a custom resource, like MySQLCluster, which specifies parameters for your app (e.g., database version, number of replicas).
  2. Create Custom Controller: The controller watches the MySQLCluster resources, ensuring that the correct number of MySQL instances are running, and handles tasks like scaling or upgrading the database.
  3. Operator Automates Management: The operator’s job is to take this a step further by automating all aspects of managing the MySQL cluster, including:
    • Deployment: Automatically creating and managing resources like Deployments, ConfigMaps, and Persistent Volumes.
    • Scaling: Automatically adjusting the number of MySQL replicas based on load or other criteria.
    • Upgrades: Managing software upgrades or patches to the MySQL instances with minimal downtime.

5. Practical Example: Deploying a MySQL Operator

Let's look at how this works with a MySQL operator, which automates the deployment and management of MySQL clusters on Kubernetes:

  • Step 1: Define a CRD for MySQLCluster: This CRD defines what a MySQLCluster resource looks like in the Kubernetes API. For example:

    apiVersion: apiextensions.k8s.io/v1
    kind: CustomResourceDefinition
    metadata:
      name: mysqlclusters.example.com
    spec:
      group: example.com
      names:
        kind: MySQLCluster
        plural: mysqlclusters
        singular: mysqlcluster
      scope: Namespaced
      versions:
      - name: v1
        served: true
        storage: true
        schema:
          openAPIV3Schema:
            type: object
            properties:
              spec:
                type: object
                properties:
                  replicas:
                    type: integer
                    example: 3

    This CRD declares a MySQLCluster resource with properties like replicas.

  • Step 2: Define a Custom Controller: The controller listens for changes to MySQLCluster resources and reconciles the state. For instance, if a MySQLCluster resource specifies 3 replicas, the controller ensures that 3 MySQL pods are running.

    The controller code could be in Go (using the Kubernetes client) or in another language supported by the Operator SDK.

  • Step 3: Deploy the Operator: An Operator is deployed in a pod that runs the controller. This Operator watches for changes to the MySQLCluster resource and automatically takes action to ensure the desired state is achieved, e.g., creating necessary Kubernetes resources like Pods, StatefulSets, Persistent Volumes, etc.

    Operators typically run as long-lived processes within the cluster. They constantly monitor changes and adjust accordingly.

  • Step 4: Use the Operator: After the operator is deployed, a user can create a MySQLCluster custom resource:

    apiVersion: example.com/v1
    kind: MySQLCluster
    metadata:
      name: mysql-cluster
    spec:
      replicas: 3

    Once this resource is created, the operator's controller will ensure that there are 3 MySQL pods running, and it will manage tasks like scaling the number of replicas or performing upgrades as needed. (A quick scaling example follows this list.)
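
For example (a sketch reusing the hypothetical MySQLCluster above), scaling the database becomes a one-line change to the custom resource, and the operator reconciles everything else:

```bash
# Bump the desired replica count; the operator creates or removes MySQL pods to match
oc patch mysqlcluster mysql-cluster --type merge -p '{"spec":{"replicas":5}}'
```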

6. The Role of OperatorHub and Operator SDK:

  • OperatorHub: It’s a marketplace where you can find, install, and manage operators in Kubernetes or OpenShift. For example, you can install an Istio operator from the OperatorHub to manage Istio service mesh installation and upgrades.
  • Operator SDK: This provides tools for building operators using Go, Helm, or Ansible. It helps you create, test, and package your operators efficiently.

7. The Operator Lifecycle Manager (OLM):

OLM helps manage the lifecycle of operators, including installation, updates, and role-based access control (RBAC). It ensures that operators are correctly installed and their dependencies are handled.
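
As an illustration (the operator and catalog names below are placeholders), installing an operator through OLM usually comes down to creating a Subscription object, which OLM then resolves, installs, and keeps up to date:

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: my-operator                   # placeholder package name
  namespace: openshift-operators
spec:
  channel: stable                     # update channel published by the operator author
  name: my-operator                   # package name as listed in the catalog
  source: community-operators         # CatalogSource providing the package
  sourceNamespace: openshift-marketplace
  installPlanApproval: Automatic      # set to Manual to approve upgrades yourself
```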


Summary of the Flow:

  1. CRDs extend the Kubernetes API by defining custom resources (e.g., MySQLCluster).
  2. Custom Controllers watch and manage these resources, ensuring the desired state is met.
  3. Operators are controllers that use CRDs to automate application management tasks (like deployment, scaling, upgrades).
  4. OperatorHub is the marketplace for finding and installing operators.
  5. Operator SDK simplifies creating operators, and OLM manages their lifecycle in the cluster.

By combining these components, Kubernetes clusters can be automated, making management tasks more consistent, repeatable, and efficient.

Istio

alt text

A service mesh is an infrastructure layer that gives applications capabilities like zero-trust security, observability, and advanced traffic management, without code changes. Istio is the most popular, powerful, and trusted service mesh. Founded by Google, IBM and Lyft in 2016, Istio is a graduated project in the Cloud Native Computing Foundation alongside projects like Kubernetes and Prometheus.

Istio ensures that cloud native and distributed systems are resilient, helping modern enterprises maintain their workloads across diverse platforms while staying connected and protected. It enables security and governance controls including mTLS encryption, policy management and access control, powers network features like canary deployments, A/B testing, load balancing, failure recovery, and adds observability of traffic across your estate.

Istio is not confined to the boundaries of a single cluster, network or runtime — services running on Kubernetes or VMs, multi-cloud, hybrid, or on-premises, can be included within a single mesh.
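
As a small illustration of the mTLS capability mentioned above (a sketch assuming a default Istio installation with istio-system as the mesh root namespace), mesh-wide strict mutual TLS is a single resource:

```yaml
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system    # applied in the mesh root namespace = mesh-wide policy
spec:
  mtls:
    mode: STRICT             # sidecars accept only mutual-TLS traffic
```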


What Is a Service Mesh?

A service mesh is a dedicated layer that manages communication between services by providing:

  • Traffic Management: Controls the flow of requests between services.
  • Security: Encrypts traffic and enforces authentication, authorization, and access policies.
  • Observability: Offers insights into service behavior through logging, monitoring, and tracing.

Key Concepts of Istio

Istio builds on the service mesh foundation with four core capabilities:

  1. Connection:
    Manages traffic routing for deployments such as canary releases and A/B testing.

  2. Security:
    Protects service communication with strong authentication, authorization, and encryption.

  3. Enforcement:
    Implements and enforces policies across your entire service fleet to ensure controlled access and consistent behavior.

  4. Observability:
    Provides detailed metrics, tracing, and logging to monitor and troubleshoot the flow of traffic and service performance.


How Istio Enhances Microservices

Istio integrates into microservices architectures, which consist of loosely coupled and independently deployable services. The benefits include:

  • Independent Updates: Update only the affected microservice without impacting the whole application.
  • Scalability: Scale individual components as needed.
  • Technology Diversity: Use different technology stacks for different services.

However, challenges such as ensuring secure communication, handling complex service interactions, and preventing cascading failures require additional strategies like retries, circuit breakers, and robust monitoring. Istio addresses these by:

  ‱ Traffic Shifting: Gradually moving traffic from one version to another for smoother updates (see the VirtualService sketch after this list).
  • A/B Testing: Directing specific user segments to different service versions to optimize performance.
  • Enhanced Security: Encrypting all communications to prevent man-in-the-middle attacks.
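
A sketch of traffic shifting between two versions of a service (names, subsets, and weights are placeholders; the subsets assume pods labeled version: v1 and version: v2):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: my-app
spec:
  host: my-app
  subsets:
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-app
spec:
  hosts:
    - my-app
  http:
    - route:
        - destination:
            host: my-app
            subset: v1
          weight: 90          # keep most traffic on the current version
        - destination:
            host: my-app
            subset: v2
          weight: 10          # shift a small share to the new version
```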

Monitoring Metrics

Istio monitors service communications using four basic metrics:

  • Latency: Measures response times to detect bottlenecks.
  • Traffic: Tracks the number of requests for scaling insights.
  • Errors: Identifies issues in service interactions.
  • Saturation: Indicates resource usage and potential capacity issues.

Module Summary

  • OpenShiftÂź is an enterprise-ready Kubernetes container platform built for open hybrid cloud. 

  ‱ Compared with upstream Kubernetes, OpenShift is easier to use, integrates with Jenkins, and ships with additional built-in services and features.

  • Custom resource definitions (CRDs) extend the Kubernetes API.

  • CRDs paired with custom controllers create new, declarative APIs in Kubernetes.

  • Operators use CRDs and custom controllers to automate cluster tasks.

  • A build is a process that transforms inputs into an object.

  • An ImageStream is an abstraction for referencing container images in OpenShift.

  • A service mesh provides traffic management to control the flow of traffic between services, security to encrypt traffic between services, and observability of service behavior to troubleshoot and optimize applications.

  ‱ Istio is a service mesh that supports the four concepts of connection, security, enforcement, and observability. It is commonly used with microservices applications.

  • Istio provides service communication metrics for basic service monitoring needs: latency, traffic, errors, and saturation.

Labs (Red Hat OpenShift Web Console & IBM Cloud)