Kubernetes workbook

A guide to ease your work with Kubernetes. It contains simple explanations, examples and tips to help you understand important concepts and the kubectl tool, and to simplify the creation and management of common Kubernetes resources.

Help and tips

Kubectl command

Use the following commands and look for the section: Examples

# Get help and usage examples on kubectl commands
kubectl -h
kubectl $subcommand [$resource_name] -h

Note that not all Kubernetes resources can be created using kubectl commands. Use the following command for a list of resources that can be created that way:

# Find resources that can be created with the kubectl command
kubectl create -h

There are also kubectl subcommands that create resources we can't directly create using 'kubectl create': 'expose' for creating Kubernetes Service resources and 'run' for creating pods, for example.
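
For example (a quick sketch, assuming an nginx image and a namespace named my-namespace):

# Create a pod directly with the 'run' subcommand
kubectl run my-pod --image=nginx -n my-namespace

# Expose an existing deployment as a service with the 'expose' subcommand
kubectl expose deployment/my-deploy --port=80 --target-port=8080 -n my-namespace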

To generate resource manifests from kubectl commands, which can then be adapted and deployed, use the following commands:

# Generate resources manifests from kubectl commands
kubectl create ... --save-config=true --dry-run=client -o yaml > $manifest_file

# Create resources from generated manifest file
kubectl apply -f $manifest_file

Resources fields documentation

Use the following kubectl commands to get documentation about Kubernetes resources fields:

# Get Kubernetes resources fields documentation
kubectl explain $resource_name
kubectl explain $resource_name.$resource_field

# For help on how to use that command
kubectl explain -h

We can also use the online Kubernetes resources API reference documentation: Kubernetes resources API reference

Kubectl configuration

Kubectl configurations file

The configuration file used by kubectl for authenticating to and interacting with different Kubernetes clusters is '~/.kube/config' by default. Inside that file, we find cluster configuration settings like API server endpoints and certificates, client users and their credentials, and so on.

Each cluster configuration in that file, combined with a user (and optionally a namespace), is what kubectl calls a context.

Set current context for kubectl

To tell kubectl to use a specific cluster for interaction, we set the context accordingly.

# Set the current context to kubernetes-admin@kubernetes1
$ kubectl config use-context kubernetes-admin@kubernetes1
Switched to context "kubernetes-admin@kubernetes1".

List available contexts for kubectl

To list available kubectl contexts, we use:

# List available contexts 
$ kubectl config get-contexts
CURRENT   NAME                           CLUSTER       AUTHINFO            NAMESPACE
*         kubernetes-admin@kubernetes1   kubernetes1    kubernetes-admin
          kubernetes-admin@kubernetes2   kubernetes2   kubernetes-admin
          kubernetes-admin@kubernetes3   kubernetes3   kubernetes-admin

The context we are currently using is marked with an *.

Show the current context for kubectl

We can also use the following command to show the current context:

# Show the current context
$ kubectl config current-context
kubernetes-admin@kubernetes1

Set default namespace for kubectl current context

To set kubectl default namespace for the current context, we use:

# Set the default namespace for the current kubectl context
$ kubectl config set-context kubernetes-admin@kubernetes1 --namespace=not-default
Context "kubernetes-admin@kubernetes1" modified.

# Verify
$ kubectl get pods
No resources found in not-default namespace.

Show kubectl configuration

To view the kubectl configuration file content, use:

# View kubectl configuration file
$ kubectl config view
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://172.28.247.116:6443
  name: kubernetes1
contexts:
- context:
    cluster: kubernetes1
    namespace: not-default
    user: kubernetes-admin
  name: kubernetes-admin@kubernetes1
current-context: kubernetes-admin@kubernetes1
kind: Config
preferences: {}
users:
- name: kubernetes-admin
  user:
    client-certificate-data: DATA+OMITTED
    client-key-data: DATA+OMITTED

Managing Pods

Have a look at Understanding Kubernetes Pods for an overview of how Pods work.

Simple Pod

pod

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  namespace: my-namespace
spec:
  containers:
  - name: nginx
    image: nginx

Pod selecting specific nodes

pod.spec.nodeName

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  namespace: my-namespace
spec:
  nodeName: k8s-worker1
  containers:
  (...)

pod.spec.nodeSelector

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  namespace: my-namespace
spec:
  nodeSelector: 
    kubernetes.io/hostname: k8s-worker1
  containers:
  (...)

pod.spec.tolerations

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  namespace: my-namespace
spec:
  tolerations:
  - key: nodepool
    value: monitoring
    effect: NoSchedule
  containers:
  (...)

Pod with specific containers command

pod.spec.containers.command

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  namespace: my-namespace
spec:
  containers:
  - name: busybox
    image: busybox
    command:
    - sh
    - -c
    - while true; do sleep 10; done

That sleep loop keeps the container busy so the pod stays in a Running state.

Pod with volume + sidecar container

pod.spec.volumes | pod.spec.containers.volumeMounts

A sidecar container is a second container inside a Pod that provides additional functionality. In the example manifest below, the main container writes its logs to a file. The sidecar container outputs those logs to stdout, so that we can read them with the 'kubectl logs' command.

To share the log file between the two containers, we use a volume named logs backed by the 'emptyDir' ephemeral storage type (its data is deleted when the pod is removed from its node), and mount that volume inside both containers at /output, where our app.log file is stored.

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  namespace: my-namespace
spec:
  volumes:
  - name: logs
    emptyDir:
     medium: ""
  containers:
  - name: main
    image: busybox
    command: ['sh', '-c', 'while true; do echo "App logs" > /output/app.log; sleep 5; done']
    volumeMounts:
    - name: logs
      mountPath: /output
  - name: sidecar
    image: busybox
    command: ['tail', '-f', '/output/app.log']
    volumeMounts:
    - name: logs
      mountPath: /output

Pod with initContainers

pod.spec.initContainers

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  namespace: my-namespace
spec:
  volumes:
  - name: statics
    emptyDir:
      medium: ""
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: statics
      mountPath: /var/www/html
  initContainers:
  - name: busybox
    image: busybox
    volumeMounts:
    - name: statics
      mountPath: /var/www/html
    command:
    - sh
    - -c
    - touch /var/www/html/init.txt

Pod using data from configMaps and secrets

pod.spec.containers.env | pod.spec.volumes | pod.spec.containers.volumeMounts

For details about creating 'configMap' or 'secret' resources, run:

kubectl create configmap -h
kubectl create secret -h

Use the '--save-config=true --dry-run=client -o yaml' options to generate the command's equivalent manifest if needed. For a deep dive into Kubernetes secrets, have a look at Understanding secrets in Kubernetes.

For a deep dive and concrete behavior examples on the various ways of creating secrets, have a look at Creating Kubernetes generic secrets. For configMaps, replace 'secret generic' with 'configmap' in all commands.

Here we are going to make configMaps and secrets data available inside our pod as file(s) and environment variable(s).

First, let's create two example configMaps and one secret:

# Create configMap from literal key=value pairs
# Will be made available inside the pod as environment variables
$ kubectl create cm app-config --from-literal=log_level=warn --from-literal=sso_client_id=dfj8473ll -n my-namespace

# Create configMap from a configuration file
# Will be made available inside the pod as a file
$ kubectl create cm nginx-config --from-file=nginx.conf -n my-namespace

# Create secret from literal key=value pairs
# Kubernetes secrets are not encrypted (just base64 encoded)
# Will be made available inside the pod as file
$ kubectl create secret generic database-creds --from-literal=db-password=mysuperpass -n my-namespace

Results:

$ kubectl get cm/app-config -n my-namespace -o yaml
apiVersion: v1
data:
  log_level: warn
  sso_client_id: dfj8473ll
kind: ConfigMap
metadata:
  creationTimestamp: "<creation_timestamp>"
  name: app-config
  namespace: my-namespace
  resourceVersion: "48708"
  uid: 91c9acf6-ea50-4bd7-a771-7cca36b58109
  
$ kubectl get cm/nginx-config -n my-namespace -o yaml
apiVersion: v1
data:
  nginx.conf: "http {\n    server {\n\tlisten 80;\n\tserver_name myapp.local;\n\n\tlocation
    / {\n            proxy_set_header Host $host;\n\t    proxy_set_header X-Real-IP
    $remote_addr;\n\n            proxy_pass http://127.0.0.1:9000;\n        }\n\n
    \       # Logging\n        # Log levels : debug, info, notice, warn, error, crit,
    alert, or emerg\n\t# Choosing warn for example => logging all messages of severity
    \n\t# warn and above (error, crit, alert and emerg)\n        access_log /var/log/nginx/myapp-access.log
    combined;\n  \terror_log /var/log/nginx/myapp-error.log warn;\n    }\n}\n"
kind: ConfigMap
metadata:
  creationTimestamp: "<creation_timestamp>"
  name: nginx-config
  namespace: my-namespace
  resourceVersion: "48761"
  uid: 41c0f2b5-f8c3-4dc5-99ca-3c132b2dd19e
  
$ kubectl get secrets/database-creds -n my-namespace -o yaml
apiVersion: v1
data:
  db-password: bXlzdXBlcnBhc3M=
kind: Secret
metadata:
  creationTimestamp: "<creation_timestamp>"
  name: database-creds
  namespace: my-namespace
  resourceVersion: "49176"
  uid: 7f13f2e8-920a-49aa-bebf-8dde4edc4dd1
type: Opaque

Here is the manifest for a Pod using the data from the previously created configMaps and secret:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  namespace: my-namespace
spec:
  volumes:
  - name: nginx-conf
    configMap:
      name: nginx-config
      items: # optional. By default, each data key becomes a filename at the mount point and its value becomes the file content (same for secrets)
      - key:  nginx.conf # key of the item inside the configMap
        mode: 0755 # file mode at mount path
        path: nginx-deployed.conf # custom filename at mount path
  - name: database-creds
    secret:
      secretName: database-creds
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: nginx-conf
      mountPath: /nginx
    - name: database-creds
      mountPath: /database
    env:
    - name: LOG_LEVEL
      valueFrom:
        configMapKeyRef: # or secretKeyRef
          name: app-config # name of the configMap containing the data
          key: log_level # key inside the configMap, to get the value from
    - name: SSO_CLIENT_ID
      valueFrom:
        configMapKeyRef:
          name: app-config # name of the configMap containing the data
          key: sso_client_id # key inside the configMap, to get the value from

Now let's look at the data inside the Pod's container:

$ kubectl exec -it pods/my-pod -n my-namespace -- bash
root@my-pod:/#

# Nginx config
root@my-pod:/# cat nginx/nginx-deployed.conf 
http {
    server {
        listen 80;
        server_name myapp.local;

        location / {
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;

            proxy_pass http://127.0.0.1:9000;
        }

        # Logging
        # Log levels : debug, info, notice, warn, error, crit, alert, or emerg
        # Choosing warn for example => logging all messages of severity
        # warn and above (error, crit, alert and emerg)
        access_log /var/log/nginx/myapp-access.log combined;
        error_log /var/log/nginx/myapp-error.log warn;
    }
}

# Database secret
root@my-pod:/# cat database/db-password 
mysuperpass

# Environment variables
root@my-pod:/# env | grep -E "SSO|LOG"
LOG_LEVEL=warn
SSO_CLIENT_ID=dfj8473ll

Pod using probes and restart policy

Pods lifecycle fields | pod.spec.containers.livenessProbe / readinessProbe | pod.spec.restartPolicy

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  namespace: my-namespace
spec:
  restartPolicy: Always
  containers:
  - name: nginx
    image: nginx
    livenessProbe:
      exec:
        command:
        - sh
        - -c
        - cat /etc/nginx/nginx.conf
    readinessProbe:
      httpGet:
        port: 80
        scheme: HTTP
        path: /

$ kubectl get pods -n my-namespace -o wide
NAME     READY   STATUS    RESTARTS   AGE     IP            NODE          NOMINATED NODE   READINESS GATES
my-pod   1/1     Running   0          2m46s   10.244.1.12   k8s-worker1   <none>           <none>
  • Running in the STATUS column, together with 0 in the RESTARTS column, indicates the livenessProbe is passing (a failing livenessProbe triggers container restarts)
  • 1/1 in the READY column indicates a successful readinessProbe for the single container inside the pod
  • Once a readinessProbe succeeds for a pod's container, that container's endpoint (ip:port) is added to the endpoints of any service redirecting traffic to the pod (see the quick check below). More about Kubernetes services and endpoints in the Managing services and ingress section
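
To see this in action, we can list the endpoints of a service targeting the pod (a quick check, assuming a service named my-service already exists in my-namespace):

# List the pod endpoints registered behind the service
kubectl get endpoints my-service -n my-namespace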

Deployments

Deployments documentation | Deployments API reference

Use case

The deployment is used to manage a set of identical stateless pods. Under the hood, it creates a replicaset that maintains the desired number of pod replicas.

Creating deployments

# Create deployment or generate command equivalent manifest
$ kubectl create deploy my-deploy --image=nginx --replicas=2 -n my-namespace [--save-config=true --dry-run=client -o yaml > deploy.yaml]

How it works

The deployment's '.spec.template' field defines a template for the pods that will be created and managed by the deployment. The value of the deployment's '.spec.selector.matchLabels' field must match the labels inside the '.spec.template.metadata.labels' field for the deployment to properly select the pods it manages.
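
As a minimal sketch (label and image values are illustrative), the selector and template labels line up like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deploy
  namespace: my-namespace
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app # must match the pods template labels below
  template:
    metadata:
      labels:
        app: my-app # labels applied to the managed pods
    spec:
      containers:
      - name: nginx
        image: nginx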

Each time the deployment's pods template ('.spec.template') changes, a new deployment revision is created. For instance, adding more replicas to a deployment does not create a new revision since only the '.spec.replicas' field changes, not '.spec.template'. A new revision creates a new replicaset, which is progressively scaled up to the deployment's replicas number, while the old replicaset is progressively scaled down to zero.

Replicaset names are in the form:

  • <deployment_name>-<replicaset_hash>

The hash of the replicaset managing the deployments pods is also found in the pods names that are in the following form:

  • <deployment_name>-<replicaset_hash>-<pod_hash>

Updates

A new revision of the deployment, replacing the existing one, is created only when the deployment's pods template field ('.spec.template') changes. The default strategy used to replace the pods of the existing revision, specified in the deployment's '.spec.strategy.type' field, is 'RollingUpdate'. For more on update strategies for the deployment resource, have a look at: update strategies for the deployments resource.

With 'RollingUpdate', old revision pods are progressively replaced by new ones. A batch of new revision pods is added first and, once they are ready, the same number of old revision pods is deleted.

The maximum number of new revision pods that can be added above the desired replicas count is called 'max surge' and can be set with the deployment's '.spec.strategy.rollingUpdate.maxSurge' field (defaults to 25% of the deployment's replicas).

The maximum number of pods that can be unavailable during the update is called 'max unavailable' and can be set with the deployment's '.spec.strategy.rollingUpdate.maxUnavailable' field (defaults to 25% of the deployment's replicas).
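
As an illustration (a sketch, values are arbitrary), both settings live under the deployment's '.spec.strategy' field:

spec:
  (...)
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25% # or an absolute number of pods
      maxUnavailable: 25% # or an absolute number of pods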

Here is a deployment update example:

# Update deployments pods containers image
kubectl set image deployment/nginx-deployment nginx=nginx:1.16.1

# See progress
kubectl rollout status deployment/nginx-deployment [--timeout <seconds>]

Here you will find all the available deployment statuses and what they mean.

To make multiple changes to the deployments '.spec.template' field without triggering new revisions creation, we can do the following:

# Pause deployment updates
kubectl rollout pause deployment/nginx-deployment

# then, do all changes, for example:
kubectl set image deployment/nginx-deployment nginx=nginx:1.17.x
kubectl set resources deployment/nginx-deployment -c=nginx --limits=cpu=200m,memory=512Mi

# to apply all changes at once, run
kubectl rollout resume deployment/nginx-deployment

To add more replicas to a deployment, manually or automatically, use:

# Manual scale
kubectl scale deployment/nginx-deployment --replicas=4

# Automated scaling with HPA (Horizontal Pod Autoscaler)
kubectl autoscale deployment/nginx-deployment --min=2 --max=8 --cpu-percent=80

To get info about revisions:

# See deployments updates history
kubectl rollout history deployment/nginx-deployment

# See specific revision details
kubectl rollout history deployment/nginx-deployment --revision=2

To update the revisions history limit, use the '.spec.revisionHistoryLimit' deployments field.

To rollback a deployment, use:

# Rollback to the previous revision
kubectl rollout undo deployment/nginx-deployment

# Rollback to a specific revision
kubectl rollout undo deployment/nginx-deployment --to-revision=2

When rolling back to earlier revisions, only the deployment's pods template part ('.spec.template') is rolled back.

For additional deployment's '.spec' fields, have a look at Writing a deployment spec.

StatefulSets

StatefulSet's documentation | StatefulSet's API reference

Use case

The statefulset can be used to create and manage a set of pods that require a stable persistent storage or a stable, unique network identity.

Stable means that once a persistent storage (PV + PVC) or network identity (DNS / Hostname) has been assigned to the statefulsets pods, we have the guarantee that they will remain the same after pods (re)scheduling.

Statefulset pods are not interchangeable: each has a persistent identifier that it maintains across any rescheduling.

Ordering

The pods of a statefulset are by default numbered from 0 to N-1, with N corresponding to the number of replicas pods. Their names are in the following form:

  • Statefulsets pods names: ${statefulset_name}-${ordinal_number}

The ordinal number to start from (0 by default) is configurable. Have a look at statefulset's pods start ordinal number if interested.

The statefulset's controller also adds the following two labels to the statefulset's pods:

  • app.kubernetes.io/pod-index: ordinal number of the pod
  • statefulset.kubernetes.io/pod-name: name of the pod. Makes it possible to attach only specific statefulset pods to Kubernetes services (services select pods by labels)

The pods of a statefulset, during deployment or scaling, are created sequentially from 0 to N-1 (N = number of replicas pods) and deleted in reverse order, from N-1 to 0. This behavior is defined by the statefulset's pods management policy.

Before a new pod is added to the statefulset, all of its predecessors must be 'Running' and 'Ready'. Before a pod is removed from the statefulset, all of its successors must be shut down completely.
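
We can observe that ordering by scaling a statefulset and watching its pods appear one by one (a sketch, assuming the mysts statefulset labeled app=mysts from the manifests below):

# Scale the statefulset and watch the ordered pods creation
kubectl scale statefulset/mysts --replicas=3 -n myns
kubectl get pods -l app=mysts -n myns -w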

Network identity

The stable network identity for the statefulset's pods is provided by a headless service. For more on headless services, have a look at kubernetes-headless-services. Unlike classic Kubernetes services that load balance traffic to pods, the goal of a headless service is to provide a sticky network identity to the backend pods. That stable network identity is simply a DNS name that is unique for each pod of the statefulset. That DNS name has the following form:

  • ${pod_name}.${headless_service_name}.${namespace}.svc.${cluster_domain}

${cluster_domain} is the domain name configured for your Kubernetes cluster, cluster.local for instance.
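
For instance, a statefulset pod can be resolved by that DNS name from any pod in the cluster (a sketch, assuming the mysts statefulset and headless service defined below, and a cluster.local cluster domain):

# Resolve the stable DNS name of the first statefulset pod
kubectl run dns-test -it --rm --restart=Never --image=busybox -n myns -- nslookup mysts-0.mysts-headless-service.myns.svc.cluster.local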

The headless service must exist prior to the creation of the statefulset. It is simply a service of type ClusterIP, with the clusterIP set to None. Here is an example manifest:

apiVersion: v1
kind: Service
metadata:
  name: mysts-headless-service
  namespace: myns
  labels:
    app: mysts
spec:
  ports:
  - port: 80
    targetPort: 80
    name: http
  selector: 
    app: mysts
  clusterIP: None # the None value here makes the service headless

During the statefulset's creation, the name of the headless service must be specified inside the '.spec.serviceName' field as follows:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysts
  namespace: myns
  labels:
    app: mysts
spec:
  replicas: 2
  serviceName: "mysts-headless-service"
  (...)

All the fields we can use inside statefulset's manifests are described here: Statefulset's API reference.

Persistent storage

The persistent storage that will be used by the statefulset's pods is defined with '.spec.volumeClaimTemplates'. Inside that field, we define a list of PersistentVolumeClaims (PVCs) that will be created for each of the statefulset's pods.

Those PVCs will either use a storage class ('storageClassName') for dynamic storage provisioning or directly reference an already provisioned storage using 'volumeName'. Here are examples:

# Dynamic storage provisioning

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysts
  namespace: myns
  labels:
    app: mysts
spec:
  (...)
  volumeClaimTemplates:
  - metadata:
      name: mypvc
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "standard"
      resources:
        requests:
          storage: 500Gi

# Static storage provisioning 

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysts
  namespace: myns
  labels:
    app: mysts
spec:
  (...)
  volumeClaimTemplates:
  - metadata:
      name: mypvc
    spec:
      accessModes: [ "ReadWriteOnce" ]
      volumeName: "mypv" # the name of an alredy existing PV

In the first case, a persistent storage (represented by a PersistentVolume resource) of the defined storage class is dynamically created for each of the statefulset's pods. In the second case, every pod's claim references the same pre-provisioned PersistentVolume; since a PV can only be bound by one claim at a time, this really only works as expected with a single replica.

Once persistent storage is properly defined inside '.spec.volumeClaimTemplates', the statefulset's pods template ('.spec.template') can mount that storage inside each pod's containers using '.spec.template.spec.containers[x].volumeMounts'. Here is an example:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysts
  namespace: myns
  labels:
    app: mysts
spec:
  replicas: 2
  serviceName: "mysts-headless-service"
  selector:
    matchLabels:
      app: mysts
  template:
    metadata:
      labels:
        app: mysts
    spec:
      containers:
      - name: mysts
        image: mysts:latest
        volumeMounts:
          - mountPath: /var/local
            name: mypvc # name of the PVC defined in spec.volumeClaimTemplates
    (...)

Here is the API specification for the pods template: PodTemplateSpec

Updates

The default strategy used to update a statefulset is rolling updates, which rolls out the statefulset's pods one by one, from the highest ordinal to the lowest. The statefulset controller waits until an updated pod is Running and Ready before updating its predecessor.

The statefulset's update strategy is controlled by the '.spec.updateStrategy.type' field. In addition to the 'RollingUpdate' strategy, there is also 'OnDelete', which tells the statefulset controller not to update the pods immediately, but instead to wait for user action (pod deletion, for instance) before a pod is updated.

The statefulset's rolling update can also be partitioned, so that only pods with an ordinal number greater than or equal to the partition number are updated automatically. For details, have a look at statefulset's partitioned rolling updates.
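
For example (a sketch, the partition value is arbitrary), the update strategy and partition are set like this:

spec:
  (...)
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 2 # only pods with an ordinal >= 2 are updated automatically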

In some rare cases, the statefulset can be stuck in a broken state requiring manual intervention. For details, have a look at statefulset's broken state.

DaemonSets

DaemonSet's documentation | DaemonSet's API reference

Use case

We use daemonsets when we want to ensure a pod is running on every node of the Kubernetes cluster. Remember that control plane nodes, by default, carry 'taints' to keep regular pods off of them, and daemonset pods are no exception. So to ensure our daemonset pods run on every node, including the control plane nodes, we need to tolerate the necessary 'taints', for instance:

  • node-role.kubernetes.io/control-plane, Exists, NoSchedule
  • node-role.kubernetes.io/master, Exists, NoSchedule

A set of tolerations is added by default to daemonset pods. For the list, have a look at daemonSet's added tolerations.
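
Here is a minimal daemonset manifest sketch (names and image are illustrative) adding those tolerations explicitly:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: my-daemonset
  namespace: my-namespace
spec:
  selector:
    matchLabels:
      app: my-daemon
  template:
    metadata:
      labels:
        app: my-daemon
    spec:
      tolerations:
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      containers:
      - name: my-daemon
        image: busybox
        command: ['sh', '-c', 'while true; do sleep 30; done']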

Managing services and ingress

What is a service?

Services expose a set of application pods. Instead of communicating with application pods directly, we communicate with services that load balance traffic to the pods. Regardless of the number of pods, the application remains reachable the same way.

Backend pods that should be reached through the service are identified by their labels, known to the service. An Endpoints resource is also created and associated with the service. That Endpoints resource maintains the set of IP:PORT pairs of the backend pods reachable through the service (network layer 4 traffic).

There are different types of services, covering different exposure needs, that we are going to talk about next.

What is an ingress?

Ingress exposes Kubernetes applications at network layer 7, through the HTTP and HTTPS protocols. It is mostly used to make web apps accessible from outside the cluster.

Kubernetes Ingress objects are created inside specific namespaces and contain rules that define where (on which backend service/port) to send HTTP/HTTPS traffic targeting specific hostnames / domain names.

HTTP/HTTPS routing rules defined inside Ingress objects are implemented by ingress controllers. Ingress controllers are simply pods, exposed through Load Balancers (in the Cloud) or NodePort services, that receive HTTP(S) traffic and forward it to application services according to the ingress rules.

There are many ingress controllers that can be deployed inside our Kubernetes clusters (traefik, nginx, ingress-gce in GCP...). Each ingress controller looks at 'spec.ingressClassName' inside ingress objects and implements the routing rules only for the ingresses whose class it is responsible for.
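
Here is a minimal ingress manifest sketch (hostname, service name and class are illustrative):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  namespace: my-namespace
spec:
  ingressClassName: nginx # which ingress controller should implement the rules
  rules:
  - host: my-domain.com
    http:
      paths:
      - path: /app
        pathType: Prefix
        backend:
          service:
            name: backend-service-name # the service to forward traffic to
            port:
              number: 8080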

ClusterIP service

Reachable only from inside the cluster. To expose a resource (pods, deployments, statefulsets...) through a ClusterIP service, use this:

# Create a ClusterIP service for a deployment or generate manifest
$ kubectl expose deployment/my-deploy -n my-namespace --name my-deploy-service --type=ClusterIP --port=80 --target-port=8080 [--save-config=true --dry-run=client -o yaml > deployment-service.yaml]

The '--port' option indicates the listening port of the service and '--target-port' indicates the port name or number inside pods on which the service should redirect traffic.

LoadBalancer service

The implementation of this service type is provider-specific. Cloud providers implement it to give access, from outside the cluster, to applications running inside their managed Kubernetes clusters.

In Google Cloud Platform (GCP) for instance, creating a LoadBalancer type service inside Google Kubernetes Engine (GKE) provisions a layer 4 load balancer, accessible from outside the cluster, that forwards traffic to the target backend pods inside the cluster.

Use this to expose a resource (pods, deployments, statefulsets...) through a LoadBalancer service:

# Create a LoadBalancer service for a deployment or generate manifest
$ kubectl expose deployment/my-deploy -n my-namespace --name my-deploy-service --type=LoadBalancer --port=80 --target-port=8080 [--save-config=true --dry-run=client -o yaml > deployment-service.yaml]

The '--port' option indicates the listening port of the service and '--target-port' indicates the port name or number inside pods on which the service should redirect traffic.

NodePort service

Exposes applications pods through cluster nodes. Each NodePort type service will listen on a specific port on each node of the cluster. NodePort exposed applications are then reachable from outside the cluster using any node IP and the reserved node port number.

# Create a NodePort service for a deployment or generate manifest
$ kubectl expose deployment/my-deploy -n my-namespace --name my-deploy-service --type=NodePort --port=80 --target-port=8080 [--save-config=true --dry-run=client -o yaml > deployment-service.yaml]

The '--port' option indicates the listening port of the service and '--target-port' indicates the port name or number inside pods on which the service should redirect traffic.

The port number that will be opened on each node is allocated automatically from the cluster's node port range (30000-32767 by default). To set that port to a specific value, edit the 'nodePort' field of the NodePort service manifest.
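
For instance (a sketch, the chosen value must fall within the node port range):

spec:
  type: NodePort
  ports:
  - port: 80 # service listening port
    targetPort: 8080 # pods target port
    nodePort: 30080 # fixed port opened on every node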

Ingress

To create an ingress resource or generate ingress manifest, use this:

$ kubectl create ingress my-ingress --rule="my-domain.com/app=backend-service-name:8080,tls=my-cert" --class="nginx" [--save-config=true --dry-run=client -o yaml > ingress.yaml]

# Use this rule instead for redirecting all requests
# regardless of the hostname to backend-service-name on port 8080
--rule="/*=backend-service-name:8080"

Managing storage

StorageClass

StorageClass is used to provide dynamic storage resource provisioning in Kubernetes. It can also be used to enable the storage expansion feature when using static storage resource provisioning.

Have a look at StorageClass API reference for all the fields of a StorageClass resources manifest.

Currently, the StorageClass resource can't be created using the 'kubectl create' command. Here is a sample manifest for creating that resource:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local
provisioner: kubernetes.io/no-provisioner
allowVolumeExpansion: true  
reclaimPolicy: Retain 
  • provisioner: name of the provisioner used to dynamically provision storage resources of this StorageClass. The provisioner named 'kubernetes.io/no-provisioner' indicates that the StorageClass does not provide dynamic storage resources provisioning
  • allowVolumeExpansion: allow expansion of the storage resources created through or using this StorageClass. This is how we allow storage expansion by updating PersistentVolumeClaim (PVC) resources 'spec.resources.requests.storage'
  • reclaimPolicy: determines what happens when claims (PVC) of dynamically provisioned storage resources are deleted. Possible values are:
    • Delete (default): delete the storage resource
    • Retain: keep the storage resource and its data. The storage resources PV will not be available for use by another claim
    • Recycle (deprecated): remove the storage resources data and make the PV available for use by another claim

PersistentVolumes (PV)

PersistentVolume resources represent the underlying storage resources that will actually be used by pods to store data.

Have a look at PersistentVolume API reference for all the fields of a PersistentVolume resources manifest and also all the backends volumes supported (hostPath, nfs...).

Currently, the PersistentVolume resource can't be created using the 'kubectl create' command. Here is a sample manifest for creating that resource:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  storageClassName: local
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /local
    type: Directory
  persistentVolumeReclaimPolicy: Retain
  • storageClassName: name of the StorageClass to which this PV belongs
  • capacity.storage: size of the PV. 1k = 1000 bytes and 1Ki = 1024 bytes. Same for megabytes and gigabytes (M, Mi and G, Gi)
  • persistentVolumeReclaimPolicy: determines what happens when claims (PVC) of manually or dynamically provisioned PVs are deleted. For dynamically provisioned PVs, the value of this param is taken from the StorageClass 'reclaimPolicy' setting. Possible values are:
    • Delete (default for dynamically provisioned PVs): delete the PV
    • Retain (default for manually provisioned PVs): keep the PV and its underlying storage resource data. The PV will not be available for use by another claim
    • Recycle (deprecated): remove the storage resources data and make the PV available for use by another claim

PersistentVolumeClaim (PVC)

PersistentVolumeClaim resources represent users' requests for storage. Such a request or claim can be configured to dynamically provision and use storage resources represented by a PV resource, or to directly use a manually provisioned PV.

Pods can then use storage resources by declaring volumes that reference PVCs by name at 'spec.volumes.persistentVolumeClaim.claimName'.
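
Here is a sketch of a pod using a PVC named my-pvc (names and mount path are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: my-pvc # name of the PVC to use
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /data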

Have a look at PersistentVolumeClaim API reference for all the fields of a PersistentVolumeClaim resources manifest.

Currently, the PersistentVolumeClaim resource can't be created using the 'kubectl create' command.

Here is a sample manifest for dynamic storage resource provisioning through StorageClass:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: my-sc # Name of the StorageClass for dynamic provisioning
  resources:
    requests:
      storage: 3Gi # requested volume size

And two others for using manually provisioned storage resources:

  • First option:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi # requested volume size
  volumeName: my-pv # Directly specify the name of the PV to use

With this one, the requested storage size won't be expandable (the 'spec.resources.requests.storage' field) and we will get the following error if we try:

error: persistentvolumeclaims "my-pvc" could not be patched:
persistentvolumeclaims "my-pvc" is forbidden: only dynamically provisioned pvc can be resized and the storageclass that provisions the pvc must support resize
You can run `kubectl replace -f /tmp/kubectl edit-1069802193.yaml` to try this update again.
  • Second option:

Reference a StorageClass that has the 'allowVolumeExpansion' field set to 'true' and the 'provisioner' field set to 'kubernetes.io/no-provisioner' (no dynamic provisioning) in both the PV's and the PVC's 'spec.storageClassName':

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local # Name of the StorageClass that allows volume expansion
  resources:
    requests:
      storage: 3Gi # requested volume size

Managing RBAC

RBAC or Role Based Access Control is one way of implementing access control in Kubernetes. RBAC can be namespace scoped or cluster scoped.

We use Role and RoleBinding for access control on a set of Kubernetes resources 'inside specific namespaces only'.

We use ClusterRole and ClusterRoleBinding for access control on 'all resources of any namespace' of the cluster.

ServiceAccount

To create a ServiceAccount named my-sa inside my-sa-namespace namespace:

# Command line
$ kubectl create sa my-sa -n my-sa-namespace

# or 

# Manifest
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-sa
  namespace: my-sa-namespace

Role or ClusterRole

Role or ClusterRole resources represent the access control rules. That's where we define which permissions to grant on which resources. For instance, 'read pods', 'update secrets', 'create services' and so on. So, in order to create those resources, we need the names of the available verbs (create, get, list, update...) and the names of the resources concerned (pods, secrets, deployments...).
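
One way to list the available resource names and the verbs they support:

# List resource names and supported verbs
kubectl api-resources -o wide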

# Create Role/ClusterRole or generate manifests
# Use the indicated dry run option to generate equivalent yaml manifest only

# Role resource
# Should be put inside a specific namespace 
# (-n <namespace>, otherwise it goes to the default namespace)
$ kubectl create role my-role --verb=get,list --resource=pods,secrets -n role-ns [--save-config=true --dry-run=client -o yaml > role.yaml]

# ClusterRole resource
# Can't be put inside a specific namespace
kubectl create clusterrole my-clusterrole --verb=get,list --resource=pods,secrets [--save-config=true --dry-run=client -o yaml > clusterrole.yaml]

RoleBinding or ClusterRoleBinding

RoleBinding or ClusterRoleBinding resources associate a Role or ClusterRole resource to a principal (user, group or service account). That association grants permissions defined inside the Role or ClusterRole to the principal.

# Create RoleBinding/ClusterRoleBinding or generate manifests
# Use the indicated dry run option to generate equivalent yaml manifest only

# RoleBinding resource
# Should be put inside a specific namespace
# (-n <namespace>, otherwise it goes to the default namespace)
$ kubectl create rolebinding my-rolebinding --role=my-role --serviceaccount=my-sa-namespace:my-sa -n role-ns [--save-config=true --dry-run=client -o yaml > rolebinding.yaml]

# ClusterRoleBinding resource
# Can't be put inside a specific namespace
kubectl create clusterrolebinding my-clusterrolebinding --clusterrole=my-clusterrole --serviceaccount=my-sa-namespace:my-sa [--save-config=true --dry-run=client -o yaml > clusterrolebinding.yaml]

Managing Pods networking

NetworkPolicy

Before using this, have a look at the pre-requisites.

NetworkPolicies can be used to create networking rules that allow/deny traffic between pods or between pods and the outside world.

Select pods on which the NetworkPolicy applies

Achieved using 'spec.podSelector'.

An empty value selects all pods from the namespace where the NetworkPolicy resource is created. Otherwise, it selects pods with the labels supplied at 'spec.podSelector.matchLabels', as illustrated below:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
(...)
spec:
  podSelector:
    matchLabels:
      app: mysuperapp
(...)
Choose the NetworkPolicy types

Achieved using 'spec.policyTypes'.

A list containing either 'Ingress', 'Egress' or both ('Ingress' and 'Egress'):

  • Ingress: when specified, only inbound traffic allowed by the 'spec.ingress' rules will be authorized. Outbound traffic will continue to work as before
  • Egress: when specified, only outbound traffic allowed by the 'spec.egress' rules will be authorized. Inbound traffic will continue to work as before
  • When both 'Ingress' and 'Egress' are specified, all incoming and outgoing traffic to/from the selected pods will be denied and only those defined by the 'spec.ingress' and 'spec.egress' rules will be allowed
Create the NetworkPolicy rules

Achieved with 'spec.ingress' for filtering incoming network traffic and 'spec.egress' for outgoing traffic.

For ingress rules, we specify the allowed sources using 'spec.ingress.from'. For egress rules, we specify the target destinations using 'spec.egress.to'. For both types of rules, we also specify the allowed protocols and ports using 'spec.ingress.ports' and 'spec.egress.ports'.

Here are the possibilities we have regarding sources and destinations selection in 'spec.ingress.from' and 'spec.egress.to':

  • ipBlock: a specific IP address or IP address range
  • podSelector: pods from the NetworkPolicy namespace, selected by pods labels using 'podSelector.matchLabels'
  • namespaceSelector: pods from a specific namespace, selected by the namespace labels. If 'podSelector' is also specified, selects pods matching the labels defined at 'podSelector.matchLabels', only from the namespaces matching the labels defined at 'namespaceSelector.matchLabels'

Here is an example NetworkPolicy manifest allowing:

  • inbound traffic to all pods inside the app-backends namespace on port 80, only from pods inside the app-frontends namespace or pods labeled app-frontends-tier: web in the app-backends namespace
  • outbound traffic from all pods inside the app-backends namespace, only to the IP addresses in the '10.0.0.0/24' range
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: my-network-policy
  namespace: app-backends
spec:
  podSelector: # empty => selects all pods inside the namespace where the NetworkPolicy is deployed (app-backends in this case)
  #  matchLabels:
  #    app: superapp # Otherwise, selects all pods with this label from NetworkPolicy namespace
  policyTypes:
  - Ingress
  - Egress
  ingress: # rules for inbound traffic to pods selected by spec.podSelector
  - from: # selecting the allowed sources
    - namespaceSelector: # allowing namespaces by their labels
        matchLabels:
          kubernetes.io/metadata.name: app-frontends # label for selecting the 'app-frontend' namespace
    - podSelector: # allowing pods by their labels
        matchLabels:
          app-frontends-tier: web # allow all pods with this label
    ports: # target protocols and ports we are allowing traffic to on pods selected by spec.podSelector
    - protocol: TCP
      port: 80
  egress: # rules for outbound traffic from pods selected by spec.podSelector
  - to:
    - ipBlock:
        cidr: 10.0.0.0/24 # target IP range we are allowing traffic to from the pods selected by spec.podSelector
    ports:
    - protocol: TCP
      port: 5978

Observability and troubleshooting

Nodes and Pods + containers resources usage

# Get pods containers resources usage from a specific namespace
$ kubectl top pods --containers -n $namespace

# Get default namespaces pods resources usage
$ kubectl top pods 

# Get cluster nodes resources usage
$ kubectl top nodes

Pods logs

# Get all logs from my-pod inside default namespace
$ kubectl logs my-pod

# Get last 20 logs lines from my-deploy and follow outputs
$ kubectl logs --tail=20 -f deploy/my-deploy -n $namespace

# Get last 20 logs lines from my-sts
$ kubectl logs --tail=20 sts/my-sts -n $namespace

Cluster events

# Get events from a specific namespace
$ kubectl get events -n $namespace

# Get events from all namespaces
$ kubectl get events -A

Testing RBAC

# Can I get pods inside the my-app namespace?
$ kubectl auth can-i get pods -n my-app

# Can the my-sa service account residing inside
# the my-app namespace list secrets inside the default namespace?
$ kubectl auth can-i list secrets -n default --as=system:serviceaccount:my-app:my-sa

# Can I do anything in the my-app namespace?
$ kubectl auth can-i '*' '*' -n my-app