Kubernetes (Part 2)

Setup

We will be using a virtual machine in the faculty's cloud.

When creating a virtual machine in the Launch Instance window:

  • Name your VM using the following convention: cc_lab<no>_<username>, where <no> is the lab number and <username> is your institutional account.
  • Select Boot from image in the Instance Boot Source section.
  • Select CC 2024-2025 in the Image Name section.
  • Select the m1.xlarge flavor.

In the virtual machine:

  • Download the laboratory archive into the work directory using wget https://repository.grid.pub.ro/cs/cc/laboratoare/lab-kubernetes-part-2.zip.
  • Extract the archive.
  • Run the setup script bash lab-kubernetes-part-2.sh.
$ # create the working dir
$ mkdir ~/work
$ # change the working dir
$ cd ~/work
$ # download the archive
$ wget https://repository.grid.pub.ro/cs/cc/laboratoare/lab-kubernetes-part-2.zip
$ unzip lab-kubernetes-part-2.zip
$ # run setup script; it may take a while
$ bash lab-kubernetes-part-2.sh

Create a local Kubernetes cluster using kind create cluster:

student@lab-kubernetes:~$ kind create cluster
Creating cluster "kind" ...
✓ Ensuring node image (kindest/node:v1.23.4) 🖼
✓ Preparing nodes 📦
✓ Writing configuration 📜
✓ Starting control-plane 🕹️
✓ Installing CNI 🔌
✓ Installing StorageClass 💾
Set kubectl context to "kind-kind"
You can now use your cluster with:

kubectl cluster-info --context kind-kind

Thanks for using kind! 🙂
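
Before moving on, you can sanity-check the cluster; besides the cluster-info command above, listing the nodes should show the single kind node as Ready (the exact node name and version on your machine may differ):

$ # the kind control-plane node should report STATUS Ready
$ kubectl get nodes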

Liveness probes

Software applications, no matter how well written and tested, are always prone to errors, crashes, deadlocks, etc. Sometimes, the only way to restore functionality is to restart the application.

When running in production, it is very important that application errors are detected as soon as they occur and then automatically mitigated.

In Kubernetes, we have the concept of liveness probes, which help us by continuously monitoring a container and taking an action if a failure occurs.

Setup: a crashy app

To illustrate the concept, we will use an app that was specially built for this lab. The app is a simple HTTP server written in Python that runs normally for a specified number of seconds and starts returning errors after that.

If you are curious, you can find the source code in ~/work/crashy-app/server.py. The time after which the app starts to error out is defined by the CRASH_AFTER environment variable.

The Docker image for this app should already exist:

student@cc-lab:~/work$ docker image ls
REPOSITORY   TAG     IMAGE ID       CREATED          SIZE
crashy-app   1.0.0   f0a327e2fc35   56 minutes ago   67MB
[...]

Let's load this image into the Kind cluster:

student@cc-lab:~/work$ kind load docker-image crashy-app:1.0.0
Image: "crashy-app:1.0.0" with ID "sha256:f0a327e2fc354173521a6425d679e3adaa95de11ca3b8e5306e8b58655f310e4" not yet present on node "kind-control-plane", loading...

We will create a deployment for this app and apply it. Notice that the CRASH_AFTER environment variable will be set to 60 seconds.

student@cc-lab:~/work$ cat crashy-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: crashy-app
  labels:
    app: crashy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: crashy
  template:
    metadata:
      labels:
        app: crashy
    spec:
      containers:
      - name: crashy-app
        image: crashy-app:1.0.0
        ports:
        - containerPort: 80
        env:
        - name: CRASH_AFTER
          value: "60"

student@cc-lab:~/work$ kubectl apply -f crashy-deployment.yaml
deployment.apps/crashy-app created

student@cc-lab:~/work$ kubectl get deployments
NAME         READY   UP-TO-DATE   AVAILABLE   AGE
crashy-app   1/1     1            1           8s

student@cc-lab:~/work$ kubectl get pods
NAME                          READY   STATUS    RESTARTS   AGE
crashy-app-5bc4d6474b-lgnk9   1/1     Running   0          11s

Let's expose the app via a service:

student@cc-lab:~/work$ cat crashy-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: crashy-app
spec:
  type: NodePort
  selector:
    app: crashy
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
    nodePort: 30080

student@cc-lab:~/work$ kubectl apply -f crashy-service.yaml
service/crashy-app created

student@cc-lab:~/work$ kubectl get services
NAME         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
crashy-app   NodePort    10.96.67.208   <none>        80:30080/TCP   6s
kubernetes   ClusterIP   10.96.0.1      <none>        443/TCP        24m

Notice that at the beginning, the app works normally:

student@cc-lab:~/work$ curl http://172.18.0.2:30080
Hi, my name is crashy-app-5bc4d6474b-lgnk9 and I'm a crashy app...
But I didn't crash... yet :D

After 60 seconds, it starts to return errors:

student@cc-lab:~/work$ curl http://172.18.0.2:30080
Hi, my name is crashy-app-5bc4d6474b-lgnk9 and I'm a crashy app...
I crashed 2.85 seconds ago, sorry about that :(

If you use curl -v, you will see that the server returns an HTTP 500 status code.
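
If you only need the status code, curl can print it by itself. A small sketch, assuming the same node IP and port as above:

$ # prints 200 while the app is healthy, 500 after it crashes
$ curl -s -o /dev/null -w "%{http_code}\n" http://172.18.0.2:30080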

If we list the pods, the pod is still shown as running, so Kubernetes has no way of knowing that the app is not available.

The only way to recover is to delete the pod, which will force the deployment to create a new one. We can do this manually:

student@cc-lab:~/work$ kubectl delete pod/crashy-app-5bc4d6474b-lgnk9
pod "crashy-app-5bc4d6474b-lgnk9" deleted

student@cc-lab:~/work$ kubectl get pods
NAME                          READY   STATUS    RESTARTS   AGE
crashy-app-5bc4d6474b-2svb4   1/1     Running   0          9s

But we will have to keep doing this again and again, which is not convenient.

Defining a liveness probe

A liveness probe helps us by periodically polling for a condition. When the condition fails, the container is automatically restarted.

We will be using an httpGet probe, which queries an HTTP endpoint of the app. Most cloud-native apps have a separate endpoint for health monitoring, which is more lightweight (it doesn't perform the full processing, but only returns the status of the service).

Our crashy app responds to the /health endpoint, which can also be queried manually:

student@cc-lab:~/work$ curl http://172.18.0.2:30080/health
200 OK
[...]
student@cc-lab:~/work$ curl http://172.18.0.2:30080/health
500 Internal Server Error

Let's edit the deployment manifest by defining a liveness probe:

student@cc-lab:~/work$ cat crashy-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: crashy-app
  labels:
    app: crashy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: crashy
  template:
    metadata:
      labels:
        app: crashy
    spec:
      containers:
      - name: crashy-app
        image: crashy-app:1.0.0
        ports:
        - containerPort: 80
        env:
        - name: CRASH_AFTER
          value: "60"
        livenessProbe:
          httpGet:
            path: /health
            port: 80
          periodSeconds: 1
          failureThreshold: 3
          terminationGracePeriodSeconds: 1

Note: the parameters have the following meaning:

  • httpGet.path - the path of the HTTP endpoint to probe
  • httpGet.port - the port of the HTTP endpoint to probe
  • periodSeconds - how many seconds to wait between two probes
  • failureThreshold - after how many failed probes is the container considered dead
  • terminationGracePeriodSeconds - how many seconds to wait before sending the KILL signal to a failed container

Apply the modified manifest:

student@cc-lab:~/work$ kubectl apply -f crashy-deployment.yaml 
deployment.apps/crashy-app configured
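
Optionally, you can confirm that the probe was picked up by querying the deployment spec; a quick check using jsonpath (the probe is printed as a one-line JSON object):

$ # show the liveness probe configured on the crashy-app container
$ kubectl get deployment crashy-app -o jsonpath='{.spec.template.spec.containers[0].livenessProbe}'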

View the events for the pod and observe that the container is periodically restarted after three consecutive failed probes:

student@cc-lab:~/work$ kubectl events --for pod/crashy-app-5799b6fd57-sd56v --watch
LAST SEEN           TYPE      REASON      OBJECT                            MESSAGE
23s                 Normal    Scheduled   Pod/crashy-app-5799b6fd57-sd56v   Successfully assigned default/crashy-app-5799b6fd57-sd56v to kind-control-plane
22s                 Normal    Pulled      Pod/crashy-app-5799b6fd57-sd56v   Container image "crashy-app:1.0.0" already present on machine
22s                 Normal    Created     Pod/crashy-app-5799b6fd57-sd56v   Created container: crashy-app
22s                 Normal    Started     Pod/crashy-app-5799b6fd57-sd56v   Started container crashy-app
0s                  Warning   Unhealthy   Pod/crashy-app-5799b6fd57-sd56v   Liveness probe failed: HTTP probe failed with statuscode: 500
0s (x2 over 1s)     Warning   Unhealthy   Pod/crashy-app-5799b6fd57-sd56v   Liveness probe failed: HTTP probe failed with statuscode: 500
0s (x3 over 2s)     Warning   Unhealthy   Pod/crashy-app-5799b6fd57-sd56v   Liveness probe failed: HTTP probe failed with statuscode: 500
0s                  Normal    Killing     Pod/crashy-app-5799b6fd57-sd56v   Container crashy-app failed liveness probe, will be restarted
0s (x2 over 65s)    Normal    Pulled      Pod/crashy-app-5799b6fd57-sd56v   Container image "crashy-app:1.0.0" already present on machine
0s (x2 over 65s)    Normal    Created     Pod/crashy-app-5799b6fd57-sd56v   Created container: crashy-app
0s (x2 over 65s)    Normal    Started     Pod/crashy-app-5799b6fd57-sd56v   Started container crashy-app
0s (x4 over 65s)    Warning   Unhealthy   Pod/crashy-app-5799b6fd57-sd56v   Liveness probe failed: HTTP probe failed with statuscode: 500
0s (x5 over 66s)    Warning   Unhealthy   Pod/crashy-app-5799b6fd57-sd56v   Liveness probe failed: HTTP probe failed with statuscode: 500
0s (x6 over 67s)    Warning   Unhealthy   Pod/crashy-app-5799b6fd57-sd56v   Liveness probe failed: HTTP probe failed with statuscode: 500
0s (x2 over 65s)    Normal    Killing     Pod/crashy-app-5799b6fd57-sd56v   Container crashy-app failed liveness probe, will be restarted
0s (x3 over 2m10s)  Normal    Pulled      Pod/crashy-app-5799b6fd57-sd56v   Container image "crashy-app:1.0.0" already present on machine
0s (x3 over 2m10s)  Normal    Created     Pod/crashy-app-5799b6fd57-sd56v   Created container: crashy-app
0s (x3 over 2m10s)  Normal    Started     Pod/crashy-app-5799b6fd57-sd56v   Started container crashy-app
[...]
^C

The number of restarts can also be seen in the pod list:

student@cc-lab:~/work$ kubectl get pods
NAME                          READY   STATUS    RESTARTS     AGE
crashy-app-5799b6fd57-sd56v   1/1     Running   3 (3s ago)   3m19s

Verify using curl that the app automatically recovers after a failure.
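
One way to watch the whole crash-restart-recover cycle is a polling loop (a sketch; stop it with Ctrl+C):

$ # poll the app every 2 seconds; responses go from OK to errors and back to OK
$ while true; do curl -s http://172.18.0.2:30080; sleep 2; done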

Readiness probes

Production apps are often complex and are not ready to process traffic as soon as they are started. Usually, they need some time to initialize (seconds or even minutes). During this initialization time, traffic should not be routed to the respective instances, because it would not be processed anyway, and users would see errors.

In Kubernetes, we have the concept of readiness probes, which monitor a container and allow traffic to be routed to it only when it is ready.

Setup: a lazy app

To illustrate the concept, we will use an app that was specially built for this lab. The app is a simple HTTP server written in Python that takes a specified number of seconds to initialize, and runs normally after that.

If you are curious, you can find the source code in ~/work/lazy-app/server.py. The initialization time in seconds is a random number between zero and READY_AFTER_MAX.

The Docker image for this app should already exist:

student@cc-lab:~/work$ docker image ls
REPOSITORY   TAG     IMAGE ID       CREATED          SIZE
lazy-app     1.0.0   f7eac9e4eda7   42 minutes ago   67MB
[...]

Let's load this image into the Kind cluster:

student@cc-lab:~/work$ kind load docker-image lazy-app:1.0.0
Image: "lazy-app:1.0.0" with ID "sha256:f7eac9e4eda7cc3b492cdfe6aff791cfd763567fb0502d5c8bb96cbc0cf032ed" not yet present on node "kind-control-plane", loading...

We will create a deployment for this app and apply it. Notice that the READY_AFTER_MAX environment variable will be set to 300 seconds. The deployment will have 5 replicas, which means that there will be 5 pods that can serve requests.

student@cc-lab:~/work$ cat lazy-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: lazy-app
  labels:
    app: lazy
spec:
  replicas: 5
  selector:
    matchLabels:
      app: lazy
  template:
    metadata:
      labels:
        app: lazy
    spec:
      containers:
      - name: lazy-app
        image: lazy-app:1.0.0
        ports:
        - containerPort: 80
        env:
        - name: READY_AFTER_MAX
          value: "300"

student@cc-lab:~/work$ kubectl apply -f lazy-deployment.yaml
deployment.apps/lazy-app created

student@cc-lab:~/work$ kubectl get deployments
NAME       READY   UP-TO-DATE   AVAILABLE   AGE
lazy-app   5/5     5            5           8s

student@cc-lab:~/work$ kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
lazy-app-674fb54b7d-9bckf   1/1     Running   0          4s
lazy-app-674fb54b7d-fsstv   1/1     Running   0          4s
lazy-app-674fb54b7d-hbsgg   1/1     Running   0          4s
lazy-app-674fb54b7d-tjddz   1/1     Running   0          4s
lazy-app-674fb54b7d-wxx7p   1/1     Running   0          4s

Let's expose the app via a service:

student@cc-lab:~/work$ cat lazy-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: lazy-app
spec:
  type: NodePort
  selector:
    app: lazy
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
    nodePort: 30081

student@cc-lab:~/work$ kubectl apply -f lazy-service.yaml
service/lazy-app created

student@cc-lab:~/work$ kubectl get services
NAME         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
kubernetes   ClusterIP   10.96.0.1      <none>        443/TCP        52m
lazy-app     NodePort    10.96.180.27   <none>        80:30081/TCP   38m

We can see that all 5 instances are shown as "ready", but if we try to connect using curl, we don't always get successful responses:

student@cc-lab:~/work$ curl http://172.18.0.2:30081
Hi, my name is lazy-app-7c44789765-cdsvk and I'm a lazy app...
Getting ready... 24.81 more seconds please :D

student@cc-lab:~/work$ curl http://172.18.0.2:30081
Hi, my name is lazy-app-7c44789765-rrpcj and I'm a lazy app...
Getting ready... 119.54 more seconds please :D

student@cc-lab:~/work$ curl http://172.18.0.2:30081
Hi, my name is lazy-app-7c44789765-cdsvk and I'm a lazy app...
Getting ready... 17.34 more seconds please :D

student@cc-lab:~/work$ curl http://172.18.0.2:30081
Hi, my name is lazy-app-7c44789765-sn2sh and I'm a lazy app...
Getting ready... 184.09 more seconds please :D

student@cc-lab:~/work$ curl http://172.18.0.2:30081
Hi, my name is lazy-app-7c44789765-rrpcj and I'm a lazy app...
Getting ready... 110.19 more seconds please :D

student@cc-lab:~/work$ curl http://172.18.0.2:30081
Hi, my name is lazy-app-7c44789765-nvfkv and I'm a lazy app...
But I'm finally ready! :)

student@cc-lab:~/work$ curl http://172.18.0.2:30081
Hi, my name is lazy-app-7c44789765-sn2sh and I'm a lazy app...
Getting ready... 178.67 more seconds please :D

student@cc-lab:~/work$ curl http://172.18.0.2:30081
Hi, my name is lazy-app-7c44789765-nvfkv and I'm a lazy app...
But I'm finally ready! :)

Depending on the pod where the request is routed, we will see a successful or a failed response. Ideally, the service would only route requests to pods that are ready.

Defining a readiness probe

A readiness probe helps us by periodically polling for a condition. When the condition is successful, the container is automatically marked as ready.

We will be using an httpGet probe, which queries an HTTP endpoint of the app. Most cloud-native apps have a separate endpoint for health monitoring, which is more lightweight (it doesn't perform the full processing, but only returns the status of the service).

Our lazy app responds to the /health endpoint, which can also be queried manually:

student@cc-lab:~/work$ curl http://172.18.0.2:30081/health
500 Internal Server Error
[...]
student@cc-lab:~/work$ curl http://172.18.0.2:30081/health
200 OK

First, let's delete the current deployment:

student@cc-lab:~/work$ kubectl delete deployments lazy-app
deployment.apps "lazy-app" deleted

Then, let's create a new deployment that defines a readiness probe:

student@cc-lab:~/work$ cat lazy-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: lazy-app
  labels:
    app: lazy
spec:
  replicas: 5
  selector:
    matchLabels:
      app: lazy
  template:
    metadata:
      labels:
        app: lazy
    spec:
      containers:
      - name: lazy-app
        image: lazy-app:1.0.0
        ports:
        - containerPort: 80
        env:
        - name: READY_AFTER_MAX
          value: "300"
        readinessProbe:
          httpGet:
            path: /health
            port: 80
          periodSeconds: 1
          successThreshold: 2

Note: the parameters have the following meaning:

  • httpGet.path - the path of the HTTP endpoint to probe
  • httpGet.port - the port of the HTTP endpoint to probe
  • periodSeconds - how many seconds to wait between two probes
  • successThreshold - after how many successful probes is the container considered ready

Apply the new manifest and observe that initially no pod is ready:

student@cc-lab:~/work$ kubectl apply -f lazy-deployment.yaml 
deployment.apps/lazy-app created

student@cc-lab:~/work$ kubectl get deployments
NAME       READY   UP-TO-DATE   AVAILABLE   AGE
lazy-app   0/5     5            0           1s

student@cc-lab:~/work$ kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
lazy-app-6d55bd7894-jnkm6   0/1     Running   0          2s
lazy-app-6d55bd7894-qt5mm   0/1     Running   0          2s
lazy-app-6d55bd7894-wsncf   0/1     Running   0          2s
lazy-app-6d55bd7894-zdhtv   0/1     Running   0          1s
lazy-app-6d55bd7894-zkxgm   0/1     Running   0          2s

Verify with curl that requests are only routed to pods that are ready:

student@cc-lab:~/work$ curl http://172.18.0.2:30081
Hi, my name is lazy-app-7c44789765-nvfkv and I'm a lazy app...
But I'm finally ready! :)
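
Under the hood, a pod that is not ready is removed from the service's endpoints, so no traffic is routed to it. You can observe this directly; the list of endpoint IPs should grow as pods become ready:

$ # only ready pods are listed as endpoints of the service
$ kubectl get endpoints lazy-app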

Gradually, all of them become ready.

You can observe that by listing the pods:

student@cc-lab:~/work$ kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
lazy-app-6d55bd7894-jnkm6   1/1     Running   0          41s
lazy-app-6d55bd7894-qt5mm   0/1     Running   0          41s
lazy-app-6d55bd7894-wsncf   0/1     Running   0          41s
lazy-app-6d55bd7894-zdhtv   1/1     Running   0          40s
lazy-app-6d55bd7894-zkxgm   0/1     Running   0          41s

[...]

student@cc-lab:~/work$ kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
lazy-app-6d55bd7894-jnkm6   1/1     Running   0          5m56s
lazy-app-6d55bd7894-qt5mm   1/1     Running   0          5m56s
lazy-app-6d55bd7894-wsncf   1/1     Running   0          5m56s
lazy-app-6d55bd7894-zdhtv   1/1     Running   0          5m55s
lazy-app-6d55bd7894-zkxgm   1/1     Running   0          5m56s

Or by watching the deployment:

student@cc-lab:~/work$ kubectl get deployments --watch
NAME       READY   UP-TO-DATE   AVAILABLE   AGE
lazy-app   0/5     5            0           2s
lazy-app   1/5     5            1           2m30s
lazy-app   2/5     5            2           2m36s
lazy-app   3/5     5            3           2m51s
lazy-app   4/5     5            4           3m38s
lazy-app   5/5     5            5           4m37s
^C

Scaling an app

In production, the amount of traffic for an app is rarely constant. If the traffic to our app increases, we may need to scale the app (create more pods, identical to the ones that already exist).

Let's start with the hello-app application, with only one replica.

Create and apply the deployment:

student@lab-kubernetes:~$ cat hello-app-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-app
  labels:
    app: hello
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello
  template:
    metadata:
      labels:
        app: hello
    spec:
      containers:
      - name: hello-app
        image: gitlab.cs.pub.ro:5050/scgc/cloud-courses/hello-app:1.0
        ports:
        - containerPort: 8080

student@lab-kubernetes:~$ kubectl apply -f hello-app-deployment.yaml
deployment.apps/hello-app created

student@lab-kubernetes:~$ kubectl get deployments
NAME        READY   UP-TO-DATE   AVAILABLE   AGE
hello-app   1/1     1            1           13s

student@lab-kubernetes:~$ kubectl get pods
NAME                         READY   STATUS    RESTARTS   AGE
hello-app-599bb4bf7f-l45k4   1/1     Running   0          17s

Then create and apply the service that exposes the app:

student@lab-kubernetes:~$ cat hello-app-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: hello-app
spec:
  type: NodePort
  selector:
    app: hello
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 8080
    nodePort: 30082

student@lab-kubernetes:~$ kubectl apply -f hello-app-service.yaml
service/hello-app created

student@lab-kubernetes:~$ kubectl get services
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
hello-app    NodePort    10.96.186.102   <none>        8080:30082/TCP   7m42s
kubernetes   ClusterIP   10.96.0.1       <none>        443/TCP          20h

Now, let's scale hello-app to 10 pods. For this, change the value for replicas in hello-app-deployment.yaml to 10, and reapply the manifest:

student@lab-kubernetes:~$ kubectl apply -f hello-app-deployment.yaml
deployment.apps/hello-app configured

student@lab-kubernetes:~$ kubectl get pods
NAME                         READY   STATUS              RESTARTS   AGE
hello-app-599bb4bf7f-25w8g   1/1     Running             0          6s
hello-app-599bb4bf7f-7xzgr   0/1     ContainerCreating   0          5s
hello-app-599bb4bf7f-gr9xb   1/1     Running             0          6s
hello-app-599bb4bf7f-l45k4   1/1     Running             0          44m
hello-app-599bb4bf7f-mbgx7   0/1     ContainerCreating   0          6s
hello-app-599bb4bf7f-ps2dj   1/1     Running             0          6s
hello-app-599bb4bf7f-r6xqv   1/1     Running             0          6s
hello-app-599bb4bf7f-rrnws   0/1     ContainerCreating   0          5s
hello-app-599bb4bf7f-tnqtz   1/1     Running             0          6s
hello-app-599bb4bf7f-wh7qx   0/1     ContainerCreating   0          6s

After a while, you'll see that all 10 pods are running. Also, the deployment shows 10 available pods:

student@lab-kubernetes:~$ kubectl get deployments
NAME        READY   UP-TO-DATE   AVAILABLE   AGE
hello-app   10/10   10           10          45m
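
As a side note, the replica count can also be changed imperatively, without editing the manifest; keep in mind that the next kubectl apply will reset it to the value from the file:

$ # scale the deployment directly from the command line
$ kubectl scale deployment hello-app --replicas=10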

Replica sets

What actually happened is that a Kubernetes replica set of scale 10, associated with the deployment, was created:

student@lab-kubernetes:~$ kubectl get replicasets
NAME                   DESIRED   CURRENT   READY   AGE
hello-app-599bb4bf7f   10        10        10      1m
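
The replica set is what keeps the number of pods at the desired value. You can see this self-healing behavior by deleting one pod and listing the pods again (the pod name below is taken from the earlier listing; yours will differ):

$ # delete one pod; the replica set immediately creates a replacement
$ kubectl delete pod hello-app-599bb4bf7f-l45k4
$ kubectl get pods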

Testing the scaled app

Connect multiple times to the service, using curl. You will notice that each time, a different pod responds:

student@lab-kubernetes:~$ curl http://172.18.0.2:30082
Hello, world!
Version: 1.0.0
Hostname: hello-app-599bb4bf7f-r6xqv
student@lab-kubernetes:~$ curl http://172.18.0.2:30082
Hello, world!
Version: 1.0.0
Hostname: hello-app-599bb4bf7f-gr9xb
student@lab-kubernetes:~$ curl http://172.18.0.2:30082
Hello, world!
Version: 1.0.0
Hostname: hello-app-599bb4bf7f-rrnws
student@lab-kubernetes:~$ curl http://172.18.0.2:30082
Hello, world!
Version: 1.0.0
Hostname: hello-app-599bb4bf7f-7xzgr
student@lab-kubernetes:~$ curl http://172.18.0.2:30082
Hello, world!
Version: 1.0.0
Hostname: hello-app-599bb4bf7f-ps2dj
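
To get a quick picture of how the requests are spread across pods, you can aggregate a batch of responses with standard shell tools (a sketch):

$ # send 20 requests and count how many times each pod responded
$ for i in $(seq 20); do curl -s http://172.18.0.2:30082 | grep Hostname; done | sort | uniq -c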

Autoscaling

In production, it is infeasible to manually scale an application up and down. Instead, we need a solution that does this automatically, as resource demands change.

In Kubernetes, we have the concept of horizontal pod autoscaler, which adds or removes pods from a replica set based on resource usage.

Defining resource constraints

First, let's delete the current hello-app deployment, if any:

student@cc-lab:~/work$ kubectl delete deployments hello-app
deployment.apps "hello-app" deleted

Then, let's create and apply a new deployment that defines resource constraints:

student@cc-lab:~/work$ cat hello-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-app
  labels:
    app: hello
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello
  template:
    metadata:
      labels:
        app: hello
    spec:
      containers:
      - name: hello-app
        image: gitlab.cs.pub.ro:5050/scgc/cloud-courses/hello-app:1.0
        ports:
        - containerPort: 8080
        resources:
          limits:
            cpu: 200m
          requests:
            cpu: 100m

student@cc-lab:~/work$ kubectl apply -f hello-deployment.yaml
deployment.apps/hello-app created

Note: the parameters have the following meaning:

  • resources.requests.cpu - minimum resources requested by the container (0.1 CPU cores in this case)
  • resources.limits.cpu - maximum resources the container is allowed to use (0.2 CPU cores in this case)

Installing the metrics server

In order for Kubernetes to measure resource utilization, we must install the metrics server, which is not installed by default in Kind.

We will use Helm, which is a package manager for Kubernetes.

Helm will be covered in another lab. For now, run the following commands to install the metrics server:

student@cc-lab:~/work$ helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
student@cc-lab:~/work$ helm repo update
student@cc-lab:~/work$ helm upgrade --install --set args={--kubelet-insecure-tls} metrics-server metrics-server/metrics-server --namespace kube-system
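
After a minute or so, the metrics server should begin reporting usage; until the first metrics are scraped, the commands below return an error:

$ # resource usage per node and per pod, as reported by the metrics server
$ kubectl top nodes
$ kubectl top pods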

Defining the autoscaling policy

Now, let's define and apply the horizontal pod autoscaler:

student@cc-lab:~/work$ cat hello-autoscaler.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hello
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hello-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 5

student@cc-lab:~/work$ kubectl apply -f hello-autoscaler.yaml
horizontalpodautoscaler.autoscaling/hello created

Note: the parameters have the following meaning:

  • minReplicas - minimum replicas to scale down to
  • maxReplicas - maximum replicas to scale up to
  • averageUtilization - when to scale; in this case, when the average CPU load across pods is greater than 5%

Also, inspect the horizontal pod autoscaler:

student@cc-lab:~/work$ kubectl get hpa
NAME    REFERENCE              TARGETS      MINPODS   MAXPODS   REPLICAS   AGE
hello   Deployment/hello-app   cpu: 0%/5%   1         10        1          2m30s

Note: the values set for the resource limits and average utilization are unrealistically low, but we did this to make it easy to generate enough load.

Generating load

Open another terminal and run a while loop that sends curl requests:

student@cc-lab:~$ while true; do curl http://172.18.0.2:30082/; sleep 0.01; done

In the first terminal, inspect the horizontal pod autoscaler:

student@cc-lab:~/work$ kubectl get hpa --watch
NAME    REFERENCE              TARGETS       MINPODS   MAXPODS   REPLICAS   AGE
hello   Deployment/hello-app   cpu: 0%/5%    1         10        1          20m
hello   Deployment/hello-app   cpu: 2%/5%    1         10        1          21m
hello   Deployment/hello-app   cpu: 18%/5%   1         10        1          21m
hello   Deployment/hello-app   cpu: 16%/5%   1         10        4          21m
hello   Deployment/hello-app   cpu: 5%/5%    1         10        4          22m
hello   Deployment/hello-app   cpu: 4%/5%    1         10        4          23m
[...]

Observe how additional replicas have been automatically created:

student@cc-lab:~/work$ kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
hello-app-f447d7765-72sp8   1/1     Running   0          2m10s
hello-app-f447d7765-bwllr   1/1     Running   0          6m3s
hello-app-f447d7765-jr8kx   1/1     Running   0          2m10s
hello-app-f447d7765-v7lnq   1/1     Running   0          2m10s

Stopping the load

Stop the while loop from the other terminal. Continue to inspect the horizontal pod autoscaler:

student@cc-lab:~/work$ kubectl get hpa --watch
NAME    REFERENCE              TARGETS      MINPODS   MAXPODS   REPLICAS   AGE
hello   Deployment/hello-app   cpu: 5%/5%   1         10        4          23m
hello   Deployment/hello-app   cpu: 3%/5%   1         10        4          24m
hello   Deployment/hello-app   cpu: 0%/5%   1         10        4          25m
hello   Deployment/hello-app   cpu: 0%/5%   1         10        4          27m
hello   Deployment/hello-app   cpu: 1%/5%   1         10        4          27m
hello   Deployment/hello-app   cpu: 0%/5%   1         10        4          28m
hello   Deployment/hello-app   cpu: 1%/5%   1         10        4          29m
hello   Deployment/hello-app   cpu: 0%/5%   1         10        4          29m
hello   Deployment/hello-app   cpu: 0%/5%   1         10        4          29m
hello   Deployment/hello-app   cpu: 1%/5%   1         10        3          29m
hello   Deployment/hello-app   cpu: 0%/5%   1         10        1          30m
hello   Deployment/hello-app   cpu: 1%/5%   1         10        1          30m
hello   Deployment/hello-app   cpu: 0%/5%   1         10        1          31m

Notice that after a few minutes, the instances have been scaled down to 1:

student@cc-lab:~/work$ kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
hello-app-f447d7765-bwllr   1/1     Running   0          14m

Exercise - fine tuning

Try to tune the:

  • resources parameters (resources.requests.cpu and resources.limits.cpu)
  • autoscaler parameter (averageUtilization)
  • the way you generate traffic

in order to reach the maximum number of 10 instances.
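
If a single request loop cannot push the average CPU utilization high enough, one option is to run several loops in parallel (a sketch; tune the number of loops and the sleep interval as needed):

$ # start 8 request loops in the background to generate more load
$ for i in $(seq 8); do while true; do curl -s http://172.18.0.2:30082/ > /dev/null; done & done
$ # when you are done, stop all background loops
$ kill $(jobs -p)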

Ingress

Even if we can expose Kubernetes apps using services, each service runs on a different port. If we want a single point of access to all apps in the Kubernetes cluster, we can use an Ingress.

An Ingress is a Kubernetes object that acts like an API gateway. Each service can be accessed using a different HTTP resource path.

Setting up an ingress on the Kind cluster

Kind doesn't have the full Ingress functionality by default, so we have to install some dependencies.

First, let's install the Ingress controller functionality:

student@cc-lab:~/work$ kubectl apply -f https://kind.sigs.k8s.io/examples/ingress/deploy-ingress-nginx.yaml
namespace/ingress-nginx created
serviceaccount/ingress-nginx created
serviceaccount/ingress-nginx-admission created
role.rbac.authorization.k8s.io/ingress-nginx created
role.rbac.authorization.k8s.io/ingress-nginx-admission created
clusterrole.rbac.authorization.k8s.io/ingress-nginx created
clusterrole.rbac.authorization.k8s.io/ingress-nginx-admission created
rolebinding.rbac.authorization.k8s.io/ingress-nginx created
rolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx created
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created
configmap/ingress-nginx-controller created
service/ingress-nginx-controller created
service/ingress-nginx-controller-admission created
deployment.apps/ingress-nginx-controller created
job.batch/ingress-nginx-admission-create created
job.batch/ingress-nginx-admission-patch created
ingressclass.networking.k8s.io/nginx created
validatingwebhookconfiguration.admissionregistration.k8s.io/ingress-nginx-admission created

Then, we must install the cloud provider add-on for Kind. Go to https://github.com/kubernetes-sigs/cloud-provider-kind/releases and download the archive for the Linux AMD64 architecture.

Extract it, and run the executable in a different terminal. Keep it running.

student@cc-lab-alexandru-carp:~$ wget https://github.com/kubernetes-sigs/cloud-provider-kind/releases/download/v0.6.0/cloud-provider-kind_0.6.0_linux_amd64.tar.gz
[...]

student@cc-lab-alexandru-carp:~$ tar -xvf cloud-provider-kind_0.6.0_linux_amd64.tar.gz
LICENSE
README.md
cloud-provider-kind

student@cc-lab-alexandru-carp:~$ ./cloud-provider-kind
[...]

Configuring another service

Let's configure another service, similar to the hello-app one, so that the ingress will route traffic to both services. This time, we will use the hello-app:2.0 image.

Create and apply the second deployment:

student@cc-lab:~/work$ cat hello-deployment-v2.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-app-v2
  labels:
    app: hello-v2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-v2
  template:
    metadata:
      labels:
        app: hello-v2
    spec:
      containers:
      - name: hello-app
        image: gitlab.cs.pub.ro:5050/scgc/cloud-courses/hello-app:2.0
        ports:
        - containerPort: 8080
        resources:
          limits:
            cpu: 200m
          requests:
            cpu: 100m

student@cc-lab:~/work$ kubectl apply -f hello-deployment-v2.yaml
deployment.apps/hello-app-v2 created

student@cc-lab:~/work$ kubectl get deployments
NAME           READY   UP-TO-DATE   AVAILABLE   AGE
hello-app      1/1     1            1           54m
hello-app-v2   1/1     1            1           12s

student@cc-lab:~/work$ kubectl get pods
NAME                            READY   STATUS    RESTARTS   AGE
hello-app-f447d7765-bwllr       1/1     Running   0          35m
hello-app-v2-5b9fbc5465-wr6nr   1/1     Running   0          5s

Create and apply the second service:

student@cc-lab:~/work$ cat hello-service-v2.yaml
apiVersion: v1
kind: Service
metadata:
  name: hello-app-v2
spec:
  type: NodePort
  selector:
    app: hello-v2
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 8080
    nodePort: 30083

student@cc-lab:~/work$ kubectl apply -f hello-service-v2.yaml
service/hello-app-v2 created

student@cc-lab:~/work$ kubectl get services
NAME           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
hello-app      NodePort    10.96.172.199   <none>        8080:30082/TCP   55m
hello-app-v2   NodePort    10.96.33.190    <none>        8080:30083/TCP   3s
kubernetes     ClusterIP   10.96.0.1       <none>        443/TCP          162m

Defining the ingress

We will define an ingress so that:

  • /v1 path will point to the service for hello-app:1.0
  • /v2 path will point to the service for hello-app:2.0

student@cc-lab:~/work$ cat hello-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: hello-ingress
spec:
  rules:
  - http:
      paths:
      - pathType: Prefix
        path: /v1
        backend:
          service:
            name: hello-app
            port:
              number: 8080
      - pathType: Prefix
        path: /v2
        backend:
          service:
            name: hello-app-v2
            port:
              number: 8080

student@cc-lab:~/work$ kubectl apply -f hello-ingress.yaml
ingress.networking.k8s.io/hello-ingress configured

student@cc-lab:~/work$ kubectl describe ingress hello-ingress
Name:             hello-ingress
Labels:           <none>
Namespace:        default
Address:
Ingress Class:    <none>
Default backend:  <default>
Rules:
  Host        Path  Backends
  ----        ----  --------
  *
              /v1   hello-app:8080 (10.244.0.50:8080)
              /v2   hello-app-v2:8080 (10.244.0.54:8080)
Annotations:  <none>
Events:       <none>

Testing traffic routing

To identify the IP address associated with the ingress, we must inspect the services in the ingress-nginx namespace:

student@cc-lab:~/work$ kubectl get --namespace ingress-nginx services
NAME                                 TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)                      AGE
ingress-nginx-controller             LoadBalancer   10.96.75.175   172.18.0.3    80:32191/TCP,443:32729/TCP   5m45s
ingress-nginx-controller-admission   ClusterIP      10.96.116.85   <none>        443/TCP

In our case, the IP address is 172.18.0.3.

Let's test traffic routing with curl:

student@cc-lab:~/work$ curl 172.18.0.3/v1
Hello, world!
Version: 1.0.0
Hostname: hello-app-f447d7765-bwllr

student@cc-lab:~/work$ curl 172.18.0.3/v2
Hello, world!
Version: 2.0.0
Hostname: hello-app-v2-5b9fbc5465-wr6nr
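
A request that matches neither rule falls through to the controller's default backend, which should return HTTP 404; this is a quick way to confirm that only /v1 and /v2 are routed:

$ # no rule matches /, so the default backend answers with 404 Not Found
$ curl 172.18.0.3/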