Kubernetes (Part 2)
Setup
We will be using a virtual machine in the faculty's cloud.
When creating a virtual machine in the Launch Instance window:
- Name your VM using the following convention: cc_lab<no>_<username>, where <no> is the lab number and <username> is your institutional account.
- Select Boot from image in the Instance Boot Source section.
- Select CC 2024-2025 in the Image Name section.
- Select the m1.xlarge flavor.
In the virtual machine:
- Download the laboratory archive from here into the work directory. Use wget https://repository.grid.pub.ro/cs/cc/laboratoare/lab-kubernetes-part-2.zip to download the archive.
- Extract the archive.
- Run the setup script: bash lab-kubernetes-part-2.sh
$ # create the working dir
$ mkdir ~/work
$ # change the working dir
$ cd ~/work
$ # download the archive
$ wget https://repository.grid.pub.ro/cs/cc/laboratoare/lab-kubernetes-part-2.zip
$ unzip lab-kubernetes-part-2.zip
$ # run setup script; it may take a while
$ bash lab-kubernetes-part-2.sh
Create a local Kubernetes cluster using kind create cluster:
student@lab-kubernetes:~$ kind create cluster
Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.23.4) 🖼
 ✓ Preparing nodes 📦
 ✓ Writing configuration 📜
 ✓ Starting control-plane 🕹️
 ✓ Installing CNI 🔌
 ✓ Installing StorageClass 💾
Set kubectl context to "kind-kind"
You can now use your cluster with:
kubectl cluster-info --context kind-kind
Thanks for using kind! 😊
Liveness probes
Software applications, no matter how well written and tested, are always prone to errors, crashes, deadlocks, etc. Sometimes, the only way to restore functionality is to restart the application.
When running in production, it is very important that application errors are detected as soon as they occur and then automatically mitigated.
In Kubernetes, we have the concept of liveness probes, which help us by continuously monitoring a container and taking an action if a failure occurs.
Setup: a crashy app
To illustrate the concept, we will use an app that was specially built for this lab. The app is a simple HTTP server written in Python that runs normally for a specified number of seconds and starts returning errors after that.
If you are curious, you can find the source code in ~/work/crashy-app/server.py.
The time after which the app starts to error out is defined by the CRASH_AFTER environment variable.
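If you want to experiment with the app outside the cluster, you can also run the image directly with Docker. This is a quick sketch; it assumes the server listens on port 80 inside the container, as suggested by the deployment manifest below:
$ # run the app locally, making it crash after 10 seconds
$ docker run --rm -e CRASH_AFTER=10 -p 8080:80 crashy-app:1.0.0
$ # in another terminal
$ curl http://localhost:8080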
The docker image for this app should already exist:
student@cc-lab:~/work$ docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
crashy-app 1.0.0 f0a327e2fc35 56 minutes ago 67MB
[...]
Let's load this image into the Kind cluster:
student@cc-lab:~/work$ kind load docker-image crashy-app:1.0.0
Image: "crashy-app:1.0.0" with ID "sha256:f0a327e2fc354173521a6425d679e3adaa95de11ca3b8e5306e8b58655f310e4" not yet present on node "kind-control-plane", loading...
We will create a deployment for this app and apply it. Notice that the CRASH_AFTER environment variable will be set to 60 seconds.
student@cc-lab:~/work$ cat crashy-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: crashy-app
labels:
app: crashy
spec:
replicas: 1
selector:
matchLabels:
app: crashy
template:
metadata:
labels:
app: crashy
spec:
containers:
- name: crashy-app
image: crashy-app:1.0.0
ports:
- containerPort: 80
env:
- name: CRASH_AFTER
value: "60"
student@cc-lab:~/work$ kubectl apply -f crashy-deployment.yaml
deployment.apps/crashy-app created
student@cc-lab:~/work$ kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
crashy-app 1/1 1 1 8s
student@cc-lab:~/work$ kubectl get pods
NAME READY STATUS RESTARTS AGE
crashy-app-5bc4d6474b-lgnk9 1/1 Running 0 11s
Let's expose the app via a service:
student@cc-lab:~/work$ cat crashy-service.yaml
apiVersion: v1
kind: Service
metadata:
name: crashy-app
spec:
type: NodePort
selector:
app: crashy
ports:
- protocol: TCP
port: 80
targetPort: 80
nodePort: 30080
student@cc-lab:~/work$ kubectl apply -f crashy-service.yaml
service/crashy-app created
student@cc-lab:~/work$ kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
crashy-app NodePort 10.96.67.208 <none> 80:30080/TCP 6s
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 24m
Notice that at the beginning, the app works normally:
student@cc-lab:~/work$ curl http://172.18.0.2:30080
Hi, my name is crashy-app-5bc4d6474b-lgnk9 and I'm a crashy app...
But I didn't crash... yet :D
After 60 seconds, it starts to return errors:
student@cc-lab:~/work$ curl http://172.18.0.2:30080
Hi, my name is crashy-app-5bc4d6474b-lgnk9 and I'm a crashy app...
I crashed 2.85 seconds ago, sorry about that :(
If you use curl -v, you will see that the server returns an HTTP 500 status code.
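Alternatively, curl can print just the status code using a -w format string, for example:
$ curl -s -o /dev/null -w '%{http_code}\n' http://172.18.0.2:30080
500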
If we list the pods, the pod still shows up as running, so Kubernetes has no way of knowing that the app is unavailable.
The only way to recover is to delete the pod, which will force the deployment to create a new one. We can do this manually:
student@cc-lab:~/work$ kubectl delete pod/crashy-app-5bc4d6474b-lgnk9
pod "crashy-app-5bc4d6474b-lgnk9" deleted
student@cc-lab:~/work$ kubectl get pods
NAME READY STATUS RESTARTS AGE
crashy-app-5bc4d6474b-2svb4 1/1 Running 0 9s
But we will have to keep doing this again and again, which is not convenient.
Defining a liveness probe
A liveness probe helps us by periodically polling for a condition. When the condition fails, the container is automatically restarted.
We will be using a httpGet probe, which queries an HTTP endpoint of the app. Most cloud-native apps have a separate endpoint for health monitoring, which is more lightweight (it doesn't perform the full processing, but only returns the status of the service).
Our crashy app responds on the /health endpoint, which can also be queried manually:
student@cc-lab:~/work$ curl http://172.18.0.2:30080/health
200 OK
[...]
student@cc-lab:~/work$ curl http://172.18.0.2:30080/health
500 Internal Server Error
Let's edit the deployment manifest by defining a liveness probe:
student@cc-lab:~/work$ cat crashy-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: crashy-app
labels:
app: crashy
spec:
replicas: 1
selector:
matchLabels:
app: crashy
template:
metadata:
labels:
app: crashy
spec:
containers:
- name: crashy-app
image: crashy-app:1.0.0
ports:
- containerPort: 80
env:
- name: CRASH_AFTER
value: "60"
livenessProbe:
httpGet:
path: /health
port: 80
periodSeconds: 1
failureThreshold: 3
terminationGracePeriodSeconds: 1
The parameters have the following meaning:
- httpGet.path - the path of the HTTP endpoint to probe
- httpGet.port - the port of the HTTP endpoint to probe
- periodSeconds - how many seconds to wait between two probes
- failureThreshold - after how many failed probes the container is considered dead
- terminationGracePeriodSeconds - how many seconds to wait before sending the KILL signal to a failed container
Apply the modified manifest:
student@cc-lab:~/work$ kubectl apply -f crashy-deployment.yaml
deployment.apps/crashy-app configured
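If you want to double-check the probe that was actually stored in the deployment, one option is to query it with jsonpath:
$ kubectl get deployment crashy-app -o jsonpath='{.spec.template.spec.containers[0].livenessProbe}'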
Visualize the events for the pod and observe that the container is periodically restarted after three consecutive failed probes:
student@cc-lab:~/work$ kubectl events --for pod/crashy-app-5799b6fd57-sd56v --watch
LAST SEEN TYPE REASON OBJECT MESSAGE
23s Normal Scheduled Pod/crashy-app-5799b6fd57-sd56v Successfully assigned default/crashy-app-5799b6fd57-sd56v to kind-control-plane
22s Normal Pulled Pod/crashy-app-5799b6fd57-sd56v Container image "crashy-app:1.0.0" already present on machine
22s Normal Created Pod/crashy-app-5799b6fd57-sd56v Created container: crashy-app
22s Normal Started Pod/crashy-app-5799b6fd57-sd56v Started container crashy-app
0s Warning Unhealthy Pod/crashy-app-5799b6fd57-sd56v Liveness probe failed: HTTP probe failed with statuscode: 500
0s (x2 over 1s) Warning Unhealthy Pod/crashy-app-5799b6fd57-sd56v Liveness probe failed: HTTP probe failed with statuscode: 500
0s (x3 over 2s) Warning Unhealthy Pod/crashy-app-5799b6fd57-sd56v Liveness probe failed: HTTP probe failed with statuscode: 500
0s Normal Killing Pod/crashy-app-5799b6fd57-sd56v Container crashy-app failed liveness probe, will be restarted
0s (x2 over 65s) Normal Pulled Pod/crashy-app-5799b6fd57-sd56v Container image "crashy-app:1.0.0" already present on machine
0s (x2 over 65s) Normal Created Pod/crashy-app-5799b6fd57-sd56v Created container: crashy-app
0s (x2 over 65s) Normal Started Pod/crashy-app-5799b6fd57-sd56v Started container crashy-app
0s (x4 over 65s) Warning Unhealthy Pod/crashy-app-5799b6fd57-sd56v Liveness probe failed: HTTP probe failed with statuscode: 500
0s (x5 over 66s) Warning Unhealthy Pod/crashy-app-5799b6fd57-sd56v Liveness probe failed: HTTP probe failed with statuscode: 500
0s (x6 over 67s) Warning Unhealthy Pod/crashy-app-5799b6fd57-sd56v Liveness probe failed: HTTP probe failed with statuscode: 500
0s (x2 over 65s) Normal Killing Pod/crashy-app-5799b6fd57-sd56v Container crashy-app failed liveness probe, will be restarted
0s (x3 over 2m10s) Normal Pulled Pod/crashy-app-5799b6fd57-sd56v Container image "crashy-app:1.0.0" already present on machine
0s (x3 over 2m10s) Normal Created Pod/crashy-app-5799b6fd57-sd56v Created container: crashy-app
0s (x3 over 2m10s) Normal Started Pod/crashy-app-5799b6fd57-sd56v Started container crashy-app
[...]
^C
The number of restarts can also be seen in the pod list:
student@cc-lab:~/work$ kubectl get pods
NAME READY STATUS RESTARTS AGE
crashy-app-5799b6fd57-sd56v 1/1 Running 3 (3s ago) 3m19s
Verify using curl that the app automatically recovers after a failure.
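One way to do this, assuming the same service IP and port as above, is a simple polling loop; you should see error responses for a few seconds, followed by normal responses once the container has been restarted:
$ # poll the app every second and watch it crash, then recover
$ while true; do curl -s http://172.18.0.2:30080; sleep 1; done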
Readiness probes
Production apps are often complex and are not ready to process traffic as soon as they are started. Usually, they need some time to initialize (seconds or even minutes). During this initialization time, traffic should not be routed to the respective instances, because it would not be processed anyway, and the users would see errors.
In Kubernetes, we have the concept of readiness probes, which monitor a container and ensure that it only receives traffic once it is ready.
Setup: a lazy app
To illustrate the concept, we will use an app that was specially built for this lab. The app is a simple HTTP server written in Python that takes a specified number of seconds to initialize and runs normally after that.
If you are curious, you can find the source code in ~/work/lazy-app/server.py.
The initialization time in seconds is a random number between zero and READY_AFTER_MAX.
The docker image for this app should already exist:
student@cc-lab:~/work$ docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
lazy-app 1.0.0 f7eac9e4eda7 42 minutes ago 67MB
[...]
Let's load this image into the Kind cluster:
student@cc-lab:~/work$ kind load docker-image lazy-app:1.0.0
Image: "lazy-app:1.0.0" with ID "sha256:f7eac9e4eda7cc3b492cdfe6aff791cfd763567fb0502d5c8bb96cbc0cf032ed" not yet present on node "kind-control-plane", loading...
We will create a deployment for this app and apply it. Notice that the READY_AFTER_MAX environment variable will be set to 300 seconds.
The deployment will have 5 replicas, which means that there will be 5 pods that can serve requests.
student@cc-lab:~/work$ cat lazy-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: lazy-app
labels:
app: lazy
spec:
replicas: 5
selector:
matchLabels:
app: lazy
template:
metadata:
labels:
app: lazy
spec:
containers:
- name: lazy-app
image: lazy-app:1.0.0
ports:
- containerPort: 80
env:
- name: READY_AFTER_MAX
value: "300"
student@cc-lab:~/work$ kubectl apply -f lazy-deployment.yaml
deployment.apps/lazy-app created
student@cc-lab:~/work$ kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
lazy-app 5/5 5 5 8s
student@cc-lab:~/work$ kubectl get pods
NAME READY STATUS RESTARTS AGE
lazy-app-674fb54b7d-9bckf 1/1 Running 0 4s
lazy-app-674fb54b7d-fsstv 1/1 Running 0 4s
lazy-app-674fb54b7d-hbsgg 1/1 Running 0 4s
lazy-app-674fb54b7d-tjddz 1/1 Running 0 4s
lazy-app-674fb54b7d-wxx7p 1/1 Running 0 4s
Let's expose the app via a service:
student@cc-lab:~/work$ cat lazy-service.yaml
apiVersion: v1
kind: Service
metadata:
name: lazy-app
spec:
type: NodePort
selector:
app: lazy
ports:
- protocol: TCP
port: 80
targetPort: 80
nodePort: 30081
student@cc-lab:~/work$ kubectl apply -f lazy-service.yaml
service/lazy-app created
student@cc-lab:~/work$ kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 52m
lazy-app NodePort 10.96.180.27 <none> 80:30081/TCP 38m
We can see that all 5 instances are shown as "ready", but if we try to connect using curl, we don't always get successful responses:
student@cc-lab:~/work$ curl http://172.18.0.2:30081
Hi, my name is lazy-app-7c44789765-cdsvk and I'm a lazy app...
Getting ready... 24.81 more seconds please :D
student@cc-lab:~/work$ curl http://172.18.0.2:30081
Hi, my name is lazy-app-7c44789765-rrpcj and I'm a lazy app...
Getting ready... 119.54 more seconds please :D
student@cc-lab:~/work$ curl http://172.18.0.2:30081
Hi, my name is lazy-app-7c44789765-cdsvk and I'm a lazy app...
Getting ready... 17.34 more seconds please :D
student@cc-lab:~/work$ curl http://172.18.0.2:30081
Hi, my name is lazy-app-7c44789765-sn2sh and I'm a lazy app...
Getting ready... 184.09 more seconds please :D
student@cc-lab:~/work$ curl http://172.18.0.2:30081
Hi, my name is lazy-app-7c44789765-rrpcj and I'm a lazy app...
Getting ready... 110.19 more seconds please :D
student@cc-lab:~/work$ curl http://172.18.0.2:30081
Hi, my name is lazy-app-7c44789765-nvfkv and I'm a lazy app...
But I'm finally ready! :)
student@cc-lab:~/work$ curl http://172.18.0.2:30081
Hi, my name is lazy-app-7c44789765-sn2sh and I'm a lazy app...
Getting ready... 178.67 more seconds please :D
student@cc-lab:~/work$ curl http://172.18.0.2:30081
Hi, my name is lazy-app-7c44789765-nvfkv and I'm a lazy app...
But I'm finally ready! :)
Depending on the pod where the request is routed, we will see a successful or a failed response. Ideally, the service would only route requests to pods that are ready.
Defining a readiness probe
A readiness probe helps us by periodically polling for a condition. When the condition is successful, the container is automatically marked as ready.
We will be using a httpGet probe, which queries an HTTP endpoint of the app. Most cloud-native apps have a separate endpoint for health monitoring, which is more lightweight (it doesn't perform the full processing, but only returns the status of the service).
Our lazy app responds on the /health endpoint, which can also be queried manually:
student@cc-lab:~/work$ curl http://172.18.0.2:30081/health
500 Internal Server Error
[...]
student@cc-lab:~/work$ curl http://172.18.0.2:30081/health
200 OK
First, let's delete the current deployment:
student@cc-lab:~/work$ kubectl delete deployments lazy-app
deployment.apps "lazy-app" deleted
Then, let's create a new deployment that defines a readiness probe:
student@cc-lab:~/work$ cat lazy-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: lazy-app
labels:
app: lazy
spec:
replicas: 5
selector:
matchLabels:
app: lazy
template:
metadata:
labels:
app: lazy
spec:
containers:
- name: lazy-app
image: lazy-app:1.0.0
ports:
- containerPort: 80
env:
- name: READY_AFTER_MAX
value: "300"
readinessProbe:
httpGet:
path: /health
port: 80
periodSeconds: 1
successThreshold: 2
The parameters have the following meaning:
- httpGet.path - the path of the HTTP endpoint to probe
- httpGet.port - the port of the HTTP endpoint to probe
- periodSeconds - how many seconds to wait between two probes
- successThreshold - after how many successful probes the container is considered ready
Apply the new manifest and observe that initially no pod is ready:
student@cc-lab:~/work$ kubectl apply -f lazy-deployment.yaml
deployment.apps/lazy-app created
student@cc-lab:~/work$ kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
lazy-app 0/5 5 0 1s
student@cc-lab:~/work$ kubectl get pods
NAME READY STATUS RESTARTS AGE
lazy-app-6d55bd7894-jnkm6 0/1 Running 0 2s
lazy-app-6d55bd7894-qt5mm 0/1 Running 0 2s
lazy-app-6d55bd7894-wsncf 0/1 Running 0 2s
lazy-app-6d55bd7894-zdhtv 0/1 Running 0 1s
lazy-app-6d55bd7894-zkxgm 0/1 Running 0 2s
Verify with curl that requests are only routed to pods that are ready:
student@cc-lab:~/work$ curl http://172.18.0.2:30081
Hi, my name is lazy-app-7c44789765-nvfkv and I'm a lazy app...
But I'm finally ready! :)
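You can also check this at the service level: a service only load-balances to pods that pass their readiness probe, so listing the service's endpoints should show only the IP addresses of the ready pods, with more appearing as pods become ready:
$ kubectl get endpoints lazy-app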
Eventually, all pods gradually become ready. You can observe this by listing the pods:
student@cc-lab:~/work$ kubectl get pods
NAME READY STATUS RESTARTS AGE
lazy-app-6d55bd7894-jnkm6 1/1 Running 0 41s
lazy-app-6d55bd7894-qt5mm 0/1 Running 0 41s
lazy-app-6d55bd7894-wsncf 0/1 Running 0 41s
lazy-app-6d55bd7894-zdhtv 1/1 Running 0 40s
lazy-app-6d55bd7894-zkxgm 0/1 Running 0 41s
[...]
student@cc-lab:~/work$ kubectl get pods
NAME READY STATUS RESTARTS AGE
lazy-app-6d55bd7894-jnkm6 1/1 Running 0 5m56s
lazy-app-6d55bd7894-qt5mm 1/1 Running 0 5m56s
lazy-app-6d55bd7894-wsncf 1/1 Running 0 5m56s
lazy-app-6d55bd7894-zdhtv 1/1 Running 0 5m55s
lazy-app-6d55bd7894-zkxgm 1/1 Running 0 5m56s
Or inspecting the deployment:
student@cc-lab:~/work$ kubectl get deployments --watch
NAME READY UP-TO-DATE AVAILABLE AGE
lazy-app 0/5 5 0 2s
lazy-app 1/5 5 1 2m30s
lazy-app 2/5 5 2 2m36s
lazy-app 3/5 5 3 2m51s
lazy-app 4/5 5 4 3m38s
lazy-app 5/5 5 5 4m37s
^C
Scaling an app
In production, the amount of traffic for an app is rarely constant. If the traffic to our app increases, we may need to scale the app (create more pods, identical to the ones that already exist).
Let's start with the hello-app with only one replica.
Create and apply the deployment:
student@lab-kubernetes:~$ cat hello-app-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: hello-app
labels:
app: hello
spec:
replicas: 1
selector:
matchLabels:
app: hello
template:
metadata:
labels:
app: hello
spec:
containers:
- name: hello-app
image: gitlab.cs.pub.ro:5050/scgc/cloud-courses/hello-app:1.0
ports:
- containerPort: 8080
student@lab-kubernetes:~$ kubectl apply -f hello-app-deployment.yaml
deployment.apps/hello-app created
student@lab-kubernetes:~$ kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
hello-app 1/1 1 1 13s
student@lab-kubernetes:~$ kubectl get pods
NAME READY STATUS RESTARTS AGE
hello-app-599bb4bf7f-l45k4 1/1 Running 0 17s
Then create and apply the service that exposes the app:
student@lab-kubernetes:~$ cat hello-app-service.yaml
apiVersion: v1
kind: Service
metadata:
name: hello-app
spec:
type: NodePort
selector:
app: hello
ports:
- protocol: TCP
port: 8080
targetPort: 8080
nodePort: 30082
student@lab-kubernetes:~$ kubectl apply -f hello-app-service.yaml
service/hello-app created
student@lab-kubernetes:~$ kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
hello-app NodePort 10.96.186.102 <none> 8080:30082/TCP 7m42s
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 20h
Now, let's scale hello-app to 10 pods. For this, change the value of replicas in hello-app-deployment.yaml to 10, and reapply the manifest:
student@lab-kubernetes:~$ kubectl apply -f hello-app-deployment.yaml
deployment.apps/hello-app configured
student@lab-kubernetes:~$ kubectl get pods
NAME READY STATUS RESTARTS AGE
hello-app-599bb4bf7f-25w8g 1/1 Running 0 6s
hello-app-599bb4bf7f-7xzgr 0/1 ContainerCreating 0 5s
hello-app-599bb4bf7f-gr9xb 1/1 Running 0 6s
hello-app-599bb4bf7f-l45k4 1/1 Running 0 44m
hello-app-599bb4bf7f-mbgx7 0/1 ContainerCreating 0 6s
hello-app-599bb4bf7f-ps2dj 1/1 Running 0 6s
hello-app-599bb4bf7f-r6xqv 1/1 Running 0 6s
hello-app-599bb4bf7f-rrnws 0/1 ContainerCreating 0 5s
hello-app-599bb4bf7f-tnqtz 1/1 Running 0 6s
hello-app-599bb4bf7f-wh7qx 0/1 ContainerCreating 0 6s
After a while, you'll see that all 10 pods are running. Also, the deployment shows 10 available pods:
student@lab-kubernetes:~$ kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
hello-app 10/10 10 10 45m
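As a side note, the same scaling can be done imperatively, without editing the manifest (although the manifest then no longer reflects the live state, and a later kubectl apply would reset the replica count):
$ kubectl scale deployment hello-app --replicas=10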
Replica sets
What actually happened is that a Kubernetes replica set with a scale of 10, associated with the deployment, was created:
student@lab-kubernetes:~$ kubectl get replicasets
NAME DESIRED CURRENT READY AGE
hello-app-599bb4bf7f 10 10 10 1m
Testing the scaled app
Connect to the service multiple times using curl. You will notice that each time, a different pod responds:
student@lab-kubernetes:~$ curl http://172.18.0.2:30082
Hello, world!
Version: 1.0.0
Hostname: hello-app-599bb4bf7f-r6xqv
student@lab-kubernetes:~$ curl http://172.18.0.2:30082
Hello, world!
Version: 1.0.0
Hostname: hello-app-599bb4bf7f-gr9xb
student@lab-kubernetes:~$ curl http://172.18.0.2:30082
Hello, world!
Version: 1.0.0
Hostname: hello-app-599bb4bf7f-rrnws
student@lab-kubernetes:~$ curl http://172.18.0.2:30082
Hello, world!
Version: 1.0.0
Hostname: hello-app-599bb4bf7f-7xzgr
student@lab-kubernetes:~$ curl http://172.18.0.2:30082
Hello, world!
Version: 1.0.0
Hostname: hello-app-599bb4bf7f-ps2dj
Autoscaling
In production, it is infeasible to manually scale an application up and down. Instead, we need a solution that does this automatically, as resource demands change.
In Kubernetes, we have the concept of horizontal pod autoscaler, which adds or removes pods from a replica set based on resource usage.
Defining resource constraints
First, let's delete the current hello-app deployment, if any:
student@cc-lab:~/work$ kubectl delete deployments hello-app
deployment.apps "hello-app" deleted
Then, let's create and apply a new deployment that defines resource constraints:
student@cc-lab:~/work$ cat hello-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: hello-app
labels:
app: hello
spec:
replicas: 1
selector:
matchLabels:
app: hello
template:
metadata:
labels:
app: hello
spec:
containers:
- name: hello-app
image: gitlab.cs.pub.ro:5050/scgc/cloud-courses/hello-app:1.0
ports:
- containerPort: 8080
resources:
limits:
cpu: 200m
requests:
cpu: 100m
student@cc-lab$ kubectl apply -f hello-deployment.yaml
deployment.apps/hello-app created
The parameters have the following meaning:
- resources.requests.cpu - the minimum resources requested by the container (0.1 CPU cores in this case)
- resources.limits.cpu - the maximum resources the container is allowed to use (0.2 CPU cores in this case)
Installing the metrics server
In order for Kubernetes to measure resource utilization, we must install the metrics server, which is not installed by default in Kind.
We will use Helm, which is a package manager for Kubernetes.
Helm will be covered in detail in another lab. For now, run the following commands to install the metrics server:
student@cc-lab:~/work$ helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
student@cc-lab:~/work$ helm repo update
student@cc-lab:~/work$ helm upgrade --install --set args={--kubelet-insecure-tls} metrics-server metrics-server/metrics-server --namespace kube-system
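After a minute or so, you can check that the metrics server is working; if it is, the following commands report CPU and memory usage instead of an error:
$ kubectl top nodes
$ kubectl top pods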
Defining the autoscaling policy
Now, let's define and apply the horizontal pod autoscaler:
student@cc-lab:~/work$ cat hello-autoscaler.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: hello
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: hello-app
minReplicas: 1
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 5
student@cc-lab:~/work$ kubectl apply -f hello-autoscaler.yaml
horizontalpodautoscaler.autoscaling/hello created
The parameters have the following meaning:
- minReplicas - the minimum number of replicas to scale down to
- maxReplicas - the maximum number of replicas to scale up to
- averageUtilization - when to scale; in this case, when the average CPU load across pods is greater than 5%
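For reference, a similar autoscaler can also be created imperatively (note that it would be named hello-app after the deployment, rather than hello):
$ kubectl autoscale deployment hello-app --cpu-percent=5 --min=1 --max=10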
Also, inspect the horizontal pod autoscaler:
student@cc-lab:~/work$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hello Deployment/hello-app cpu: 0%/5% 1 10 1 2m30s
The values set for the resource limits and average utilization are unrealistically low, but we chose them so that load is easy to generate.
Generating load
Open another terminal and run a while loop that sends curl requests:
student@cc-lab:~$ while true; do curl http://172.18.0.2:30082/; sleep 0.01; done
In the first terminal, inspect the horizontal pod autoscaler:
student@cc-lab:~/work$ kubectl get hpa --watch
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hello Deployment/hello-app cpu: 0%/5% 1 10 1 20m
hello Deployment/hello-app cpu: 2%/5% 1 10 1 21m
hello Deployment/hello-app cpu: 18%/5% 1 10 1 21m
hello Deployment/hello-app cpu: 16%/5% 1 10 4 21m
hello Deployment/hello-app cpu: 5%/5% 1 10 4 22m
hello Deployment/hello-app cpu: 4%/5% 1 10 4 23m
[...]
Observe how additional replicas have been automatically created:
student@cc-lab:~/work$ kubectl get pods
NAME READY STATUS RESTARTS AGE
hello-app-f447d7765-72sp8 1/1 Running 0 2m10s
hello-app-f447d7765-bwllr 1/1 Running 0 6m3s
hello-app-f447d7765-jr8kx 1/1 Running 0 2m10s
hello-app-f447d7765-v7lnq 1/1 Running 0 2m10s
Stopping the load
Stop the while loop from the other terminal. Continue to inspect the horizontal pod autoscaler:
student@cc-lab:~/work$ kubectl get hpa --watch
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hello Deployment/hello-app cpu: 5%/5% 1 10 4 23m
hello Deployment/hello-app cpu: 3%/5% 1 10 4 24m
hello Deployment/hello-app cpu: 0%/5% 1 10 4 25m
hello Deployment/hello-app cpu: 0%/5% 1 10 4 27m
hello Deployment/hello-app cpu: 1%/5% 1 10 4 27m
hello Deployment/hello-app cpu: 0%/5% 1 10 4 28m
hello Deployment/hello-app cpu: 1%/5% 1 10 4 29m
hello Deployment/hello-app cpu: 0%/5% 1 10 4 29m
hello Deployment/hello-app cpu: 0%/5% 1 10 4 29m
hello Deployment/hello-app cpu: 1%/5% 1 10 3 29m
hello Deployment/hello-app cpu: 0%/5% 1 10 1 30m
hello Deployment/hello-app cpu: 1%/5% 1 10 1 30m
hello Deployment/hello-app cpu: 0%/5% 1 10 1 31m
Notice that after a few minutes, the instances have been scaled down to 1:
student@cc-lab:~/work$ kubectl get pods
NAME READY STATUS RESTARTS AGE
hello-app-f447d7765-bwllr 1/1 Running 0 14m
Exercise - fine tuning
Try to tune:
- the resources parameters (resources.requests.cpu and resources.limits.cpu)
- the autoscaler parameter (averageUtilization)
- the way you generate traffic (see the sketch below for one option)
in order to reach the maximum number of 10 instances.
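For instance, one possible way to generate heavier load is to run several request loops in parallel:
$ # run 4 parallel request loops against the service
$ for i in $(seq 4); do while true; do curl -s -o /dev/null http://172.18.0.2:30082/; done & done
$ # later, stop all background loops
$ kill $(jobs -p)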
Ingress
Even though we can expose Kubernetes apps using services, each service runs on a different port. If we want a single point of access to all apps in the Kubernetes cluster, we can use an Ingress.
An Ingress is a Kubernetes object that acts like an API gateway. Each service can be accessed using a different HTTP resource path.
Setting up an ingress on the Kind cluster
Kind doesn't have the full Ingress functionality by default, so we have to install some dependencies.
First, let's install the Ingress controller functionality:
student@cc-lab:~/work$ kubectl apply -f https://kind.sigs.k8s.io/examples/ingress/deploy-ingress-nginx.yaml
namespace/ingress-nginx created
serviceaccount/ingress-nginx created
serviceaccount/ingress-nginx-admission created
role.rbac.authorization.k8s.io/ingress-nginx created
role.rbac.authorization.k8s.io/ingress-nginx-admission created
clusterrole.rbac.authorization.k8s.io/ingress-nginx created
clusterrole.rbac.authorization.k8s.io/ingress-nginx-admission created
rolebinding.rbac.authorization.k8s.io/ingress-nginx created
rolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx created
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created
configmap/ingress-nginx-controller created
service/ingress-nginx-controller created
service/ingress-nginx-controller-admission created
deployment.apps/ingress-nginx-controller created
job.batch/ingress-nginx-admission-create created
job.batch/ingress-nginx-admission-patch created
ingressclass.networking.k8s.io/nginx created
validatingwebhookconfiguration.admissionregistration.k8s.io/ingress-nginx-admission created
Then, we must install the cloud provider add-on for Kind. Go to https://github.com/kubernetes-sigs/cloud-provider-kind/releases and download the archive for the Linux AMD64 architecture.
Extract it, and run the executable in a different terminal. Keep it running.
student@cc-lab-alexandru-carp:~$ wget https://github.com/kubernetes-sigs/cloud-provider-kind/releases/download/v0.6.0/cloud-provider-kind_0.6.0_linux_amd64.tar.gz
[...]
student@cc-lab-alexandru-carp:~$ tar -xvf cloud-provider-kind_0.6.0_linux_amd64.tar.gz
LICENSE
README.md
cloud-provider-kind
student@cc-lab-alexandru-carp:~$ ./cloud-provider-kind
[...]
Configuring another service
Let's configure another service, similar to the hello-app one, so that the ingress will route traffic to both services. This time, we will use the hello-app:2.0 image.
Create and apply the second deployment:
student@cc-lab:~/work$ cat hello-deployment-v2.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: hello-app-v2
labels:
app: hello-v2
spec:
replicas: 1
selector:
matchLabels:
app: hello-v2
template:
metadata:
labels:
app: hello-v2
spec:
containers:
- name: hello-app
image: gitlab.cs.pub.ro:5050/scgc/cloud-courses/hello-app:2.0
ports:
- containerPort: 8080
resources:
limits:
cpu: 200m
requests:
cpu: 100m
student@cc-lab:~/work$ kubectl apply -f hello-deployment-v2.yaml
deployment.apps/hello-app-v2 created
student@cc-lab:~/work$ kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
hello-app 1/1 1 1 54m
hello-app-v2 1/1 1 1 12s
student@cc-lab:~/work$ kubectl get pods
NAME READY STATUS RESTARTS AGE
hello-app-f447d7765-bwllr 1/1 Running 0 35m
hello-app-v2-5b9fbc5465-wr6nr 1/1 Running 0 5s
Create and apply the second service:
student@cc-lab:~/work$ cat hello-service-v2.yaml
apiVersion: v1
kind: Service
metadata:
name: hello-app-v2
spec:
type: NodePort
selector:
app: hello-v2
ports:
- protocol: TCP
port: 8080
targetPort: 8080
nodePort: 30083
student@cc-lab:~/work$ kubectl apply -f hello-service-v2.yaml
service/hello-app-v2 created
student@cc-lab:~/work$ kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
hello-app NodePort 10.96.172.199 <none> 8080:30082/TCP 55m
hello-app-v2 NodePort 10.96.33.190 <none> 8080:30083/TCP 3s
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 162m
Defining the ingress
We will define an ingress so that:
- the /v1 path will point to the service for hello-app:1.0
- the /v2 path will point to the service for hello-app:2.0
student@cc-lab:~/work$ cat hello-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: hello-ingress
spec:
rules:
- http:
paths:
- pathType: Prefix
path: /v1
backend:
service:
name: hello-app
port:
number: 8080
- pathType: Prefix
path: /v2
backend:
service:
name: hello-app-v2
port:
number: 8080
student@cc-lab:~/work$ kubectl apply -f hello-ingress.yaml
ingress.networking.k8s.io/hello-ingress configured
student@cc-lab:~/work$ kubectl describe ingress hello-ingress
Name: hello-ingress
Labels: <none>
Namespace: default
Address:
Ingress Class: <none>
Default backend: <default>
Rules:
Host Path Backends
---- ---- --------
*
/v1 hello-app:8080 (10.244.0.50:8080)
/v2 hello-app-v2:8080 (10.244.0.54:8080)
Annotations: <none>
Events: <none>
Testing traffic routing
To identify the IP address associated with the ingress, we must inspect the services in the ingress-nginx namespace:
student@cc-lab:~/work$ kubectl get --namespace ingress-nginx services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ingress-nginx-controller LoadBalancer 10.96.75.175 172.18.0.3 80:32191/TCP,443:32729/TCP 5m45s
ingress-nginx-controller-admission ClusterIP 10.96.116.85 <none> 443/TCP
In our case, the IP address is 172.18.0.3.
Let's test traffic routing with curl:
student@cc-lab:~/work$ curl 172.18.0.3/v1
Hello, world!
Version: 1.0.0
Hostname: hello-app-f447d7765-bwllr
student@cc-lab:~/work$ curl 172.18.0.3/v2
Hello, world!
Version: 2.0.0
Hostname: hello-app-v2-5b9fbc5465-wr6nr