MinIO (S3 on Kubernetes)

Setup

We will be using a virtual machine in the faculty's cloud.

When creating a virtual machine in the Launch Instance window:

  • Name your VM using the following convention: cc_lab<no>_<username>, where <no> is the lab number and <username> is your institutional account.
  • Select Boot from image in the Instance Boot Source section.
  • Select CC 2024-2025 in the Image Name section.
  • Select the m1.xlarge flavor.

In the base virtual machine:

  • Download the laboratory archive using wget https://repository.grid.pub.ro/cs/cc/laboratoare/install-kind.zip.
  • Extract the archive.
  • Run the setup script bash install-kind.sh.
$ # download the archive
$ wget https://repository.grid.pub.ro/cs/cc/laboratoare/install-kind.zip
$ unzip install-kind.zip
$ bash install-kind.sh

Before we start

What is MinIO?

MinIO is an open-source, high-performance, S3-compatible object storage system. It allows users to store unstructured data like photos, videos, log files, backups, and container images.

Key features of MinIO:

  • Lightweight and scalable: Can be deployed quickly and scales horizontally.
  • S3 API compatibility: Works seamlessly with applications written for AWS S3.
  • High Performance: Designed for high-throughput workloads.
  • Built for Kubernetes: Native support for Kubernetes deployments.

We will explain these concepts more deeply in the following chapters.
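
To make the S3-compatibility claim concrete: code written for AWS S3 with boto3 can talk to MinIO by changing only the endpoint. A minimal sketch (the endpoint and credentials are placeholders for the setup we build later in this lab):

import boto3

# The only MinIO-specific part is endpoint_url; drop it to target AWS S3 instead.
s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:9000',   # placeholder: the forwarded MinIO port
    aws_access_key_id='minioadmin',         # placeholder credentials
    aws_secret_access_key='minioadmin',
)
print(s3.list_buckets()['Buckets'])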

Why do we need MinIO?

Object storage is crucial when applications need to store and retrieve large amounts of unstructured data reliably.

Here are a few real-world use cases:

  1. Machine Learning Models
    • Store massive training datasets like images and audio.
    • Serve models for production services directly from object storage.
  2. Backup and Archival
    • Snapshots of databases or virtual machines stored reliably.
    • Cost-effective storage for rarely accessed data.
  3. Web Applications
    • Host static assets like images, CSS, and videos.
    • Provide easy upload/download functionality for users.

MinIO is a lightweight, cost-effective solution for all these tasks when we don't want to depend on a public cloud (like AWS S3).

MinIO is great, but what are the alternatives?

                   MinIO                    AWS S3       Ceph                          GlusterFS
Performance        Very High                Very High    Moderate                      Moderate
Latency            Low                      Low          High                          Moderate
S3 Compatibility   Full                     Native       Partial (via RADOS Gateway)   No
Persistence        Strong (Disk-based)      Strong       Strong                        Strong
Scalability        Excellent (Horizontal)   Excellent    Excellent                     Moderate

What tools will we use?

In this lab, we'll use Kubernetes resources (Deployments, Services) and MinIO's client tools to interact with the object storage.

We'll access MinIO via its web UI (localhost:9001) or via the MinIO client (mc) installed in your VM.

Option 1: Web UI Access

After deploying MinIO, you'll forward the service port to your machine:

student@lab-s3:~$ kubectl port-forward -n minio deployment/minio 9000:9000 9001:9001

Then, navigate to http://localhost:9001 in your browser.

Option 2: MinIO Client (mc)

We'll install mc to interact with the storage via the command line.

Useful scripts

We will automate several tasks using the scripts and manifests provided in the lab archive. You will find YAML files to:

  • Deploy the MinIO server
  • Deploy a test application
  • Configure access from inside Kubernetes

We'll also provide small snippets to quickly test upload and download functionality.
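
For example, a quick upload/download round trip from the VM might look like the sketch below (assuming MinIO is already deployed as in Step 1, the port-forward is running, the bucket mybucket exists, and boto3 is installed via pip install boto3):

import boto3

s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:9000',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='minioadmin',
)

# Upload a local file, then fetch it back under a new name to verify the round trip.
s3.upload_file('testfile.txt', 'mybucket', 'testfile.txt')
s3.download_file('mybucket', 'testfile.txt', 'testfile-copy.txt')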

MinIO Setup Step-by-Step

Step 0: Deploy a Kubernetes Cluster

Create a local Kubernetes cluster using kind create cluster:

student@lab-kubernetes:~$ kind create cluster
Creating cluster "kind" ...
✓ Ensuring node image (kindest/node:v1.23.4) 🖼
✓ Preparing nodes 📦
✓ Writing configuration 📜
✓ Starting control-plane 🕹️
✓ Installing CNI 🔌
✓ Installing StorageClass 💾
Set kubectl context to "kind-kind"
You can now use your cluster with:

kubectl cluster-info --context kind-kind

Thanks for using kind! 😊

Step 1: Deploy MinIO Server

Create the following file minio-deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: minio
  namespace: minio
spec:
  replicas: 1
  selector:
    matchLabels:
      app: minio
  template:
    metadata:
      labels:
        app: minio
    spec:
      containers:
        - name: minio
          image: quay.io/minio/minio:latest
          args:
            - server
            - /data
            - --console-address
            - ":9001"
          env:
            - name: MINIO_ROOT_USER
              value: "minioadmin"
            - name: MINIO_ROOT_PASSWORD
              value: "minioadmin"
          ports:
            - containerPort: 9000
            - containerPort: 9001
          volumeMounts:
            - name: storage
              mountPath: /data
      volumes:
        - name: storage
          emptyDir: {}

Create the following file minio-service.yaml:

apiVersion: v1
kind: Service
metadata:
  name: minio-service
  namespace: minio
spec:
  type: ClusterIP
  ports:
    - port: 9000
      targetPort: 9000
  selector:
    app: minio

Apply the resources:

student@lab-s3:~$ kubectl create namespace minio
student@lab-s3:~$ kubectl apply -f minio-deployment.yaml
student@lab-s3:~$ kubectl apply -f minio-service.yaml

Check that MinIO is running:

student@lab-s3:~$ kubectl get pods -n minio

Step 2: Setup MinIO Client (mc)

student@lab-s3:~$ wget https://dl.min.io/client/mc/release/linux-amd64/mc
student@lab-s3:~$ chmod +x mc
student@lab-s3:~$ sudo mv mc /usr/local/bin/

Configure mc (run the port-forward in a separate terminal, since it keeps running in the foreground):

student@lab-s3:~$ kubectl port-forward -n minio deployment/minio 9000:9000 9001:9001
student@lab-s3:~$ mc alias set local http://localhost:9000 minioadmin minioadmin

Step 3: Create a Bucket and Upload Files

We will use the mc command-line tool to interact with MinIO, create buckets, and upload files:

student@lab-s3:~$ mc mb local/mybucket
student@lab-s3:~$ echo "hello cloud computing" > testfile.txt
student@lab-s3:~$ mc cp testfile.txt local/mybucket
student@lab-s3:~$ # check the result
student@lab-s3:~$ mc ls local/mybucket
student@lab-s3:~$ mc cat local/mybucket/testfile.txt

Step 4: Access S3 from a Kubernetes App

We will use a Python script to upload files to the MinIO bucket from a Kubernetes pod.

We need a ConfigMap to store the script and a Deployment to run it.

Create the following file uploader-configmap.yaml:

apiVersion: v1
kind: ConfigMap
metadata:
  name: uploader-script
  namespace: default
data:
  uploader.py: |
    import boto3, time

    s3 = boto3.client(
        's3',
        endpoint_url='http://minio-service.minio.svc.cluster.local:9000',
        aws_access_key_id='minioadmin',
        aws_secret_access_key='minioadmin',
        region_name='us-east-1'
    )

    while True:
        with open('/tmp/hello.txt', 'w') as f:
            f.write('hello from kubernetes')
        s3.upload_file('/tmp/hello.txt', 'mybucket', 'hello.txt')
        print('Uploaded hello.txt')
        time.sleep(30)

Create the following file uploader-deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: uploader
spec:
  replicas: 1
  selector:
    matchLabels:
      app: uploader
  template:
    metadata:
      labels:
        app: uploader
    spec:
      containers:
        - name: uploader
          image: python:3.10
          command: ["bash", "-c"]
          args:
            - |
              pip install boto3 && python /app/uploader.py
          volumeMounts:
            - name: script-volume
              mountPath: /app
          env:
            - name: AWS_ACCESS_KEY_ID
              value: "minioadmin"
            - name: AWS_SECRET_ACCESS_KEY
              value: "minioadmin"
      volumes:
        - name: script-volume
          configMap:
            name: uploader-script

Deploy the example uploader app:

student@lab-s3:~$ kubectl apply -f uploader-configmap.yaml
student@lab-s3:~$ kubectl apply -f uploader-deployment.yaml

Check app logs:

student@lab-s3:~$ kubectl logs -l app=uploader

This app uploads hello.txt into your MinIO bucket every 30 seconds.

Note: To reload the script, you can restart the deployment:

student@lab-s3:~$ kubectl rollout restart deployment uploader

Exercises

Task 1: Upload multiple files

Use a for loop to create and upload 10 text files to your bucket.

for i in {1..10}; do echo "File $i" > file$i.txt; mc cp file$i.txt local/mybucket; done

Check if all files are present in the Web UI!

Task 2: Deploy a second app to read files

Create a simple Kubernetes Deployment that lists the files in the bucket (use the uploader YAML above as a template; a sketch of the reader script follows below).

What differences do you notice compared to uploading?
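
Hint: a minimal sketch of a reader script you could mount in place of uploader.py (same ConfigMap pattern as above; the bucket name is the one created in Step 3):

import boto3, time

s3 = boto3.client(
    's3',
    endpoint_url='http://minio-service.minio.svc.cluster.local:9000',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='minioadmin',
    region_name='us-east-1'
)

while True:
    # list_objects_v2 returns up to 1000 keys per call;
    # 'Contents' is absent when the bucket is empty, hence the .get() fallback
    response = s3.list_objects_v2(Bucket='mybucket')
    for obj in response.get('Contents', []):
        print(obj['Key'], obj['Size'])
    time.sleep(30)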

Task 3: Upload timestamped files

Modify the uploader application so that each uploaded file has a unique name based on the current timestamp.

Hint: Update the Python code inside the uploader container to:

import boto3, time
from datetime import datetime

s3 = boto3.client(
    's3',
    endpoint_url='http://minio-service.minio.svc.cluster.local:9000',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='minioadmin',
    region_name='us-east-1'
)

while True:
    # a timestamped key avoids overwriting the previous upload
    filename = f"hello_{datetime.utcnow().strftime('%Y%m%d_%H%M%S')}.txt"
    filepath = f"/tmp/{filename}"
    with open(filepath, 'w') as f:
        f.write('hello from kubernetes')
    s3.upload_file(filepath, 'mybucket', filename)
    print(f'Uploaded {filename}')
    time.sleep(30)

This change will prevent overwriting and simulate realistic object uploads.

Task 4: Create a private bucket

Use mc to create a new bucket called privatebucket and set it to be private (no anonymous access):

student@lab-s3:~$ mc mb local/privatebucket
student@lab-s3:~$ mc anonymous set none local/privatebucket

To list the policies, use:

student@lab-s3:~$ mc anonymous get local/privatebucket

Upload a file to the bucket and try to access it via HTTP without credentials. What happens?

student@lab-s3:~$ echo "hello cloud computing" > testfile.txt
student@lab-s3:~$ mc cp testfile.txt local/privatebucket

student@lab-s3:~$ curl http://localhost:9000/privatebucket/testfile.txt
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code>[...]

Hint: To test this, you can generate a temporary public link to the object using the mc share command.

student@lab-s3:~$ mc share download local/privatebucket/testfile.txt
student@lab-s3:~$ curl <PASTE PUBLIC LINK HERE>

Note: To make a bucket public, use mc anonymous set download local/privatebucket.
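
For comparison, the same kind of temporary link can be generated from Python; a sketch with boto3 (assuming the port-forward is still running):

import boto3

s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:9000',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='minioadmin',
)

# Presigned URL: read access to a single object for one hour,
# without changing the bucket's private policy.
url = s3.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'privatebucket', 'Key': 'testfile.txt'},
    ExpiresIn=3600,
)
print(url)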

Task 5: Backup and restore a bucket

Using mc, copy all files from mybucket to a backup bucket called backupbucket:

student@lab-s3:~$ mc mb local/backupbucket
student@lab-s3:~$ mc mirror local/mybucket local/backupbucket

Now delete a file from mybucket, and restore it from backupbucket!
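
One possible approach for the restore, sketched with boto3 (assuming the port-forward is running; file1.txt stands in for whichever file you deleted):

import boto3

s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:9000',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='minioadmin',
)

# Server-side copy: the object is restored without downloading it locally.
s3.copy_object(
    Bucket='mybucket',
    Key='file1.txt',
    CopySource={'Bucket': 'backupbucket', 'Key': 'file1.txt'},
)

Equivalently, mc cp local/backupbucket/file1.txt local/mybucket restores it with the client you already configured.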

Task 6: Deploy a second MinIO instance

Deploy a second MinIO server in a different namespace (minio2).

Make sure:

  • It uses different resource and service names (for example, its own minio-service inside the minio2 namespace).
  • If you expose it outside the cluster (NodePort or port-forward), use different ports than the first instance.

Use it to create a separate bucket and upload a file there.
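
A sketch of how you might talk to the second instance once it is deployed (assumptions: you port-forward it to a different local port, e.g. kubectl port-forward -n minio2 deployment/minio 9002:9000, and reuse the default credentials):

import boto3

# Note the different local port so the two instances don't collide.
s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:9002',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='minioadmin',
)

s3.create_bucket(Bucket='otherbucket')
s3.upload_file('testfile.txt', 'otherbucket', 'testfile.txt')
print(s3.list_buckets()['Buckets'])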