MinIO (S3 on Kubernetes)

Setup

We will be using a virtual machine in the faculty's cloud.

When creating a virtual machine in the Launch Instance window:

  • Name your VM using the following convention: cc_lab<no>_<username>, where <no> is the lab number and <username> is your institutional account.
  • Select Boot from image in the Instance Boot Source section.
  • Select CC 2024-2025 in the Image Name section.
  • Select the m1.xlarge flavor.

In the base virtual machine:

  • Download the laboratory archive using wget https://repository.grid.pub.ro/cs/cc/laboratoare/install-kind.zip.
  • Extract the archive.
  • Run the setup script bash install-kind.sh.
$ # download the archive
$ wget https://repository.grid.pub.ro/cs/cc/laboratoare/install-kind.zip
$ unzip install-kind.zip
$ bash install-kind.sh

Before we start

What is MinIO?

MinIO is an open-source, high-performance, S3-compatible object storage system. It allows users to store unstructured data like photos, videos, log files, backups, and container images.

Key features of MinIO:

  • Lightweight and scalable: Can be deployed quickly and scales horizontally.
  • S3 API compatibility: Works seamlessly with applications written for AWS S3.
  • High Performance: Designed for high-throughput workloads.
  • Built for Kubernetes: Native support for Kubernetes deployments.

We will explain these concepts more deeply in the following chapters.
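
To make the S3-compatibility claim concrete: code written for AWS S3 with boto3 can talk to MinIO by changing only the endpoint. A minimal sketch (the endpoint and credentials are placeholders for the setup we build later in this lab):

import boto3

# The only MinIO-specific part is endpoint_url; drop it to target AWS S3 instead.
s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:9000',   # placeholder: the forwarded MinIO port
    aws_access_key_id='minioadmin',         # placeholder credentials
    aws_secret_access_key='minioadmin',
)
print(s3.list_buckets()['Buckets'])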

Why do we need MinIO?

Object storage is crucial when applications need to store and retrieve large amounts of unstructured data reliably.

Here are a few real-world use cases:

  1. Machine Learning Models
    • Store massive training datasets like images and audio.
    • Serve models for production services directly from object storage.
  2. Backup and Archival
    • Snapshots of databases or virtual machines stored reliably.
    • Cost-effective storage for rarely accessed data.
  3. Web Applications
    • Host static assets like images, CSS, and videos.
    • Provide easy upload/download functionality for users.

MinIO is a lightweight, cost-effective solution for all these tasks when we don't want to depend on a public cloud (like AWS S3).

MinIO is great, but what are the alternatives?

                   MinIO                    AWS S3       Ceph                          GlusterFS
Performance        Very High                Very High    Moderate                      Moderate
Latency            Low                      Low          High                          Moderate
S3 Compatibility   Full                     Native       Partial (via RADOS Gateway)   No
Persistence        Strong (Disk-based)      Strong       Strong                        Strong
Scalability        Excellent (Horizontal)   Excellent    Excellent                     Moderate

What tools will we use?

In this lab, we'll use Kubernetes resources (Deployments, Services) and MinIO's client tools to interact with the object storage.

We'll access MinIO via its web UI (localhost:9001) or via the MinIO client (mc) installed in your VM.

Option 1: Web UI Access

After deploying MinIO, you'll forward the service port to your machine:

student@lab-s3:~$ kubectl port-forward -n minio deployment/minio 9000:9000 9001:9001

Then, navigate to http://localhost:9001 in your browser.

Option 2: MinIO Client (mc)

We'll install mc to interact with the storage via the command line.

Useful scripts

We will automate several tasks using the scripts and manifests provided in the lab archive. You will find YAML files to:

  • Deploy the MinIO server
  • Deploy a test application
  • Configure access from inside Kubernetes

We'll also provide small snippets to quickly test upload and download functionality.
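
For example, a quick upload/download round trip from the VM might look like the sketch below (assuming MinIO is already deployed as in Step 1, the port-forward is running, the bucket mybucket exists, and boto3 is installed via pip install boto3):

import boto3

s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:9000',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='minioadmin',
)

# Upload a local file, then fetch it back under a new name to verify the round trip.
s3.upload_file('testfile.txt', 'mybucket', 'testfile.txt')
s3.download_file('mybucket', 'testfile.txt', 'testfile-copy.txt')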

MinIO Setup Step-by-Step

Step 0: Deploy a Kubernetes Cluster

Create a local Kubernetes cluster using kind create cluster:

student@lab-kubernetes:~$ kind create cluster
Creating cluster "kind" ...
✓ Ensuring node image (kindest/node:v1.23.4) 🖼
✓ Preparing nodes 📦
✓ Writing configuration 📜
✓ Starting control-plane 🕹️
✓ Installing CNI 🔌
✓ Installing StorageClass 💾
Set kubectl context to "kind-kind"
You can now use your cluster with:

kubectl cluster-info --context kind-kind

Thanks for using kind! 😊

Step 1: Deploy MinIO Server

Create the following file minio-deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: minio
  namespace: minio
spec:
  replicas: 1
  selector:
    matchLabels:
      app: minio
  template:
    metadata:
      labels:
        app: minio
    spec:
      containers:
        - name: minio
          image: quay.io/minio/minio:latest
          args:
            - server
            - /data
            - --console-address
            - ":9001"
          env:
            - name: MINIO_ROOT_USER
              value: "minioadmin"
            - name: MINIO_ROOT_PASSWORD
              value: "minioadmin"
          ports:
            - containerPort: 9000
            - containerPort: 9001
          volumeMounts:
            - name: storage
              mountPath: /data
      volumes:
        - name: storage
          emptyDir: {}

Create the following file minio-service.yaml:

apiVersion: v1
kind: Service
metadata:
  name: minio-service
  namespace: minio
spec:
  type: ClusterIP
  ports:
    - port: 9000
      targetPort: 9000
  selector:
    app: minio

Apply the resources:

student@lab-s3:~$ kubectl create namespace minio
student@lab-s3:~$ kubectl apply -f minio-deployment.yaml
student@lab-s3:~$ kubectl apply -f minio-service.yaml

Check that MinIO is running:

student@lab-s3:~$ kubectl get pods -n minio

Step 2: Setup MinIO Client (mc)

student@lab-s3:~$ wget https://dl.min.io/client/mc/release/linux-amd64/mc
student@lab-s3:~$ chmod +x mc
student@lab-s3:~$ sudo mv mc /usr/local/bin/

Configure mc (run the port-forward in a separate terminal, since it keeps running in the foreground):

student@lab-s3:~$ kubectl port-forward -n minio deployment/minio 9000:9000 9001:9001
student@lab-s3:~$ mc alias set local http://localhost:9000 minioadmin minioadmin

Step 3: Create a Bucket and Upload Files

We will use the mc command-line tool to interact with MinIO, create buckets, and upload files:

student@lab-s3:~$ mc mb local/mybucket
student@lab-s3:~$ echo "hello cloud computing" > testfile.txt
student@lab-s3:~$ mc cp testfile.txt local/mybucket
student@lab-s3:~$ # check the result
student@lab-s3:~$ mc ls local/mybucket
student@lab-s3:~$ mc cat local/mybucket/testfile.txt

Step 4: Access S3 from a Kubernetes App

We will use a Python script to upload files to the MinIO bucket from a Kubernetes pod.

We need a ConfigMap to store the script and a Deployment to run it.

Create the following file uploader-configmap.yaml:

apiVersion: v1
kind: ConfigMap
metadata:
  name: uploader-script
  namespace: default
data:
  uploader.py: |
    import boto3, time

    s3 = boto3.client(
        's3',
        endpoint_url='http://minio-service.minio.svc.cluster.local:9000',
        aws_access_key_id='minioadmin',
        aws_secret_access_key='minioadmin',
        region_name='us-east-1'
    )

    while True:
        with open('/tmp/hello.txt', 'w') as f:
            f.write('hello from kubernetes')
        s3.upload_file('/tmp/hello.txt', 'mybucket', 'hello.txt')
        print('Uploaded hello.txt')
        time.sleep(30)

Create the following file uploader-deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: uploader
spec:
  replicas: 1
  selector:
    matchLabels:
      app: uploader
  template:
    metadata:
      labels:
        app: uploader
    spec:
      containers:
        - name: uploader
          image: python:3.10
          command: ["bash", "-c"]
          args:
            - |
              pip install boto3 && python /app/uploader.py
          volumeMounts:
            - name: script-volume
              mountPath: /app
          env:
            - name: AWS_ACCESS_KEY_ID
              value: "minioadmin"
            - name: AWS_SECRET_ACCESS_KEY
              value: "minioadmin"
      volumes:
        - name: script-volume
          configMap:
            name: uploader-script

Deploy the example uploader app:

student@lab-s3:~$ kubectl apply -f uploader-configmap.yaml
student@lab-s3:~$ kubectl apply -f uploader-deployment.yaml

Check app logs:

student@lab-s3:~$ kubectl logs -l app=uploader

This app uploads hello.txt into your MinIO bucket every 30 seconds.

Note: To reload the script, you can restart the deployment:

student@lab-s3:~$ kubectl rollout restart deployment uploader

Exercises

Task 1: Upload multiple files

Use a for loop to create and upload 10 text files to your bucket.

for i in {1..10}; do echo "File $i" > file$i.txt; mc cp file$i.txt local/mybucket; done

Check if all files are present in the Web UI!

Task 2: Deploy a second app to read files

Create a simple Kubernetes Deployment that lists the files in the bucket (use the uploader YAML above as a template; a sketch of the reader script follows below).

What differences do you notice compared to uploading?
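
Hint: a minimal sketch of a reader script you could mount in place of uploader.py (same ConfigMap pattern as above; the bucket name is the one created in Step 3):

import boto3, time

s3 = boto3.client(
    's3',
    endpoint_url='http://minio-service.minio.svc.cluster.local:9000',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='minioadmin',
    region_name='us-east-1'
)

while True:
    # list_objects_v2 returns up to 1000 keys per call;
    # 'Contents' is absent when the bucket is empty, hence the .get() fallback
    response = s3.list_objects_v2(Bucket='mybucket')
    for obj in response.get('Contents', []):
        print(obj['Key'], obj['Size'])
    time.sleep(30)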

Task 3: Upload timestamped files

Modify the uploader application so that each uploaded file has a unique name based on the current timestamp.

Hint: Update the Python code inside the uploader container to:

import boto3, time
from datetime import datetime

s3 = boto3.client(
    's3',
    endpoint_url='http://minio-service.minio.svc.cluster.local:9000',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='minioadmin',
    region_name='us-east-1'
)

while True:
    # a timestamped key avoids overwriting the previous upload
    filename = f"hello_{datetime.utcnow().strftime('%Y%m%d_%H%M%S')}.txt"
    filepath = f"/tmp/{filename}"
    with open(filepath, 'w') as f:
        f.write('hello from kubernetes')
    s3.upload_file(filepath, 'mybucket', filename)
    print(f'Uploaded {filename}')
    time.sleep(30)

This change will prevent overwriting and simulate realistic object uploads.

Task 4: Create a private bucket

Use mc to create a new bucket called privatebucket and set it to be private (no anonymous access):

student@lab-s3:~$ mc mb local/privatebucket
student@lab-s3:~$ mc anonymous set none local/privatebucket

To list the policies, use:

student@lab-s3:~$ mc anonymous get local/privatebucket

Upload a file to the bucket and try to access it via HTTP without credentials. What happens?

student@lab-s3:~$ echo "hello cloud computing" > testfile.txt
student@lab-s3:~$ mc cp testfile.txt local/privatebucket

student@lab-s3:~$ curl http://localhost:9000/privatebucket/testfile.txt
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code>[...]

Hint: To test this, you can generate a temporary public link to the object using the mc share command.

student@lab-s3:~$ mc share download local/privatebucket/testfile.txt
student@lab-s3:~$ curl <PASTE PUBLIC LINK HERE>

Note: To make a bucket public, use mc anonymous set download local/privatebucket.
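
For comparison, the same kind of temporary link can be generated from Python; a sketch with boto3 (assuming the port-forward is still running):

import boto3

s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:9000',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='minioadmin',
)

# Presigned URL: read access to a single object for one hour,
# without changing the bucket's private policy.
url = s3.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'privatebucket', 'Key': 'testfile.txt'},
    ExpiresIn=3600,
)
print(url)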

Task 5: Backup and restore a bucket

Using mc, copy all files from mybucket to a backup bucket called backupbucket:

student@lab-s3:~$ mc mb local/backupbucket
student@lab-s3:~$ mc mirror local/mybucket local/backupbucket

Now delete a file from mybucket, and restore it from backupbucket!
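
One possible approach for the restore, sketched with boto3 (assuming the port-forward is running; file1.txt stands in for whichever file you deleted):

import boto3

s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:9000',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='minioadmin',
)

# Server-side copy: the object is restored without downloading it locally.
s3.copy_object(
    Bucket='mybucket',
    Key='file1.txt',
    CopySource={'Bucket': 'backupbucket', 'Key': 'file1.txt'},
)

Equivalently, mc cp local/backupbucket/file1.txt local/mybucket restores it with the client you already configured.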

Task 6: Deploy a second MinIO instance

Deploy a second MinIO server in a different namespace (minio2).

Make sure:

  • It uses different resource and service names (for example, its own minio-service inside the minio2 namespace).
  • If you expose it outside the cluster (NodePort or port-forward), use different ports than the first instance.

Use it to create a separate bucket and upload a file there.
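
A sketch of how you might talk to the second instance once it is deployed (assumptions: you port-forward it to a different local port, e.g. kubectl port-forward -n minio2 deployment/minio 9002:9000, and reuse the default credentials):

import boto3

# Note the different local port so the two instances don't collide.
s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:9002',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='minioadmin',
)

s3.create_bucket(Bucket='otherbucket')
s3.upload_file('testfile.txt', 'otherbucket', 'testfile.txt')
print(s3.list_buckets()['Buckets'])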