MinIO (S3 on Kubernetes)
Setup
We will be using a virtual machine in the faculty's cloud.
When creating a virtual machine in the Launch Instance window:
- Name your VM using the following convention: cc_lab<no>_<username>, where <no> is the lab number and <username> is your institutional account.
- Select Boot from image in the Instance Boot Source section.
- Select CC 2024-2025 in the Image Name section.
- Select the m1.xlarge flavor.
In the base virtual machine:
- Download the laboratory archive from here. Use:
  wget https://repository.grid.pub.ro/cs/cc/laboratoare/install-kind.zip
  to download the archive.
- Extract the archive.
- Run the setup script: bash install-kind.sh
$ # download the archive
$ wget https://repository.grid.pub.ro/cs/cc/laboratoare/install-kind.zip
$ unzip install-kind.zip
$ bash install-kind.sh
Before we start
What is MinIO?
MinIO is an open-source, high-performance, S3-compatible object storage system. It allows users to store unstructured data like photos, videos, log files, backups, and container images.
Key features of MinIO:
- Lightweight and scalable: Can be deployed quickly and scales horizontally.
- S3 API compatibility: Works seamlessly with applications written for AWS S3.
- High Performance: Designed for high-throughput workloads.
- Built for Kubernetes: Native support for Kubernetes deployments.
We will explain these concepts more deeply in the following chapters.
Why do we need MinIO?
Object storage is crucial when applications need to store and retrieve large amounts of unstructured data reliably.
Here are a few real-world use cases:
- Machine Learning Models
- Store massive training datasets like images and audio.
- Serve models for production services directly from object storage.
- Backup and Archival
- Snapshots of databases or virtual machines stored reliably.
- Cost-effective storage for rarely accessed data.
- Web Applications
- Host static assets like images, CSS, and videos.
- Provide easy upload/download functionality for users.
MinIO is a lightweight, cost-effective solution for all these tasks when we don't want to depend on a public cloud (like AWS S3).
MinIO is great, but what are the alternatives?
|                  | MinIO                  | AWS S3    | Ceph                        | GlusterFS |
|------------------|------------------------|-----------|-----------------------------|-----------|
| Performance      | Very High              | Very High | Moderate                    | Moderate  |
| Latency          | Low                    | Low       | High                        | Moderate  |
| S3 Compatibility | Full                   | Native    | Partial (via RADOS Gateway) | No        |
| Persistence      | Strong (Disk-based)    | Strong    | Strong                      | Strong    |
| Scalability      | Excellent (Horizontal) | Excellent | Excellent                   | Moderate  |
What tools will we use?
In this lab, we'll use Kubernetes resources (Deployments, Services) and MinIO's client tools to interact with the object storage.
We'll access MinIO via its web UI (localhost:9001) or via the MinIO client (mc) installed in your VM.
Option 1: Web UI Access
After deploying MinIO, you'll forward the service port to your machine:
student@lab-s3:~$ kubectl port-forward -n minio deployment/minio 9000:9000 9001:9001
Then, navigate to http://localhost:9001 in your browser.
Option 2: MinIO Client (mc)
We'll install mc to interact with the storage via the command line.
Useful scripts
We will automate several tasks using shell scripts provided in the lab archive. You will find YAML files to:
- Deploy the MinIO server
- Deploy a test application
- Configure access from inside Kubernetes
We'll also provide small snippets to quickly test upload and download functionality.
MinIO Setup Step-by-Step
Step 0: Deploy a Kubernetes Cluster
Create a local Kubernetes cluster using kind create cluster:
student@lab-kubernetes:~$ kind create cluster
Creating cluster "kind" ...
✓ Ensuring node image (kindest/node:v1.23.4) 🖼
✓ Preparing nodes 📦
✓ Writing configuration 📜
✓ Starting control-plane 🕹️
✓ Installing CNI 🔌
✓ Installing StorageClass 💾
Set kubectl context to "kind-kind"
You can now use your cluster with:
kubectl cluster-info --context kind-kind
Thanks for using kind! 😊
Step 1: Deploy MinIO Server
Create the following file minio-deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: minio
  namespace: minio
spec:
  replicas: 1
  selector:
    matchLabels:
      app: minio
  template:
    metadata:
      labels:
        app: minio
    spec:
      containers:
        - name: minio
          image: quay.io/minio/minio:latest
          args:
            - server
            - /data
            - --console-address
            - ":9001"
          env:
            - name: MINIO_ROOT_USER
              value: "minioadmin"
            - name: MINIO_ROOT_PASSWORD
              value: "minioadmin"
          ports:
            - containerPort: 9000
            - containerPort: 9001
          volumeMounts:
            - name: storage
              mountPath: /data
      volumes:
        - name: storage
          emptyDir: {}
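Note that emptyDir storage is ephemeral: anything you upload to MinIO is lost when the pod restarts. That is fine for this lab, but if you want the data in /data to survive restarts, you could back it with a PersistentVolumeClaim instead; a minimal sketch (the claim name and size are illustrative):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: minio-pvc
  namespace: minio
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```

You would then replace the emptyDir entry in the Deployment's volumes section with a persistentVolumeClaim volume referencing minio-pvc.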
Create the following file minio-service.yaml:
apiVersion: v1
kind: Service
metadata:
  name: minio-service
  namespace: minio
spec:
  type: ClusterIP
  ports:
    - port: 9000
      targetPort: 9000
  selector:
    app: minio
Apply the resources:
student@lab-s3:~$ kubectl create namespace minio
student@lab-s3:~$ kubectl apply -f minio-deployment.yaml
student@lab-s3:~$ kubectl apply -f minio-service.yaml
Check that MinIO is running:
student@lab-s3:~$ kubectl get pods -n minio
Step 2: Setup MinIO Client (mc)
student@lab-s3:~$ wget https://dl.min.io/client/mc/release/linux-amd64/mc
student@lab-s3:~$ chmod +x mc
student@lab-s3:~$ sudo mv mc /usr/local/bin/
Configure mc:
student@lab-s3:~$ kubectl port-forward -n minio deployment/minio 9000:9000 9001:9001
student@lab-s3:~$ mc alias set local http://localhost:9000 minioadmin minioadmin
Step 3: Create a Bucket and Upload Files
We will use the mc command-line tool to interact with MinIO, create buckets, and upload files:
student@lab-s3:~$ mc mb local/mybucket
student@lab-s3:~$ echo "hello cloud computing" > testfile.txt
student@lab-s3:~$ mc cp testfile.txt local/mybucket
student@lab-s3:~$ # check the result
student@lab-s3:~$ mc ls local/mybucket
student@lab-s3:~$ mc cat local/mybucket/testfile.txt
Step 4: Access S3 from a Kubernetes App
We will use a Python script to upload files to the MinIO bucket from a Kubernetes pod.
We need a ConfigMap to store the script and a Deployment to run it.
Create the following file uploader-configmap.yaml:
apiVersion: v1
kind: ConfigMap
metadata:
  name: uploader-script
  namespace: default
data:
  uploader.py: |
    import boto3, time

    s3 = boto3.client(
        's3',
        endpoint_url='http://minio-service.minio.svc.cluster.local:9000',
        aws_access_key_id='minioadmin',
        aws_secret_access_key='minioadmin',
        region_name='us-east-1'
    )

    while True:
        with open('/tmp/hello.txt', 'w') as f:
            f.write('hello from kubernetes')
        s3.upload_file('/tmp/hello.txt', 'mybucket', 'hello.txt')
        print('Uploaded hello.txt')
        time.sleep(30)
Create the following file uploader-deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: uploader
spec:
  replicas: 1
  selector:
    matchLabels:
      app: uploader
  template:
    metadata:
      labels:
        app: uploader
    spec:
      containers:
        - name: uploader
          image: python:3.10
          command: ["bash", "-c"]
          args:
            - |
              pip install boto3 && python /app/uploader.py
          volumeMounts:
            - name: script-volume
              mountPath: /app
          env:
            - name: AWS_ACCESS_KEY_ID
              value: "minioadmin"
            - name: AWS_SECRET_ACCESS_KEY
              value: "minioadmin"
      volumes:
        - name: script-volume
          configMap:
            name: uploader-script
Deploy the example uploader app:
student@lab-s3:~$ kubectl apply -f uploader-configmap.yaml
student@lab-s3:~$ kubectl apply -f uploader-deployment.yaml
Check app logs:
student@lab-s3:~$ kubectl logs -l app=uploader
This app uploads hello.txt into your MinIO bucket every 30 seconds; if everything works, the logs will show repeated "Uploaded hello.txt" lines.
Note: To reload the script, you can restart the deployment:
student@lab-s3:~$ kubectl rollout restart deployment uploader
Exercises
Task 1: Upload multiple files
Use a for loop to create and upload 10 text files to your bucket.
for i in {1..10}; do echo "File $i" > file$i.txt; mc cp file$i.txt local/mybucket; done
Check if all files are present in the Web UI!
Task 2: Deploy a second app to read files
Create a simple Kubernetes Deployment that lists files in the bucket (use the uploader YAML from Step 4 as a starting point).
What differences do you notice compared to uploading?
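As a starting point, here is a sketch of a reader script you could place in a ConfigMap, mirroring the uploader from Step 4. The bucket name and credentials assume the Step 1 setup; the keys_from_listing helper and the MINIO_ENDPOINT environment variable are names introduced here for illustration (in the pod you would set MINIO_ENDPOINT to the in-cluster service URL, alongside the AWS_* variables already used by the uploader Deployment):

```python
import os
import time

def keys_from_listing(response):
    # Pull the object keys out of a list_objects_v2 response.
    # 'Contents' is absent entirely when the bucket is empty.
    return [obj['Key'] for obj in response.get('Contents', [])]

def main():
    import boto3  # installed in the pod via `pip install boto3`
    s3 = boto3.client(
        's3',
        endpoint_url=os.environ['MINIO_ENDPOINT'],
        aws_access_key_id=os.environ.get('AWS_ACCESS_KEY_ID', 'minioadmin'),
        aws_secret_access_key=os.environ.get('AWS_SECRET_ACCESS_KEY', 'minioadmin'),
        region_name='us-east-1',
    )
    while True:
        # List the bucket contents instead of writing to it
        resp = s3.list_objects_v2(Bucket='mybucket')
        print('Objects in mybucket:', keys_from_listing(resp))
        time.sleep(30)

# Only start the loop when an endpoint is configured, e.g.
# MINIO_ENDPOINT=http://minio-service.minio.svc.cluster.local:9000
if os.environ.get('MINIO_ENDPOINT'):
    main()
```

Unlike the uploader, this script only reads from the bucket, so several replicas can run concurrently without conflicting, and it would work even if its credentials were restricted to read-only access.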
Task 3: Upload timestamped files
Modify the uploader application so that each uploaded file has a unique name based on the current timestamp.
Hint: Update the Python code inside the uploader container to:
import boto3, time
from datetime import datetime

s3 = boto3.client(
    's3',
    endpoint_url='http://minio-service.minio.svc.cluster.local:9000',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='minioadmin',
    region_name='us-east-1'
)

while True:
    filename = f"hello_{datetime.utcnow().strftime('%Y%m%d_%H%M%S')}.txt"
    filepath = f"/tmp/{filename}"
    with open(filepath, 'w') as f:
        f.write('hello from kubernetes')
    s3.upload_file(filepath, 'mybucket', filename)
    print(f'Uploaded {filename}')
    time.sleep(30)
This change will prevent overwriting and simulate realistic object uploads.
Task 4: Create a private bucket
Use mc to create a new bucket called privatebucket and set it to be private (no anonymous access):
student@lab-s3:~$ mc mb local/privatebucket
student@lab-s3:~$ mc anonymous set none local/privatebucket
To list the policies, use:
student@lab-s3:~$ mc anonymous get local/privatebucket
Create a file inside the bucket and try to access the bucket via HTTP without credentials. What happens?
student@lab-s3:~$ mc mb local/privatebucket
student@lab-s3:~$ echo "hello cloud computing" > testfile.txt
student@lab-s3:~$ mc cp testfile.txt local/privatebucket
student@lab-s3:~$ curl http://localhost:9000/privatebucket/testfile.txt
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code>[...]
Hint: To test, you can generate a public link to the object using the mc share command.
student@lab-s3:~$ mc share download local/privatebucket/testfile.txt
student@lab-s3:~$ curl <PASTE PUBLIC LINK HERE>
Note: To make a bucket public, use mc anonymous set download local/privatebucket.
Task 5: Backup and restore a bucket
Using mc, copy all files from mybucket to a backup bucket called backupbucket:
student@lab-s3:~$ mc mb local/backupbucket
student@lab-s3:~$ mc mirror local/mybucket local/backupbucket
Now delete a file from mybucket, and restore it from backupbucket!
Task 6: Deploy a second MinIO instance
Deploy a second MinIO server in a different namespace (minio2).
Make sure:
- It uses different service names.
- It listens on a different NodePort if needed.
Use it to create a separate bucket and upload a file there.
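To get started, you can reuse the Step 1 manifests with the namespace, names, and labels changed. For example, the second instance's service might look like this (minio2-service and the app: minio2 label are illustrative names; the matching Deployment must use the same namespace and label):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: minio2-service
  namespace: minio2
spec:
  type: ClusterIP
  ports:
    - port: 9000
      targetPort: 9000
  selector:
    app: minio2
```

Remember to create the minio2 namespace first, then register a second mc alias (e.g. mc alias set local2 ...) after port-forwarding the new deployment to a different local port, so the two instances don't collide on 9000/9001.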