EOS
Setup
We will be using a virtual machine in the faculty's cloud.
When creating a virtual machine in the Launch Instance window:
- Name your VM using the following convention: scgc_lab<no>_<username>, where <no> is the lab number and <username> is your institutional account.
- Select Boot from image in the Instance Boot Source section.
- Select SCGC Template in the Image Name section.
- Select the m1.eos flavor.
In the base virtual machine:
- Download the laboratory archive from here into the work directory. Use wget https://repository.grid.pub.ro/cs/scgc/laboratoare/lab-eos.zip to download the archive.
- Extract the archive. The .qcow2 files will be used to start virtual machines using the runvm.sh script.
- Start the virtual machines using bash runvm.sh.
- The username for connecting to the nested VMs is student and the password is student.
$ # change the working dir
$ cd ~/work
$ # download the archive
$ wget https://repository.grid.pub.ro/cs/scgc/laboratoare/lab-eos.zip
$ unzip lab-eos.zip
$ # start VMs; it may take a while
$ bash runvm.sh
$ # check if the VMs booted
$ virsh net-dhcp-leases labvms
Five virtual machines will be created. Add them to the /etc/hosts file on the host so that you can refer to the VMs by name instead of by their IP addresses:
192.168.100.11 fst-1
192.168.100.12 fst-2
192.168.100.13 fst-3
192.168.100.14 mgm
192.168.100.15 qdb
EOS Intro
EOS is a distributed disk storage system developed to meet the demanding data requirements of physics experiments at CERN. It provides low-latency remote access to stored data, making it well-suited for large-scale physics analysis. Designed to efficiently manage multi-PB file namespaces, EOS serves as the primary storage system at the CERN IT data center, as well as at numerous sites across the Worldwide LHC Computing Grid (WLCG).
EOS Architecture
From an architectural point of view, EOS is divided into metadata and data storage components. An EOS instance is composed of three core services:
- MGM: The manager service responsible for handling the metadata of all files and directories in an EOS instance. It serves as the entry point for external clients, handling authentication and authorization. It provides clients with a hierarchical view of the stored data and, during read and write operations, translates logical file paths into physical locations on the data nodes, redirecting clients accordingly. Additionally, the MGM oversees background tasks such as load balancing and the overall coordination of storage nodes.
- FST: Manages the physical storage. An EOS instance can have multiple FSTs, each able to manage several disks.
- QuarkDB (QDB): Database that offers persistent storage for metadata of all files and directories on the EOS instance, with the MGM caching the recently accessed entries.
To interact with an EOS instance, EOS provides a command-line interface called the EOS console, which we will also use during this lab to manage our own EOS instances.

EOS provides three conceptual views of the storage space:
- Filesystem (fs) view: contains all filesystems that store data.
- Node view: groups filesystems by their hosting FST nodes.
- Group view: organizes filesystems into scheduling groups.
We will interact with these three views when setting up the EOS instance.
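Each view has a corresponding console command, which we will use once the instance is up (a quick preview; run on the MGM, and note that the output depends on the instance state):
[root@mgm ~]$ eos fs ls     # filesystem view
[root@mgm ~]$ eos node ls   # node view
[root@mgm ~]$ eos group ls  # group view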

Setting up an EOS instance
Since EOS is a distributed storage system, the setup script creates a cluster consisting of five AlmaLinux 9 machines: mgm, qdb, fst-1, fst-2, and fst-3. These nodes will host the services required for our EOS deployment.
Each core EOS service (MGM, QDB, and FST) runs as a daemon on its dedicated machine. For a correct configuration, the services must be initialized in the following order: QDB, MGM, and finally the FSTs.
Before proceeding with the EOS setup, ensure that all cluster hostnames are listed in /etc/hosts on every machine. This is critical for proper name resolution and communication between nodes.
EOS requires each node to have a domain in its hostname (e.g., mgm.spd.ro, not just mgm), and each hostname must resolve to the node's network IP address, not to 127.0.0.1.
On each machine, configure /etc/hosts as follows:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.100.14 mgm.spd.ro
192.168.100.15 qdb.spd.ro
192.168.100.11 fst-1.spd.ro
192.168.100.12 fst-2.spd.ro
192.168.100.13 fst-3.spd.ro
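To append these entries in one step, you can use a heredoc (a minimal sketch; run as root on every machine, the localhost lines are normally already present):
[root@mgm ~]$ cat >> /etc/hosts <<'EOF'
192.168.100.14 mgm.spd.ro
192.168.100.15 qdb.spd.ro
192.168.100.11 fst-1.spd.ro
192.168.100.12 fst-2.spd.ro
192.168.100.13 fst-3.spd.ro
EOF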
Also set the hostname on each node:
# On MGM node:
[root@mgm ~]$ hostnamectl set-hostname mgm.spd.ro
# On QDB node:
[root@qdb ~]$ hostnamectl set-hostname qdb.spd.ro
# On each FST node:
[root@fst-1 ~]$ hostnamectl set-hostname fst-1.spd.ro
# Repeat for fst-2.spd.ro and fst-3.spd.ro
# Verify the configuration on each node:
[root@mgm ~]$ hostname -i
192.168.100.14
# The output must be the network IP (e.g., `192.168.100.14`), not `127.0.0.1`.
In this setup, we will use the eos daemon command to configure and manage the EOS services.
QDB Setup
The QDB node is responsible for maintaining the database that stores the file metadata of the EOS instance, including details such as file location, attributes, and ownership.
To allow the other EOS services to communicate with the QDB service, make sure that TCP port 7777 is open on the QDB host:
[root@qdb ~]$ firewall-cmd --permanent --add-port=7777/tcp
success
[root@qdb ~]$ firewall-cmd --reload
success
We will start by generating the default QDB configuration file:
[root@qdb ~]$ eos daemon config qdb
# Config file at: /etc/eos/config/qdb/qdb
This configuration file /etc/eos/config/qdb/qdb defines the QuarkDB service for the EOS cluster. In our setup, the default configuration is ready to use and does not need any changes.
Let’s inspect its contents:
[root@qdb ~]$ cat /etc/eos/config/qdb/qdb
[sysconfig]
# ----------------------
# name of this QDB node
# ----------------------
QDB_HOST=${SERVER_HOST}
# defaults
QDB_PORT=7777
QDB_CLUSTER_ID=${INSTANCE_NAME}
QDB_NODE=${QDB_HOST}:${QDB_PORT}
QDB_NODES=${QDB_HOST}:${QDB_PORT}
QDB_PATH=/var/lib/qdb
[init]
test -d ${QDB_PATH} || quarkdb-create --path ${QDB_PATH} --clusterID ${QDB_CLUSTER_ID} --nodes ${QDB_NODES}
chown -R daemon:daemon ${QDB_PATH}
[qdb:xrootd:qdb]
xrd.port ${QDB_PORT}
xrd.protocol redis:${QDB_PORT} libXrdQuarkDB.so
redis.database /var/lib/qdb
redis.mode raft
redis.myself ${QDB_NODE}
A few important points to pay attention to in the config file:
- Service parameters such as host, port, cluster ID, and storage path for the QDB service.
- Initialization script to create the database directory if it does not yet exist and to set the appropriate permissions. The directory must be owned by the daemon user.
- Raft mode (redis.mode raft): QuarkDB relies on the Raft consensus algorithm to ensure that all metadata changes are consistently replicated across QDB nodes. In a multi-node deployment, one node acts as the leader while others follow, providing fault tolerance and consistency. In this single-node test setup, Raft is still enabled, but the node serves as both leader and follower. In production, multiple QDB nodes are recommended to fully leverage the high-availability design of Raft.
- Redis compatibility: The QDB database can be queried and interacted with using the standard Redis protocol. This allows the other EOS services, MGM and FST, to communicate with the metadata database efficiently.
Every instance must have a unique, instance-private shared secret. The following command creates a local file /etc/eos.keytab storing the instance-specific shared secret needed by the MGM, FST, and MQ services:
[root@qdb ~]$ eos daemon sss recreate
Now everything is set up to run the QDB service:
[root@qdb ~]$ eos daemon run qdb
EOS_USE_MQ_ON_QDB=1
EOS_XROOTD=/opt/eos/xrootd/
GEO_TAG=local
INSTANCE_NAME=eosdev
LD_LIBRARY_PATH=/opt/eos/xrootd//lib64:/opt/eos/grpc/lib64
LD_PRELOAD=/usr/lib64/libjemalloc.so
QDB_CLUSTER_ID=eosdev
QDB_HOST=qdb.spd.ro
QDB_NODE=qdb.spd.ro:7777
QDB_NODES=qdb.spd.ro:7777
QDB_PATH=/var/lib/qdb
QDB_PORT=7777
SERVER_HOST=qdb.spd.ro
# ---------------------------------------
# ------------- i n i t -----------------
# ---------------------------------------
# run: mkdir -p /var/run/eos/
# run: chown daemon:root /var/run/eos/
# run: mkdir -p /var/cache/eos/
# run: chown daemon:root /var/cache/eos/
# run: if [ -e /etc/eos.keytab ]; then chown daemon /etc/eos.keytab ; chmod 400 /etc/eos.keytab ; fi
# run: mkdir -p /var/eos/md /var/eos/report
# run: chmod 755 /var/eos /var/eos/report
# run: mkdir -p /var/spool/eos/core/mgm /var/spool/eos/core/mq /var/spool/eos/core/fst /var/spool/eos/core/qdb /var/spool/eos/admin
# run: mkdir -p /var/log/eos
# run: chown -R daemon /var/spool/eos
# run: find /var/log/eos -maxdepth 1 -type d -exec chown daemon {} \;
# run: find /var/eos/ -maxdepth 1 -mindepth 1 -not -path "/var/eos/fs" -not -path "/var/eos/fusex" -type d -exec chown -R daemon {} \;
# run: chmod -R 775 /var/spool/eos
# run: mkdir -p /var/eos/auth /var/eos/stage
# run: chown daemon /var/eos/auth /var/eos/stage
# run: setfacl -m default:u:daemon:r /var/eos/auth/
# run: test -d ${QDB_PATH} || quarkdb-create --path ${QDB_PATH} --clusterID ${QDB_CLUSTER_ID} --nodes ${QDB_NODES}
# run: chown -R daemon:daemon ${QDB_PATH}
# ---------------------------------------
# ------------- x r o o t d ------------
# ---------------------------------------
# running config file: /var/run/eos/xrd.cf.qdb
# ---------------------------------------
xrd.port 7777
xrd.protocol redis:7777 libXrdQuarkDB.so
redis.database /var/lib/qdb
redis.mode raft
redis.myself qdb.spd.ro:7777
redis.password_file /etc/eos.keytab
#########################################
The command above shows the environment variables, initialization steps, and configuration details used to start the QuarkDB service.
The log file located at /var/log/eos/qdb/xrdlog.qdb contains detailed startup information and potential error messages. Let’s inspect it to confirm that the QDB service has launched successfully:
[root@qdb ~]$ less +G /var/log/eos/qdb/xrdlog.qdb
251005 09:46:48 24411 Starting on Linux 5.14.0-503.14.1.el9_5.x86_64
251005 09:46:48 24411 eos-qdb -n qdb -c /var/run/eos/xrd.cf.qdb -l /var/log/eos/xrdlog.qdb -R daemon -k fifo -s /var/run/eos/xrd.qdb.qdb.pid
Copr. 2004-2012 Stanford University, xrd version 5.8.4
++++++ eos-qdb qdb@qdb.spd.ro initialization started.
Config using configuration file /var/run/eos/xrd.cf.qdb
=====> xrd.port 7777
=====> xrd.protocol redis:7777 libXrdQuarkDB.so
Config maximum number of connections restricted to 524288
Config maximum number of threads restricted to 5569
Plugin protocol libXrdQuarkDB-5.so not found; falling back to using libXrdQuarkDB.so
Plugin Unable to find symbol XrdgetProtocolPort in protocol libXrdQuarkDB.so
Copr. 2012 Stanford University, xroot protocol 5.1.0 version 5.8.4
++++++ xroot protocol initialization started.
[...]
------ quarkdb protocol plugin initialization completed.
------ eos-qdb qdb@qdb.spd.ro:7777 initialization completed.
[...]
Once the QDB service is running successfully, open another terminal and check its status using the Redis CLI:
[root@qdb ~]$ redis-cli -p 7777 raft-info
1) TERM 1
2) LOG-START 0
3) LOG-SIZE 159
4) LEADER qdb.spd.ro:7777
5) CLUSTER-ID eosdev
6) COMMIT-INDEX 158
7) LAST-APPLIED 158
8) BLOCKED-WRITES 0
9) LAST-STATE-CHANGE 642 (10 minutes, 42 seconds)
10) ----------
11) MYSELF qdb.spd.ro:7777
12) VERSION 5.3.22.1
13) STATUS LEADER
14) NODE-HEALTH GREEN
15) JOURNAL-FSYNC-POLICY sync-important-updates
16) ----------
17) MEMBERSHIP-EPOCH 0
18) NODES qdb.spd.ro:7777
19) OBSERVERS
20) QUORUM-SIZE 1
The output confirms that the QDB node is healthy, acting as the Raft leader, and that the cluster is functioning.
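Since QuarkDB implements a subset of the Redis protocol, other standard Redis commands work as well; a simple liveness probe, for example, should answer PONG:
[root@qdb ~]$ redis-cli -p 7777 ping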
MGM Setup
Next, we will launch the MGM service, the manager node of the EOS instance. We will also use this node to interact with the EOS instance.
To enable communication between QDB and MGM, copy the /etc/eos.keytab file created on the QDB node to the MGM node.
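One way to copy it (a sketch, assuming root SSH access between the nodes; otherwise transfer it via the student account and move it into place with sudo):
[root@qdb ~]$ scp /etc/eos.keytab root@mgm.spd.ro:/etc/eos.keytab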
Ensure that TCP port 1094 is open on the MGM host to allow the other EOS services to communicate with the MGM service:
[root@mgm ~]$ firewall-cmd --permanent --add-port=1094/tcp
success
[root@mgm ~]$ firewall-cmd --reload
success
Generate the default configuration file for the MGM service:
[root@mgm ~]$ eos daemon config mgm
# Config file at: /etc/eos/config/mgm/mgm
Edit /etc/eos/config/mgm/mgm to set the correct hostname and port of the QDB service:
[root@mgm ~]$ grep mgmofs.qdbcluster /etc/eos/config/mgm/mgm
mgmofs.qdbcluster qdb.spd.ro:7777
Let's inspect the MGM config file:
[root@mgm ~]$ cat /etc/eos/config/mgm/mgm
# ------------------------------------------------------------ #
[mgm:xrootd:mgm]
# ------------------------------------------------------------ #
###########################################################
xrootd.fslib libXrdEosMgm.so
xrootd.seclib libXrdSec.so
xrootd.async off nosf
xrootd.chksum adler32
###########################################################
xrd.sched mint 8 maxt 256 idle 64
###########################################################
all.export / nolock
all.role manager
###########################################################
oss.fdlimit 16384 32768
###########################################################
# UNIX authentication
sec.protocol unix
# SSS authentication
sec.protocol sss -c /etc/eos.keytab -s /etc/eos.keytab
###########################################################
sec.protbind [::ffff:127.0.0.1] unix sss
sec.protbind localhost.localdomain unix sss
sec.protbind localhost unix sss
sec.protbind * only ${KRB5} ${GSI} sss unix
###########################################################
[...]
#-------------------------------------------------------------------------------
# Set the namespace plugin implementation
#-------------------------------------------------------------------------------
mgmofs.nslib /usr/lib64/libEosNsQuarkdb.so
# Quarkdb custer configuration used for the namespace
mgmofs.qdbcluster qdb.spd.ro:7777
mgmofs.qdbpassword_file /etc/eos.keytab
[...]
The configuration file above defines various settings for the MGM service, including MGM-specific variables, cluster authentication mechanisms, and the QDB connection details.
Now that the MGM configuration is set, let’s start the MGM service:
[root@mgm ~]$ eos daemon run mgm
DAEMON_COREFILE_LIMIT=unlimited
EOS_ALLOW_SAME_HOST_IN_GROUP=1
EOS_AUTOLOAD_CONFIG=default
EOS_BROKER_URL=root://localhost:1097//eos/
EOS_GEOTAG=local
EOS_HTTP_CONNECTION_MEMORY_LIMIT=4194304
EOS_HTTP_THREADPOOL=epoll
EOS_HTTP_THREADPOOL_SIZE=16
EOS_INSTANCE_NAME=eosdev
EOS_MGM_ALIAS=mgm.spd.ro
EOS_MGM_FUSEX_MAX_CHILDREN=262144
EOS_MGM_HOST=mgm.spd.ro
EOS_MGM_HOST_TARGET=mgm.spd.ro
EOS_MGM_HTTP_PORT=8000
EOS_MGM_LISTING_CACHE=0
EOS_MGM_MASTER1=mgm.spd.ro
EOS_MGM_MASTER2=mgm.spd.ro
EOS_NO_STACKTRACE=1
EOS_NS_ACCOUNTING=1
EOS_START_SYNC_SEPARATELY=1
EOS_SYNCTIME_ACCOUNTING=1
EOS_USE_MQ_ON_QDB=1
EOS_UTF8=""
EOS_XROOTD=/opt/eos/xrootd/
GEO_TAG=local
GSI=
INSTANCE_NAME=eosdev
KRB5=
KRB5RCACHETYPE=none
LD_LIBRARY_PATH=/opt/eos/xrootd//lib64:/opt/eos/grpc/lib64
LD_PRELOAD=/usr/lib64/libjemalloc.so
SERVER_HOST=mgm.spd.ro
XDG_CACHE_HOME=/var/cache/eos/
# ---------------------------------------
# ------------- i n i t -----------------
# ---------------------------------------
# run: mkdir -p /var/run/eos/
# run: chown daemon:root /var/run/eos/
# run: mkdir -p /var/cache/eos/
# run: chown daemon:root /var/cache/eos/
# run: if [ -e /etc/eos.keytab ]; then chown daemon /etc/eos.keytab ; chmod 400 /etc/eos.keytab ; fi
# run: mkdir -p /var/eos/md /var/eos/report
# run: chmod 755 /var/eos /var/eos/report
# run: mkdir -p /var/spool/eos/core/mgm /var/spool/eos/core/mq /var/spool/eos/core/fst /var/spool/eos/core/qdb /var/spool/eos/admin
# run: mkdir -p /var/log/eos
# run: chown -R daemon /var/spool/eos
# run: find /var/log/eos -maxdepth 1 -type d -exec chown daemon {} \;
# run: find /var/eos/ -maxdepth 1 -mindepth 1 -not -path "/var/eos/fs" -not -path "/var/eos/fusex" -type d -exec chown -R daemon {} \;
# run: chmod -R 775 /var/spool/eos
# run: mkdir -p /var/eos/auth /var/eos/stage
# run: chown daemon /var/eos/auth /var/eos/stage
# run: setfacl -m default:u:daemon:r /var/eos/auth/
# ---------------------------------------
# ------------- x r o o t d ------------
# ---------------------------------------
# running config file: /var/run/eos/xrd.cf.mgm
# ---------------------------------------
xrootd.fslib libXrdEosMgm.so
xrootd.seclib libXrdSec.so
xrootd.async off nosf
xrootd.chksum adler32
xrd.sched mint 8 maxt 256 idle 64
all.export / nolock
all.role manager
oss.fdlimit 16384 32768
sec.protocol unix
sec.protocol sss -c /etc/eos.keytab -s /etc/eos.keytab
sec.protbind [::ffff:127.0.0.1] unix sss
sec.protbind localhost.localdomain unix sss
sec.protbind localhost unix sss
sec.protbind * only sss unix
mgmofs.fs /
mgmofs.targetport 1095
mgmofs.broker root://localhost:1097//eos/
mgmofs.instance eosdev
mgmofs.metalog /var/eos/md
mgmofs.txdir /var/eos/tx
mgmofs.authdir /var/eos/auth
mgmofs.archivedir /var/eos/archive
mgmofs.qosdir /var/eos/qos
mgmofs.reportstorepath /var/eos/report
mgmofs.autoloadconfig default
mgmofs.qoscfg /var/eos/qos/qos.conf
mgmofs.auththreads 64
mgmofs.authport 15555
mgmofs.authlocal 1
mgmofs.fstgw someproxy.cern.ch:3001
mgmofs.nslib /usr/lib64/libEosNsQuarkdb.so
mgmofs.qdbcluster qdb.spd.ro:7777
mgmofs.qdbpassword_file /etc/eos.keytab
#########################################
Register objects provided by NsQuarkdbPlugin ...
Using the eos ns command, let’s inspect the namespace of the EOS instance. The namespace is active, but no read or write operations have been performed yet:
[root@mgm ~]$ eos ns | head
# ------------------------------------------------------------------------------------
# Namespace Statistics
# ------------------------------------------------------------------------------------
ALL Files 7 [booted] (0s)
ALL Directories 18
ALL Total boot time 1 s
ALL Contention write: 0.00 % read:0.00 %
# ------------------------------------------------------------------------------------
ALL Replication is_master=true master_id=mgm.spd.ro:1094
# ------------------------------------------------------------------------------------
To verify that the MGM service started successfully, inspect the log file /var/log/eos/mgm/xrdlog.mgm for initialization messages:
[root@mgm ~]$ less +G /var/log/eos/mgm/xrdlog.mgm
251006 11:23:28 24249 Starting on Linux 5.14.0-503.14.1.el9_5.x86_64
251006 11:23:28 24249 eos-mgm -n mgm -c /var/run/eos/xrd.cf.mgm -l /var/log/eos/xrdlog.mgm -R daemon -s /var/run/eos/xrd.mgm.mgm.pid
Copr. 2004-2012 Stanford University, xrd version 5.8.4
++++++ eos-mgm mgm@mgm.spd.ro initialization started.
Config using configuration file /var/run/eos/xrd.cf.mgm
=====> xrd.sched mint 8 maxt 256 idle 64
Config maximum number of connections restricted to 524288
Config maximum number of threads restricted to 5569
Copr. 2012 Stanford University, xroot protocol 5.1.0 version 5.8.4
++++++ xroot protocol initialization started.
=====> xrootd.fslib libXrdEosMgm.so
=====> xrootd.seclib libXrdSec.so
=====> xrootd.async off nosf
=====> xrootd.chksum adler32
=====> all.export / nolock
Config exporting /
Plugin loaded secprot 5.8.4 from seclib libXrdSec-5.so
[...]
------ xroot protocol initialization completed.
------ eos-mgm mgm@mgm.spd.ro:1094 initialization completed.
[...]
Interacting with EOS
Now that our MGM node is up and running, we can interact with our EOS instance through a console-like interface that provides access to the full EOS command set.
To open the EOS console, simply run the command eos and when the console starts, you’ll see a banner similar to this:
[root@mgm ~]$ eos
# ---------------------------------------------------------------------------
# EOS Copyright (C) 2011-2025 CERN/Switzerland
# This program comes with ABSOLUTELY NO WARRANTY; for details type `license'.
# This is free software, and you are welcome to redistribute it
# under certain conditions; type `license' for details.
# ---------------------------------------------------------------------------
EOS_INSTANCE=eosdev
EOS_SERVER_VERSION=5.3.21 EOS_SERVER_RELEASE=1
EOS_CLIENT_VERSION=5.3.21 EOS_CLIENT_RELEASE=1
EOS Console [root://localhost] |/>
We are currently using the root user, so the console starts with administrative privileges, giving access to all EOS management commands. A regular user should start the console with the sudo prefix; without it, only a limited set of commands is available.
You can explore available commands by pressing TAB, which will display the complete list of commands. Let's try a few basic commands:
EOS Console [root://localhost] |/>
.q archive chmod convert df fileinfo fusex info ln motd oldfind rclone rm sched stat touch whoami
? attr chown cp du find geosched inspector ls mv pwd reconnect rmdir scitoken status tracker
access backup clear daemon evict fs group io map newfind qos recycle role silent test version
accounting cat config debug exit fsck health json member node quit register route space timing vid
acl cd console devices file fuse help license mkdir ns quota report rtlog squash token who
EOS Console [root://localhost] |/> whoami
Virtual Identity: uid=0 (0,3,65534) gid=0 (0,4,65534) [authz:sss] sudo* host=localhost domain=localdomain
To check out detailed usage information for any command, add the -h flag:
EOS Console [root://localhost] |/> attr -h
'[eos] attr ..' provides the extended attribute interface for directories in EOS.
Usage: attr [OPTIONS] ls|set|get|rm ...
Options:
attr [-r] ls <identifier> :
: list attributes of path
-r : list recursive on all directory children
attr [-r] set [-c] <key>=<value> <identifier> :
: set attributes of path (-r : recursive) (-c : only if attribute does not exist already)
attr [-r] set default=replica|raiddp|raid5|raid6|archive|qrain <identifier> :
: set attributes of path (-r recursive) to the EOS defaults for replicas, dual-parity-raid (4+2), raid-6 (4+2) or archive layouts (5+3).
-r : set recursive on all directory children
attr [-r] [-V] get <key> <identifier> :
: get attributes of path (-r recursive)
-r : get recursive on all directory children
-V : only print the value
attr [-r] rm <key> <identifier> :
: delete attributes of path (-r recursive)
-r : delete recursive on all directory children
attr [-r] link <origin> <identifier> :
: link attributes of <origin> under the attributes of <identifier> (-r recursive)
-r : apply recursive on all directory children
attr [-r] unlink <identifier> :
: remove attribute link of <identifier> (-r recursive)
-r : apply recursive on all directory children
attr [-r] fold <identifier> :
: fold attributes of <identifier> if an attribute link is defined (-r recursive)
all attributes which are identical to the origin-link attributes are removed locally
-r : apply recursive on all directory children
You can also execute EOS commands directly from the shell without entering the interactive console. Simply prefix each command with eos, for example:
[root@mgm ~]$ eos whoami
Virtual Identity: uid=0 (0,3,65534) gid=0 (0,4,65534) [authz:sss] sudo* host=localhost domain=localdomain
FSTs Setup
The FSTs are the nodes responsible for storing the actual physical data. In our EOS deployment, we will configure three FST nodes.
To enable communication between the FSTs and the other services, copy the /etc/eos.keytab file created on the QDB node to each FST node.
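For example, pushing the keytab from the QDB node to all three FSTs (a sketch, assuming root SSH access between the nodes):
[root@qdb ~]$ for h in fst-1 fst-2 fst-3; do scp /etc/eos.keytab root@$h.spd.ro:/etc/eos.keytab; done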
Before proceeding to the actual FST setup, we have to create the default storage space for the EOS instance:
[root@mgm ~]$ eos space define default
info: creating space 'default'
The command above creates a logical storage space named default. After defining the space, activate it:
[root@mgm ~]$ eos space set default on
Check out the EOS space configuration:
[root@mgm ~]$ eos space ls
┌──────────┬────────────────┬────────────┬────────────┬──────┬─────────┬───────────────┬──────────────┬─────────────┬─────────────┬──────────────┬──────┬──────┬──────────┬───────────┬───────────┬──────┬────────┬───────────┬──────┬────────┬───────────┐
│type │ name│ groupsize│ groupmod│ N(fs)│ N(fs-rw)│ sum(usedbytes)│ sum(capacity)│ capacity(rw)│ nom.capacity│sched.capacity│ usage│ quota│ balancing│ threshold│ converter│ ntx│ active│ wfe│ ntx│ active│ intergroup│
└──────────┴────────────────┴────────────┴────────────┴──────┴─────────┴───────────────┴──────────────┴─────────────┴─────────────┴──────────────┴──────┴──────┴──────────┴───────────┴───────────┴──────┴────────┴───────────┴──────┴────────┴───────────┘
spaceview default 0 24 0 0 0 B 0 B 0 B 0 B 0 B 0.00 off off 20 off 0 0 off 1 0 off
The default space is listed with its group settings, usage, and capacity, all currently zero. To provide actual storage, FST nodes need to be added to the EOS deployment.
Let's register a filesystem for the /data01 and /data02 directories on each FST node, specifying for each a unique ID, the FST hostname, the service port, and the storage path; EOS automatically places each new filesystem in a scheduling group of the default space:
[root@mgm ~]$ eos fs add fs-1 fst-1.spd.ro:1095 /data01
[root@mgm ~]$ eos fs add fs-2 fst-1.spd.ro:1095 /data02
[root@mgm ~]$ eos fs add fs-3 fst-2.spd.ro:1095 /data01
[root@mgm ~]$ eos fs add fs-4 fst-2.spd.ro:1095 /data02
[root@mgm ~]$ eos fs add fs-5 fst-3.spd.ro:1095 /data01
[root@mgm ~]$ eos fs add fs-6 fst-3.spd.ro:1095 /data02
To check that the filesystems were created, run:
[root@mgm ~]# eos fs ls
┌────────────────────────┬────┬──────┬────────────────────────────────┬────────────────┬────────────────┬────────────┬──────────────┬────────────┬──────┬────────┬────────────────┐
│host │port│ id│ path│ schedgroup│ geotag│ boot│ configstatus│ drain│ usage│ active│ health│
└────────────────────────┴────┴──────┴────────────────────────────────┴────────────────┴────────────────┴────────────┴──────────────┴────────────┴──────┴────────┴────────────────┘
fst-1.spd.ro 1095 1 /data01 default.0 off nodrain 0.00
fst-1.spd.ro 1095 2 /data02 default.1 off nodrain 0.00
fst-2.spd.ro 1095 3 /data01 default.2 off nodrain 0.00
fst-2.spd.ro 1095 4 /data02 default.3 off nodrain 0.00
fst-3.spd.ro 1095 5 /data01 default.4 off nodrain 0.00
fst-3.spd.ro 1095 6 /data02 default.5 off nodrain 0.00
By default, EOS automatically assigns each newly created filesystem to its own group. Therefore, after registration EOS creates six scheduling groups: default.0 - default.5. When writing data, EOS selects a single group and stores the file only on the filesystems that belong to that group. For example, to store files with two replicas, at least two filesystems must be in the same group. To use erasure coding, six filesystems are required per group. To enable both replication and erasure coding on our EOS instance, we'll move all the filesystems into a single group, default.0:
[root@mgm ~]# for name in 2 3 4 5 6; do eos fs mv --force $name default.0; done
info: applying space config drainperiod=86400
info: applying space config graceperiod=86400
info: applying space config scan_rain_interval=2419200
info: applying space config scaninterval=604800
info: applying space config scanrate=100
success: filesystem 2 moved to group default.0
info: applying space config drainperiod=86400
info: applying space config graceperiod=86400
info: applying space config scan_rain_interval=2419200
info: applying space config scaninterval=604800
info: applying space config scanrate=100
success: filesystem 3 moved to group default.0
info: applying space config drainperiod=86400
info: applying space config graceperiod=86400
info: applying space config scan_rain_interval=2419200
info: applying space config scaninterval=604800
info: applying space config scanrate=100
success: filesystem 4 moved to group default.0
info: applying space config drainperiod=86400
info: applying space config graceperiod=86400
info: applying space config scan_rain_interval=2419200
info: applying space config scaninterval=604800
info: applying space config scanrate=100
success: filesystem 5 moved to group default.0
info: applying space config drainperiod=86400
info: applying space config graceperiod=86400
info: applying space config scan_rain_interval=2419200
info: applying space config scaninterval=604800
info: applying space config scanrate=100
success: filesystem 6 moved to group default.0
Run eos fs ls again to ensure that all the filesystems were moved into the default.0 group:
[root@mgm ~]# eos fs ls
┌────────────────────────┬────┬──────┬────────────────────────────────┬────────────────┬────────────────┬────────────┬──────────────┬────────────┬──────┬────────┬────────────────┐
│host │port│ id│ path│ schedgroup│ geotag│ boot│ configstatus│ drain│ usage│ active│ health│
└────────────────────────┴────┴──────┴────────────────────────────────┴────────────────┴────────────────┴────────────┴──────────────┴────────────┴──────┴────────┴────────────────┘
fst-1.spd.ro 1095 1 /data01 default.0 off nodrain 0.00
fst-1.spd.ro 1095 2 /data02 default.0 off nodrain 0.00
fst-2.spd.ro 1095 3 /data01 default.0 off nodrain 0.00
fst-2.spd.ro 1095 4 /data02 default.0 off nodrain 0.00
fst-3.spd.ro 1095 5 /data01 default.0 off nodrain 0.00
fst-3.spd.ro 1095 6 /data02 default.0 off nodrain 0.00
One last step: enable shared-secret authentication to allow communication between the MGM and the FST nodes, and grant the daemon user the necessary sudo privileges:
[root@mgm ~]$ eos vid enable sss
success: set vid [ eos.rgid=0 eos.ruid=0 mgm.cmd=vid mgm.subcmd=set mgm.vid.auth=sss mgm.vid.cmd=map mgm.vid.gid=0 mgm.vid.key=<key> mgm.vid.pattern=<pwd> mgm.vid.uid=0 ]
[root@mgm ~]$ eos vid set membership daemon +sudo
success: set vid [ eos.rgid=0 eos.ruid=0 mgm.cmd=vid mgm.subcmd=set mgm.vid.cmd=membership mgm.vid.key=daemon:root mgm.vid.source.uid=daemon mgm.vid.target.sudo=true ]
Now everything is set to start configuring the first FST node, fst-1.
Ensure that TCP port 1095 is open on the FST host to expose the FST service:
[root@fst-1 ~]$ firewall-cmd --permanent --add-port=1095/tcp
success
[root@fst-1 ~]$ firewall-cmd --reload
success
Generate the default FST configuration file:
[root@fst-1 ~]# eos daemon config fst
# Config file at: /etc/eos/config/fst/fst
Edit the FST configuration file /etc/eos/config/fst/fst to set the MGM alias and the QDB host and port as follows:
[root@fst-1 ~]# cat /etc/eos/config/fst/fst | head
# ------------------------------------------------------------ #
[sysconfig]
# ------------------------------------------------------------ #
EOS_XRDCP=${EOS_XROOTD}/bin/xrdcp
EOS_MGM_ALIAS=mgm.spd.ro
EOS_GEOTAG=local::geo
QDB_HOST=qdb.spd.ro
QDB_PORT=7777
Each FST node must host at least one filesystem to store data. Each filesystem requires a root directory on the node, where the physical copies of files will be stored. Let's create the directory /data01 to host the first filesystem, assign it the same unique filesystem ID used during registration on the MGM (fs-1), and set the daemon user as the owner:
[root@fst-1 ~]# mkdir /data01
[root@fst-1 ~]# echo fs-1 > /data01/.eosfsuuid
[root@fst-1 ~]# chown -R daemon:daemon /data01
[root@fst-1 ~]# ls -al /data01
total 4
drwxr-xr-x. 2 root root 24 Sep 10 09:54 .
dr-xr-xr-x. 19 root root 247 Sep 10 09:54 ..
-rw-r--r--. 1 root root 6 Sep 10 09:54 .eosfsuuid
Exercise - Add another directory for second filesystem
Follow the same steps to create a second directory /data02, which will serve as the root directory for the second filesystem hosted by fst-1. Note: The filesystem ID must match the one registered on the MGM.
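A minimal sketch of the expected steps (the UUID must be fs-2, the ID registered for fst-1.spd.ro:/data02 on the MGM):
[root@fst-1 ~]# mkdir /data02
[root@fst-1 ~]# echo fs-2 > /data02/.eosfsuuid
[root@fst-1 ~]# chown -R daemon:daemon /data02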
The first storage node is now ready to run the FST service:
[root@fst-1 ~]$ eos daemon run fst
EOS_GEOTAG=local::geo
EOS_MGM_ALIAS=mgm.spd.ro
EOS_USE_MQ_ON_QDB=1
EOS_XRDCP=/opt/eos/xrootd//bin/xrdcp
EOS_XROOTD=/opt/eos/xrootd/
GEO_TAG=local
INSTANCE_NAME=eosdev
LD_LIBRARY_PATH=/opt/eos/xrootd//lib64:/opt/eos/grpc/lib64
LD_PRELOAD=/usr/lib64/libjemalloc.so
QDB_HOST=qdb.spd.ro
QDB_PORT=7777
SERVER_HOST=fst-1.spd.ro
# ---------------------------------------
# ------------- i n i t -----------------
# ---------------------------------------
# ---------------------------------------
# ------------- u n s h a r e -----------
# ---------------------------------------
# run: mkdir -p /var/run/eos/
# run: chown daemon:root /var/run/eos/
# run: mkdir -p /var/cache/eos/
# run: chown daemon:root /var/cache/eos/
# run: if [ -e /etc/eos.keytab ]; then chown daemon /etc/eos.keytab ; chmod 400 /etc/eos.keytab ; fi
# run: mkdir -p /var/eos/md /var/eos/report
# run: chmod 755 /var/eos /var/eos/report
# run: mkdir -p /var/spool/eos/core/mgm /var/spool/eos/core/mq /var/spool/eos/core/fst /var/spool/eos/core/qdb /var/spool/eos/admin
# run: mkdir -p /var/log/eos
# run: chown -R daemon /var/spool/eos
# run: find /var/log/eos -maxdepth 1 -type d -exec chown daemon {} \;
# run: find /var/eos/ -maxdepth 1 -mindepth 1 -not -path "/var/eos/fs" -not -path "/var/eos/fusex" -type d -exec chown -R daemon {} \;
# run: chmod -R 775 /var/spool/eos
# run: mkdir -p /var/eos/auth /var/eos/stage
# run: chown daemon /var/eos/auth /var/eos/stage
# run: setfacl -m default:u:daemon:r /var/eos/auth/
# ---------------------------------------
# ------------- x r o o t d ------------
# ---------------------------------------
# running config file: /var/run/eos/xrd.cf.fst
# ---------------------------------------
xrd.network keepalive
xrd.port 1095
xrootd.fslib -2 libXrdEosFst.so
xrootd.async off nosf
xrootd.redirect mgm.spd.ro:1094 chksum
xrootd.seclib libXrdSec.so
sec.protocol unix
sec.protocol sss -c /etc/eos.keytab -s /etc/eos.keytab
sec.protbind * only unix sss
all.export / nolock
all.trace none
all.manager localhost 2131
ofs.persist off
ofs.osslib libEosFstOss.so
ofs.tpc pgm /opt/eos/xrootd//bin/xrdcp
fstofs.broker root://localhost:1097//eos/
fstofs.autoboot true
fstofs.quotainterval 10
fstofs.metalog /var/eos/md/
fstofs.filemd_handler attr
fstofs.qdbcluster qdb.spd.ro:7777
fstofs.qdbpassword_file /etc/eos.keytab
#########################################
Check the FST log file /var/log/eos/fst/xrdlog.fst to ensure that the FST setup completed:
[student@fst-1 ~]$ less +G /var/log/eos/fst/xrdlog.fst
251030 18:45:42 4228 Starting on Linux 5.14.0-570.46.1.el9_6.x86_64
251030 18:45:42 4228 eos-fst -n fst -c /var/run/eos/xrd.cf.fst -l /var/log/eos/xrdlog.fst -R daemon -s /var/run/eos/xrd.fst.fst.pid
Copr. 2004-2012 Stanford University, xrd version 5.8.4
++++++ eos-fst fst@fst-1.spd.ro initialization started.
Config using configuration file /var/run/eos/xrd.cf.fst
=====> xrd.network keepalive
=====> xrd.port 1095
251030 18:45:42 4228 XrdSetIF: Skipping duplicate private interface [::192.168.100.11]
Config maximum number of connections restricted to 524288
Config maximum number of threads restricted to 2891
Copr. 2012 Stanford University, xroot protocol 5.1.0 version 5.8.4
++++++ xroot protocol initialization started.
=====> xrootd.fslib -2 libXrdEosFst.so
=====> xrootd.async off nosf
=====> xrootd.redirect mgm:1094 chksum
=====> xrootd.seclib libXrdSec.so
=====> all.export / nolock
[...]
------ xroot protocol initialization completed.
------ eos-fst fst@fst-1.spd.ro:1095 initialization completed.
Exercise - Complete the EOS deployment
Follow the same steps on the remaining two storage nodes, fst-2 and fst-3.
Adding the storage space
Once the FST nodes are configured and running, the next step is to register their available storage space on the MGM. The /dataXY directories we previously created on each FST will serve as the root of the filesystems where EOS stores its data.
Check the list of FST nodes to confirm that they are online:
[root@mgm ~]$ eos node ls
┌──────────┬────────────────────────────────┬────────────────┬──────────┬────────────┬────────────────┬─────┐
│type │ hostport│ geotag│ status│ activated│ heartbeatdelta│ nofs│
└──────────┴────────────────────────────────┴────────────────┴──────────┴────────────┴────────────────┴─────┘
nodesview fst-1.spd.ro:1095 local::geo online ??? 0 2
nodesview fst-2.spd.ro:1095 local::geo online ??? 0 2
nodesview fst-3.spd.ro:1095 local::geo online ??? 0 2
Check the list of registered filesystems to confirm that they were correctly created:
[root@mgm ~]$ eos fs ls
┌────────────────────────┬────┬──────┬────────────────────────────────┬────────────────┬────────────────┬────────────┬──────────────┬────────────┬──────┬────────┬────────────────┐
│host │port│ id│ path│ schedgroup│ geotag│ boot│ configstatus│ drain│ usage│ active│ health│
└────────────────────────┴────┴──────┴────────────────────────────────┴────────────────┴────────────────┴────────────┴──────────────┴────────────┴──────┴────────┴────────────────┘
fst-1.spd.ro:1095 1095 1 /data01 default.0 local::geo down off nodrain 26.20 no smartctl
fst-1.spd.ro:1095 1095 2 /data02 default.0 local::geo down off nodrain 26.20 no smartctl
fst-2.spd.ro:1095 1095 3 /data01 default.0 local::geo down off nodrain 26.16 no smartctl
fst-2.spd.ro:1095 1095 4 /data02 default.0 local::geo down off nodrain 26.16 no smartctl
fst-3.spd.ro:1095 1095 5 /data01 default.0 local::geo down off nodrain 26.16 no smartctl
fst-3.spd.ro:1095 1095 6 /data02 default.0 local::geo down off nodrain 26.16 no smartctl
Notice that the FST nodes are not activated and filesystems are down. Boot all registered filesystems:
[root@mgm ~]# eos -j fs ls | jq ".result[].id" | xargs -i eos fs boot {}
Set each filesystem to read-write mode so that files can be stored and retrieved:
[root@mgm ~]# eos -j fs ls | jq ".result[].id" | xargs -i eos fs config {} configstatus=rw
Activate the FST nodes:
[root@mgm ~]# eos -j node ls | jq ".result[].hostport" | xargs -i eos node set {} on
Check the status of all filesystems again. They should now appear as booted, in rw mode, and online:
[root@mgm ~]# eos fs ls
┌────────────────────────┬────┬──────┬────────────────────────────────┬────────────────┬────────────────┬────────────┬──────────────┬────────────┬──────┬────────┬────────────────┐
│host │port│ id│ path│ schedgroup│ geotag│ boot│ configstatus│ drain│ usage│ active│ health│
└────────────────────────┴────┴──────┴────────────────────────────────┴────────────────┴────────────────┴────────────┴──────────────┴────────────┴──────┴────────┴────────────────┘
fst-1.spd.ro 1095 1 /data01 default.0 local::geo booted rw nodrain 26.56 online no smartctl
fst-1.spd.ro 1095 2 /data02 default.0 local::geo booted rw nodrain 26.56 online no smartctl
fst-2.spd.ro 1095 3 /data01 default.0 local::geo booted rw nodrain 26.50 online no smartctl
fst-2.spd.ro 1095 4 /data02 default.0 local::geo booted rw nodrain 26.50 online no smartctl
fst-3.spd.ro 1095 5 /data01 default.0 local::geo booted rw nodrain 26.79 online no smartctl
fst-3.spd.ro 1095 6 /data02 default.0 local::geo booted rw nodrain 26.79 online no smartctl
At the same time, check that the FST nodes are activated:
[root@mgm ~]# eos node ls
┌──────────┬────────────────────────────────┬────────────────┬──────────┬────────────┬────────────────┬─────┐
│type │ hostport│ geotag│ status│ activated│ heartbeatdelta│ nofs│
└──────────┴────────────────────────────────┴────────────────┴──────────┴────────────┴────────────────┴─────┘
nodesview fst-1.spd.ro:1095 local::geo online on 1 2
nodesview fst-2.spd.ro:1095 local::geo online on 0 2
nodesview fst-3.spd.ro:1095 local::geo online on 1 2
Finally, we can check that the storage space provided by the FST nodes is available and visible to the MGM, and that all six filesystems are configured as rw:
[root@mgm ~]# eos space ls
┌──────────┬────────────────┬────────────┬────────────┬──────┬─────────┬───────────────┬──────────────┬─────────────┬─────────────┬──────────────┬──────┬──────┬──────────┬───────────┬───────────┬──────┬────────┬───────────┬──────┬────────┬───────────┐
│type │ name│ groupsize│ groupmod│ N(fs)│ N(fs-rw)│ sum(usedbytes)│ sum(capacity)│ capacity(rw)│ nom.capacity│sched.capacity│ usage│ quota│ balancing│ threshold│ converter│ ntx│ active│ wfe│ ntx│ active│ intergroup│
└──────────┴────────────────┴────────────┴────────────┴──────┴─────────┴───────────────┴──────────────┴─────────────┴─────────────┴──────────────┴──────┴──────┴──────────┴───────────┴───────────┴──────┴────────┴───────────┴──────┴────────┴───────────┘
spaceview default 0 24 6 6 13.52 GB 51.13 GB 51.13 GB 0 B 37.61 GB 26.45 off off 20 ??? 0 0 off 1 0 off
If the number of read-write filesystems is not 6, run eos space set default on again and wait for the configuration to reload.
Writing and reading files in EOS
EOS organizes its storage space across multiple filesystems, which are grouped together into storage groups. Each FST node can host one or more of these filesystems. From the user's perspective, EOS exposes a unified logical file hierarchy, but internally, every file path maps to a specific physical location on an FST node.
EOS supports different file layouts that dictate how data is stored across nodes. The layout is set per directory, allowing data to be managed according to its importance and performance needs.
1. Replica layout
The Replica layout in EOS stores complete copies of a file across multiple filesystems, based on a configurable replication factor. This approach ensures data redundancy and high availability, allowing file access even if one or more copies become unavailable.
When a write request is received by EOS, it creates multiple copies of the file according to the replication factor and stores them across different filesystems. These filesystems are selected from the same scheduling group based on predefined balancing rules, such as available space and node load.
By default, EOS creates a base directory at /eos/dev/. Before writing any data to the EOS instance, let's create a directory and define its replication policy with a replication factor of 2, ensuring that each file stored inside it is duplicated on two filesystems:
[root@mgm ~]# eos mkdir /eos/dev/replica # create the directory
[root@mgm ~]# eos attr -r set default=replica /eos/dev/replica # set replica layout
[root@mgm ~]# eos attr -r set sys.forced.nstripes=2 /eos/dev/replica # set replication factor to 2
[root@mgm ~]# eos attr ls /eos/dev/replica # check directory settings
sys.eos.btime="1761832881.81783206"
sys.forced.blocksize="4k"
sys.forced.checksum="adler"
sys.forced.layout="replica"
sys.forced.nstripes="2"
sys.forced.space="default"
This configuration ensures that every file stored in /eos/dev/replica is duplicated on two filesystems.
Let’s create a file and upload it to the /eos/dev/replica directory in EOS:
[root@mgm ~]# dd if=/dev/urandom of=my_file bs=1M count=10
[root@mgm ~]# eos cp ~/my_file /eos/dev/replica
[eoscp] my_file Total 10.00 MB |====================| 100.00 % [48.5 MB/s]
[eos-cp] copied 1/1 files and 10.49 MB in 0.30 seconds with 35.33 MB/s
Verify that the file exists by listing the contents of the /eos/dev/replica directory:
[root@mgm ~]# eos ls /eos/dev/replica
my_file
The directory /eos/dev/replica is part of the logical namespace that EOS exposes to users; internally, EOS maps each logical file entry to one or more physical replicas stored across different FST nodes.
When a user writes to or reads from /eos/dev/replica/, they do not need to know where the actual data resides. This mapping between the logical file and its physical replicas is maintained in the file metadata, which can be inspected using the eos fileinfo command:
[root@mgm ~]# eos fileinfo /eos/dev/replica/my_file
File: '/eos/dev/replica/my_file' Flags: 0640
Size: 10485760
Status: healthy
Modify: Mon Nov 10 23:37:34 2025 Timestamp: 1762810654.531039000
Change: Mon Nov 10 23:37:34 2025 Timestamp: 1762810654.322678393
Access: Mon Nov 10 23:37:34 2025 Timestamp: 1762810654.322679714
Birth: Mon Nov 10 23:37:34 2025 Timestamp: 1762810654.322678393
CUid: 0 CGid: 0 Fxid: 00000010 Fid: 16 Pid: 17 Pxid: 00000011
XStype: adler XS: b1 29 e8 5a ETAGs: "4294967296:b129e85a"
Layout: replica Stripes: 2 Blocksize: 4k LayoutId: 00100012 Redundancy: d1::t0
#Rep: 2
┌───┬──────┬────────────────────────┬────────────────┬────────────────┬──────────┬──────────────┬────────────┬────────┬────────────────────────┐
│no.│ fs-id│ host│ schedgroup│ path│ boot│ configstatus│ drain│ active│ geotag│
└───┴──────┴────────────────────────┴────────────────┴────────────────┴──────────┴──────────────┴────────────┴────────┴────────────────────────┘
0 1 fst-1.spd.ro default.0 /data01 booted rw nodrain online local::geo
1 2 fst-1.spd.ro default.0 /data02 booted rw nodrain online local::geo
*******
The metadata of a file includes details such as the logical filename, the unique file identifier (Fxid), and the replication factor (#Rep).
In this example, EOS created two physical replicas of the file, stored on the filesystems with fs-id 1 and fs-id 2, both located on the fst-1 node.
Exercise
- Write other files in the /eos/dev/replica directory. Inspect them using the eos fileinfo command and observe which filesystems store the replicas.
- On each storage node (fst-1, fst-2, and fst-3), inspect the /dataXY directories to see how EOS organizes physical files.
- Create a new EOS directory and set the replication factor to 3. Populate the directory with files and use eos fileinfo to observe how the files are distributed across the six filesystems.
Hint: Add the --fullpath flag to the eos fileinfo command to get the complete physical path of each replica on the FST nodes.
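For example, for the file uploaded earlier (run on the MGM; output omitted):
[root@mgm ~]# eos fileinfo /eos/dev/replica/my_file --fullpath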
Reading replica files
When a client wants to read a file, the MGM retrieves the file metadata and points the client to one of the FSTs that stores a copy of the file.
Let's download the file we previously stored on our EOS instance:
[root@mgm ~]# eos cp /eos/dev/replica/my_file ~/my_file_from_eos
[eoscp] my_file Total 0.00 MB |====================| 100.00 % [209.7 MB/s]
[eos-cp] copied 1/1 files and 50 MB in 0.11 seconds with 97.09 MB/s
Verify that the original file and the one retrieved from EOS are identical:
[root@mgm ~]# diff ~/my_file ~/my_file_from_eos
2. RAIN layout
RAIN (Redundant Array of Independent Nodes) is a fault-tolerant layout in EOS that uses erasure coding to protect data with lower storage overhead than full replication.
Instead of storing full copies, erasure coding splits each file into data blocks and parity blocks, which are distributed across multiple FST nodes. For example, a 4+2 configuration splits a file into 4 data blocks and 2 parity blocks, allowing the system to tolerate up to 2 missing blocks without data loss.
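As a quick sizing example: with a replication factor of 2, a 10 MB file consumes 20 MB of raw storage, whereas the same file in a 4+2 erasure-coded layout consumes 10 MB × 6/4 = 15 MB, while tolerating two lost stripes instead of a single lost replica.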
The RAIN write process involves several steps:
- The original file is divided into multiple data blocks using erasure coding algorithms.
- Parity blocks are computed from the data blocks to support reconstruction in case of failure.
- Data and parity blocks are distributed across different filesystems from the same group.
Let's create a directory /eos/dev/rain and set its layout to raid6, ensuring that each file stored inside it is split into 4 data blocks and 2 parity blocks:
[root@mgm ~]# eos mkdir /eos/dev/rain # create the directory
[root@mgm ~]# eos attr -r set default=raid6 /eos/dev/rain # set raid6 layout
[root@mgm ~]# eos attr ls /eos/dev/rain # check directory settings
sys.eos.btime="1761822399.256165363"
sys.forced.blockchecksum="crc32c"
sys.forced.blocksize="1M"
sys.forced.checksum="adler"
sys.forced.layout="raid6"
sys.forced.nstripes="6"
sys.forced.space="default"
Let’s create a file and upload it to the /eos/dev/rain directory in EOS:
[root@mgm ~]# dd if=/dev/urandom of=my_rain_file bs=1M count=10
[root@mgm ~]# eos cp ~/my_rain_file /eos/dev/rain
[eoscp] my_rain_file Total 10.00 MB |====================| 100.00 % [59.9 MB/s]]
[eos-cp] copied 1/1 files and 10.49 MB in 0.46 seconds with 22.86 MB/s
List the contents of the /eos/dev/rain directory to check that the file was uploaded:
[root@mgm ~]# eos ls /eos/dev/rain
my_rain_file
Use the eos fileinfo command to inspect the erasure coded file:
[root@mgm ~]# eos fileinfo /eos/dev/rain/my_rain_file
File: '/eos/dev/rain/my_rain_file' Flags: 0640
Size: 10485760
Status: healthy
Modify: Mon Nov 10 17:00:19 2025 Timestamp: 1762786819.895897000
Change: Mon Nov 10 17:00:19 2025 Timestamp: 1762786819.226030661
Access: Mon Nov 10 17:00:19 2025 Timestamp: 1762786819.226032180
Birth: Mon Nov 10 17:00:19 2025 Timestamp: 1762786819.226030661
CUid: 0 CGid: 0 Fxid: 0000000b Fid: 11 Pid: 18 Pxid: 00000012
XStype: adler XS: af 61 ca 06 ETAGs: "2952790016:af61ca06"
Layout: raid6 Stripes: 6 Blocksize: 1M LayoutId: 20640542 Redundancy: d3::t0
#Rep: 6
┌───┬──────┬────────────────────────┬────────────────┬────────────────┬──────────┬──────────────┬────────────┬────────┬────────────────────────┐
│no.│ fs-id│ host│ schedgroup│ path│ boot│ configstatus│ drain│ active│ geotag│
└───┴──────┴────────────────────────┴────────────────┴────────────────┴──────────┴──────────────┴────────────┴────────┴────────────────────────┘
0 1 fst-1.spd.ro default.0 /data01 rw nodrain online local::geo
1 2 fst-1.spd.ro default.0 /data02 rw nodrain online local::geo
2 6 fst-3.spd.ro default.0 /data02 rw nodrain online local::geo
3 4 fst-2.spd.ro default.0 /data02 rw nodrain online local::geo
4 3 fst-2.spd.ro default.0 /data01 rw nodrain online local::geo
5 5 fst-3.spd.ro default.0 /data01 rw nodrain online local::geo
*******
In this example, the file layout is set to raid6, meaning that EOS splits the file into six stripes and distributes them across the six available filesystems. Because EOS writes all stripes within a single selected group, we previously placed all filesystems in the same storage group to ensure our EOS instance supports erasure coded files.
Reading RAIN files
Reading from RAIN storage requires reconstructing the original file from its distributed fragments:
- The MGM identifies the locations of all required stripes.
- Using available data stripes and parity information, EOS reconstructs the original file.
- The system can read the stripes from different filesystems in parallel to improve performance.
Let's download the file we previously stored on our EOS instance:
[root@mgm ~]# eos cp /eos/dev/rain/my_rain_file ~/my_file_from_eos
[eoscp] my_rain_file Total 10.00 MB |====================| 100.00 % [9.3 MB/s]]
[eos-cp] copied 1/1 files and 10.49 MB in 1.51 seconds with 6.95 MB/s
Verify that the original file and the one retrieved from EOS are identical:
[root@mgm ~]# diff ~/my_rain_file ~/my_file_from_eos
Exercise
- Write other files in the /eos/dev/rain directory. Inspect them using the eos fileinfo command and observe which filesystems store the stripes.
- On each storage node (fst-1, fst-2, and fst-3), inspect the /dataXY directories to see how EOS organizes the stripes. Compare the size of individual stripes with the size of the original file.
- Create a new EOS directory and set a different RAIN layout (e.g., raid5). Populate the directory with files and use eos fileinfo to observe how the stripes are distributed across the six filesystems.
Hint: Add the --fullpath flag to the eos fileinfo command to display the complete physical path of each stripe on the FST nodes.
Hint: To see all RAIN layouts supported by EOS, check eos attr -h and look for the sys.forced.layout variable.
Key differences between Replica and RAIN layouts
- Storage Efficiency vs Performance: RAIN provides greater storage efficiency by storing only fragments and parity instead of full copies. However, this comes with higher computational overhead due to the need for encoding during writes and reconstruction during reads. Replica layouts, on the other hand, are less efficient in terms of storage but offer faster and simpler access to data.
- Fault Tolerance: Both provide fault tolerance, but RAIN can recover from more complex failure scenarios.
- Use Cases: Replica is ideal for frequently accessed data, while RAIN is better suited for large, less frequently accessed data.
Access Management in EOS
EOS Virtual Identities
EOS uses a VID (Virtual Identity) system to manage user authentication and authorization. When a user connects to EOS, the system maps their credentials to a virtual identity.
Every operation in EOS is performed in the context of a virtual identity. The MGM translates each user's credentials into a VID that determines which operations the user may perform.
Check your current virtual identity when using the root user:
[root@mgm ~]# eos whoami
Virtual Identity: uid=0 (0,3,65534) gid=0 (0,4,65534) [authz:sss] sudo* host=localhost domain=localdomain
The above output shows uid=0 (user ID for root), gid=0 (group ID for root), [authz:sss] (authentication method using shared secret system), and sudo* (has sudo privileges).
Meanwhile, the user student is mapped to nobody (UID 65534):
[student@mgm ~]$ eos whoami
Virtual Identity: uid=65534 (65534) gid=65534 (65534) [authz:unix] host=localhost domain=localdomain
The student user authenticates via [authz:unix] (Unix authentication) instead of sss, and is mapped to the nobody user. We'll edit this mapping later using the EOS VID configuration.
Understanding POSIX Permissions vs ACLs
When you create a directory in EOS, it gets default POSIX permissions like drwxr-xr-x (owner has full access, others can read). ACLs allow you to grant specific permissions to individual users or groups beyond what POSIX permissions allow. For a complete list of ACL rules, see the EOS ACL documentation.
Example: Granting Write Access to a Non-Sudo User
Let's demonstrate how to use ACLs to grant the user student write access to a directory by configuring a virtual identity mapping and an ACL.
First, create a test directory as root and upload a file:
[root@mgm ~]# eos mkdir /eos/dev/shared
[root@mgm ~]# eos attr set default=replica /eos/dev/shared
[root@mgm ~]# eos attr set sys.forced.nstripes=2 /eos/dev/shared
[root@mgm ~]# echo "This is a shared file" > ~/shared_file
[root@mgm ~]# eos cp ~/shared_file /eos/dev/shared/
[eoscp] shared_file Total 0.00 MB |====================| 100.00 % [0.0 MB/s]
[eos-cp] copied 1/1 files and 22 B in 0.11 seconds with 194 B/s
Check the default permissions:
[root@mgm ~]# eos ls -l /eos/dev/ | grep shared
drwxrwxr-x 1 root root 22 Nov 11 00:24 shared
The permissions drwxrwxr-x mean "others" can read (r), but not write (w).
Reading from /eos/dev/shared as user student works because the default POSIX permissions allow read access:
[student@mgm ~]$ eos ls /eos/dev/shared
shared_file
[student@mgm ~]$ eos cp /eos/dev/shared/shared_file ~/my_copy
[eoscp] shared_file Total 0.00 MB |====================| 100.00 %
Now try to write:
[student@mgm ~]$ echo "Hello from student!" > ~/my_file
[student@mgm ~]$ eos cp ~/my_file /eos/dev/shared/
Secsss (getKeyTab): Unable to open /etc/eos.keytab; permission denied
Unable to open keytab file.
Secsss (getKeyTab): Unable to open /etc/eos.keytab; permission denied
Unable to open keytab file.
error: target file open failed - errno=13 : Permission denied [[ERROR] Server responded with an error: [3010] Unable to open file /eos/dev/shared/my_shared_file; Operation not permitted
]
error: failed copying path=root://localhost//eos/dev/shared/my_file
#WARNING [eos-cp] copied 0/1 files and 0 B in 0.08 seconds with 0 B/s
The user student needs permissions for write access.
You may see Unable to open /etc/eos.keytab warnings. These can be ignored, since the student user authenticates via Unix authentication (authz:unix), not the shared secret system (sss).
Before setting an ACL, we need to ensure the student user is properly mapped to a virtual identity in EOS.
Start by enabling Unix authentication:
[root@mgm ~]# eos vid enable unix
success: set vid [ eos.rgid=0 eos.ruid=0 mgm.cmd=vid mgm.subcmd=set mgm.vid.auth=unix mgm.vid.cmd=map mgm.vid.gid=99 mgm.vid.key=<key> mgm.vid.pattern=<pwd> mgm.vid.uid=99 ]
Check the current virtual identity of the user student:
[student@mgm ~]$ eos whoami
Virtual Identity: uid=99 (99,65534) gid=99 (99) [authz:unix]
By default, EOS maps Unix-authenticated users to a non-privileged virtual identity with UID 99. Let's configure EOS VID to map the user student to virtual UID 1000 and GID 1000:
[root@mgm ~]# eos vid set map -unix "student" vuid:1000 vgid:1000
success: set vid [ eos.rgid=0 eos.ruid=0 mgm.cmd=vid mgm.subcmd=set mgm.vid.auth=unix mgm.vid.cmd=map mgm.vid.gid=1000 mgm.vid.key=<key> mgm.vid.pattern=student mgm.vid.uid=1000 ]
Check that the mapping was applied:
[student@mgm ~]$ eos whoami
Virtual Identity: uid=1000 (1000) gid=1000 (1000) [authz:unix] host=localhost domain=localdomain
Now when a user authenticates using unix authentication with the username student, EOS VID maps them to virtual UID 1000 and GID 1000.
By default, only POSIX permissions are checked in EOS. User ACL evaluation must be explicitly enabled on directories:
[root@mgm ~]# eos attr set sys.eval.useracl=1 /eos/dev/shared
[root@mgm ~]# eos attr get sys.eval.useracl /eos/dev/shared
sys.eval.useracl="1"
Now grant the student user write permissions using ACLs:
[root@mgm ~]# eos acl --user u:1000=rwx /eos/dev/shared
The --user flag sets user ACLs, u:1000 specifies the user with UID 1000, and =rwx grants read, write, and browse permissions.
Check that the ACL was correctly set:
[root@mgm ~]# eos acl --list /eos/dev/shared
# user.acl
u:1000:rwx
Check for the ACL indicator (+ sign) in /eos/dev/shared directory permissions:
[root@mgm ~]# eos ls -l /eos/dev/ | grep shared
drwxrwxr-+ 1 root root 22 Nov 11 00:24 shared
Now the user student should be able to write in the /eos/dev/shared/ directory:
[student@mgm ~]$ eos cp ~/my_file /eos/dev/shared/
[eoscp] my_file Total 0.00 MB |====================| 100.00 % [0.0 MB/s]
[eos-cp] copied 1/1 files and 18 B in 0.11 seconds with 194 B/s
[student@mgm ~]$ eos ls /eos/dev/shared
shared_file
my_file
Error detection in EOS
Distributed storage systems are vulnerable to bit rot (the gradual degradation of stored data over time) and other incidents that can corrupt or damage files. When such issues occur, users encounter errors while trying to access the affected files.
To maintain data consistency and availability, EOS provides the FSCK (Filesystem Consistency Check) mechanism. It can detect missing or corrupted files and, when possible, attempts to repair these issues using healthy replicas stored on other filesystems. For a full description of all detected errors and how they are handled, see the [EOS documentation](https://eos-docs.web.cern.ch/diopside/manual/microservices.html#error-types-detected-by-fsck).
To activate the FSCK mechanism in EOS, first set the interval at which FSCK scans the filesystems to 300 seconds (5 min):
[root@mgm ~]# for name in 1 2 3 4 5 6 ; do eos fs config $name scaninterval=300; done
[root@mgm ~]# eos fs status 1 | grep scaninterval # check that scaninterval was set
scaninterval := 300
Next, we need to activate the error collection thread:
[root@mgm ~]# eos fsck config toggle-collect
Now the eos fsck stat command should show the collection thread as enabled:
[root@mgm ~]# eos fsck stat
Info: collection thread status -> enabled
Info: repair thread status -> disabled
Info: repair category -> all
Info: best effort -> false
251031 17:21:37 1761931297.078495 Start error collection
251031 17:21:37 1761931297.078519 Filesystems to check: 6
251031 17:21:37 1761931297.086203 Finished error collection
251031 17:21:37 1761931297.086207 Next run in 30 minutes
The filesystems are now scanned at the configured interval, in our case every 5 minutes, and the errors are collected locally on each FST node. A dedicated FSCK collection thread on the MGM then gathers these results at configured intervals (by default, every 30 minutes) and assembles a comprehensive error report. If the FSCK repair thread is enabled, the MGM will automatically trigger repair actions when necessary.
Exercise: EOS FSCK in action
Running the eos fsck report command should show nothing for now, since all files are healthy. To see the FSCK error discovery and repair mechanisms in action, let's simulate a damaged file:
- Connect to one of the FSTs where a previously stored file keeps a replica and delete the physical copy (see the sketch below).
- Wait for the FSCK mechanism to discover the issue; the file should then appear as missing in the FSCK report.
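A possible way to simulate the damage (a sketch; the replica path below is illustrative, use the one printed by fileinfo for your file):
[root@mgm ~]# eos fileinfo /eos/dev/replica/my_file --fullpath  # note the physical path of one replica
[root@fst-1 ~]# rm /data01/00000000/00000010                    # delete that replica on the FST hosting it
[root@mgm ~]# eos fsck report                                   # after the next collection run, the error is listed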
Once the file appears in the FSCK report, let's enable the repairing mechanism and let FSCK handle the issue:
[root@mgm ~]# eos fsck config toggle-repair
[root@mgm ~]# eos fsck stat
Info: collection thread status -> enabled
Info: repair thread status -> enabled
Info: repair category -> all
Info: best effort -> false
251031 17:21:37 1761931297.078495 Start error collection
251031 17:21:37 1761931297.078519 Filesystems to check: 6
251031 17:21:37 1761931297.086203 Finished error collection
251031 17:21:37 1761931297.086207 Next run in 30 minutes
Since EOS FSCK repair is now active, the file will be repaired by replicating the surviving copy onto another filesystem.
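Once the repair has run, you can verify the outcome (expected: an empty FSCK report and a healthy file with two replicas again):
[root@mgm ~]# eos fsck report
[root@mgm ~]# eos fileinfo /eos/dev/replica/my_file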