No space left on device with existing FST nodes

Hello!
I’ve set up an EOS cluster with separate FST nodes:

[root@master1 ~]# eos node ls
┌──────────┬────────────────────────────────┬────────────────┬──────────┬────────────┬────────────────┬─────┐
│type      │                        hostport│          geotag│    status│   activated│  heartbeatdelta│ nofs│
└──────────┴────────────────────────────────┴────────────────┴──────────┴────────────┴────────────────┴─────┘
 nodesview                     fst1.eos:1095       local::geo     online           on                1     0 
 nodesview                     fst2.eos:1095       local::geo     online           on                1     0 
 nodesview                     fst3.eos:1095       local::geo     online           on                1     0 
 nodesview                     fst4.eos:1095       local::geo     online           on                1     0 
 nodesview                     fst5.eos:1095       local::geo     online           on                1     0 

but I see no filesystems or groups:

[root@master1 ~]# eos group ls
[root@master1 ~]# eos fs ls

although the default space has been created:

[root@master1 ~]# eos space ls -l 
┌──────────┬────────────────┬────────────┬────────────┬──────┬─────────┬───────────────┬──────────────┬─────────────┬─────────────┬──────┐
│type      │            name│   groupsize│    groupmod│ N(fs)│ N(fs-rw)│ sum(usedbytes)│ sum(capacity)│ capacity(rw)│ nom.capacity│ quota│
└──────────┴────────────────┴────────────┴────────────┴──────┴─────────┴───────────────┴──────────────┴─────────────┴─────────────┴──────┘
 spaceview           default            0           24      0         0             0 B            0 B           0 B           0 B    off 
 ----------------------------------------------------------------------------------------------------------------------------------------

I can create directories, so metadata gets stored, but can’t create files:

[root@master1 ~]# eos mkdir /eos/dev/test/qqqq
[root@master1 ~]# eos cp /etc/passwd /eos/dev/test
error: target file open failed - errno=28 : No space left on device [[ERROR] Error response: no space left on device]
error: failed copying path=root://localhost//eos/dev/test/passwd
#WARNING [eos-cp] copied 0/1 files and 0 B in 0.05 seconds with 0 B/s
[root@master1 ~]# eos ls /eos/dev/test
qqqq

Please help me understand what I’m missing to be able to store data.

Yes, on each FST you have to register the filesystems that are local to that FST.
Here I register one filesystem in the ‘default’ space:

mkdir -p /data/01
eosfstregister -r master1.eos:1094 /data/01 default:1
###########################
# <eosfstregister> v1.0.0
###########################
/data/01 : uuid=b4d549cf-90a8-4965-a0b4-975f6949abb9 fsid=undef
success: mapped 'b4d549cf-90a8-4965-a0b4-975f6949abb9' <=> fsid=7
success: boot message sent to fst1.eos:/data/01

I still get a permission denied error:

[root@fst5 ~]# eosfstregister -r master1.eos:1094 /data/01 default:1
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
        LANGUAGE = (unset),
        LC_ALL = (unset),
        LC_MEASUREMENT = "de_AT.UTF-8",
        LC_PAPER = "de_AT.UTF-8",
        LC_MONETARY = "de_AT.UTF-8",
        LC_NAME = "de_AT.UTF-8",
        LC_ADDRESS = "de_AT.UTF-8",
        LC_NUMERIC = "de_AT.UTF-8",
        LC_TELEPHONE = "de_AT.UTF-8",
        LC_IDENTIFICATION = "de_AT.UTF-8",
        LC_TIME = "C",
        LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to a fallback locale ("en_US.UTF-8").
###########################
# <eosfstregister> v1.0.0
###########################
/data/01 : uuid=894b4e1e-4e1e-41ec-ae15-6c925230c13d fsid=undef
error: errc=3010 msg="[ERROR] Error response: permission denied"
error: errc=3010 msg="[ERROR] Error response: permission denied"

So far, I’ve got “u:eosnobody g:eosnobody n:eosnobody” in both /etc/eos.keytab and /etc/eos/fuse.sss.keytab, copied across all nodes, but on master1.eos I have no groups:

[root@master1 ~]# eos group ls

and an error in /var/log/eos/mgm/Clients.log:

240118 13:55:25 INFO  [00099/00099]                - ::IdMap            sec.prot=sss sec.name="eosnobody" sec.host="fst5.eos" sec.vorg="" sec.grps="eosnobody" sec.role="" sec.info="" sec.app="" sec.tident="root.1025:418@fst5" vid.uid=99 vid.gid=99 sudo=0 gateway=0 
240118 13:55:25 INFO  [00099/00099]        eosnobody ::open             op=read path=/proc/admin/ info=mgm.cmd.proto=MigSJhokNmI3ZDM3Y2YtMTY2ZC00YjhjLWI3YjUtNWQ3MTc0NGI4ZGZh 
240118 13:55:25 ERROR [00099/00099]        eosnobody ::Emsg             Unable to execute proc command - you don't have the requested permissions for that operation (2) /proc/admin/; Operation not permitted 

The main entry you need in /etc/eos.keytab is the one for ‘daemon’.
The order of the entries in the file is also important. For the time being you can delete the eosnobody entry, and this should resolve the problem.
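To see which identity comes first in a keytab, the entries can be listed in file order. This is only a minimal sketch: it assumes each keytab line carries a `u:<user>` field, as in the entry quoted earlier in this thread, and `keytab_users` and `first_entry_is` are hypothetical helper names, not part of EOS.

```shell
#!/bin/sh
# Print the user from the u:<user> field of every keytab entry, in file order.
# Only the user name is printed; the key material is never echoed.
keytab_users() {
    # $1: path to the keytab file
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^u:/) { print substr($i, 3); break } }' "$1"
}

# Succeed only if the first entry belongs to the expected user (default: daemon).
first_entry_is() {
    # $1: path to the keytab file, $2: expected user (optional)
    [ "$(keytab_users "$1" | head -n 1)" = "${2:-daemon}" ]
}

# Usage on a node (path taken from this thread):
#   keytab_users /etc/eos.keytab
#   first_entry_is /etc/eos.keytab || echo "first keytab entry is not daemon" >&2
```

Running the check on every node after copying the keytab around makes it easy to spot a node that still has a different entry in first position.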

You can also try on the FST:
env XrdSecPROTOCOL=sss eos root://master1 whoami
It should say that you are daemon and not eosnobody!

Looks more interesting:

[root@master1 mgm]# eos node ls
┌──────────┬────────────────────────────────┬────────────────┬──────────┬────────────┬────────────────┬─────┐
│type      │                        hostport│          geotag│    status│   activated│  heartbeatdelta│ nofs│
└──────────┴────────────────────────────────┴────────────────┴──────────┴────────────┴────────────────┴─────┘
 nodesview                     fst1.eos:1095       local::geo     online           on                1     0 
 nodesview                     fst2.eos:1095       local::geo     online           on                1     0 
 nodesview                     fst3.eos:1095       local::geo     online           on                1     0 
 nodesview                     fst4.eos:1095       local::geo     online           on                1     0 
 nodesview                     fst5.eos:1095       local::geo    offline           on             6384     1 

[root@master1 mgm]# eos group ls
┌──────────┬────────────────┬────────────┬──────┬────────────┬────────────┬────────────┬──────────┬──────────┐
│type      │            name│      status│ N(fs)│ dev(filled)│ avg(filled)│ sig(filled)│ balancing│   bal-shd│
└──────────┴────────────────┴────────────┴──────┴────────────┴────────────┴────────────┴──────────┴──────────┘
 groupview         default.0          ???      1         0.00         0.00         0.00       idle          0 
 groupview         eosnobody           on      0         0.00         0.00         0.00       idle          0 

[root@master1 mgm]# eos space ls
┌──────────┬────────────────┬────────────┬────────────┬──────┬─────────┬───────────────┬──────────────┬─────────────┬─────────────┬──────────────┬──────┬──────┬──────────┬───────────┬───────────┬──────┬────────┬───────────┬──────┬────────┬───────────┐
│type      │            name│   groupsize│    groupmod│ N(fs)│ N(fs-rw)│ sum(usedbytes)│ sum(capacity)│ capacity(rw)│ nom.capacity│sched.capacity│ usage│ quota│ balancing│  threshold│  converter│   ntx│  active│        wfe│   ntx│  active│ intergroup│
└──────────┴────────────────┴────────────┴────────────┴──────┴─────────┴───────────────┴──────────────┴─────────────┴─────────────┴──────────────┴──────┴──────┴──────────┴───────────┴───────────┴──────┴────────┴───────────┴──────┴────────┴───────────┘
 spaceview           default            0           24      1         0             0 B            0 B           0 B           0 B            0 B   0.00    off        off          20         off      2        0         off      1        0         off 

After I removed the eosnobody entry on the FST node:

[root@fst5 ~]# env XrdSecPROTOCOL=sss eos root://master1.eos whoami
Virtual Identity: uid=2 (2,99) gid=2 (2,99) [authz:sss] host=fst5.eos domain=eos
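
The uid in this output can also be checked from a script. A minimal sketch, assuming the `Virtual Identity: uid=N ...` line format shown here and that daemon has uid 2 (as on RHEL-family systems); `whoami_uid` is a hypothetical helper name:

```shell
#!/bin/sh
# Extract the numeric uid from `eos whoami` output read on stdin.
whoami_uid() {
    sed -n 's/^Virtual Identity: uid=\([0-9][0-9]*\) .*/\1/p'
}

# Assumption: daemon is uid 2; override EXPECTED_UID if your system differs.
expected_uid=${EXPECTED_UID:-2}

# Example using the line from this thread instead of a live MGM.
line='Virtual Identity: uid=2 (2,99) gid=2 (2,99) [authz:sss] host=fst5.eos domain=eos'
uid=$(printf '%s\n' "$line" | whoami_uid)

if [ "$uid" = "$expected_uid" ]; then
    echo "mapped to uid $uid"
else
    echo "unexpected uid: $uid"
fi

# Against a live MGM (command from this thread):
#   env XrdSecPROTOCOL=sss eos root://master1.eos whoami | whoami_uid
```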

After removing eosnobody from all /etc/eos.keytab files:

[root@master1 ~]# eos node ls
┌──────────┬────────────────────────────────┬────────────────┬──────────┬────────────┬────────────────┬─────┐
│type      │                        hostport│          geotag│    status│   activated│  heartbeatdelta│ nofs│
└──────────┴────────────────────────────────┴────────────────┴──────────┴────────────┴────────────────┴─────┘
 nodesview                     fst5.eos:1095       local::geo     online           on                1     1 

[root@master1 ~]# eos fs ls
┌────────────────────────┬────┬──────┬────────────────────────────────┬────────────────┬────────────────┬────────────┬──────────────┬────────────┬──────┬────────┬────────────────┐
│host                    │port│    id│                            path│      schedgroup│          geotag│        boot│  configstatus│       drain│ usage│  active│          health│
└────────────────────────┴────┴──────┴────────────────────────────────┴────────────────┴────────────────┴────────────┴──────────────┴────────────┴──────┴────────┴────────────────┘
 fst5.eos                 1095      1                         /data/01        default.0       local::geo  bootfailure             rw      nodrain  15.00               no smartctl 

But the filesystem is in bootfailure, and I still get the same “No space left on device” error.

Thanks @apeters !
I’ve cleaned up all spaces and filesystems, recreated them from scratch, and finally I can store data in EOS!