You have to take role 'root' to execute this command

Hello,
On a test EOS cluster, I wanted to simulate the loss of one disk. Some filesystems are unreachable but, as I have 2 spare filesystems, I wanted to add one to the default.0 group and then drain one of the unreachable filesystems to trigger re-replication (is it correct to think that this is the way to act when a filesystem (disk) fails?)…
So I tried to move an empty filesystem from the “spare” group to the “default.0” group, but I get this message:

EOS Console [root://localhost] |/> fs mv 5 default.0
error: you have to take role ‘root’ to execute this command
EOS Console [root://localhost] |/> whoami
Virtual Identity: uid=0 (2,99,3,0) gid=0 (99,4,0) [authz:sss] sudo* host=localhost domain=localdomain geo-location=NANTES

What am I supposed to do in this case? I am root on the server …
Thanks
JM

Hello, apparently this is because I was logged in on the slave manager. On the master, I get:

EOS Console [root://localhost] |/> fs mv 5 default.0
error: reached maximum number of file systems for group default.0

Which is another problem… Why can’t I move this (empty) filesystem to group default.0 ?

EOS Console [root://localhost] |/> fs ls
┌────────────────────────┬────┬──────┬────────────────────────────────┬────────────────┬─────────────────┬────────────┬──────────────┬────────────┬────────┬────────────────┐
│host                    │port│    id│                            path│      schedgroup│           geotag│        boot│  configstatus│       drain│  active│          health│
└────────────────────────┴────┴──────┴────────────────────────────────┴────────────────┴─────────────────┴────────────┴──────────────┴────────────┴────────┴────────────────┘
 nanxrd15.in2p3.fr        1095      1                          /data01        default.0      NANTES::H002       booted             rw      nodrain   online              N/A 
 nanxrd16.in2p3.fr        1095      2                          /data01        default.0      NANTES::H002       booted             rw      nodrain   online      no smartctl 
 nanxrd17.in2p3.fr        1095      4                          /data02        default.0      NANTES::H002       booted             rw      nodrain   online      no smartctl 
 nanxrd15.in2p3.fr        1095      6                          /data02        default.0      NANTES::H002       booted             rw      nodrain   online              N/A 
 clr-testeos01.in2p3.fr   1095      7                          /data01        default.0 CLERMONT::RDC7002       booted             rw      nodrain  offline      no smartctl 
 clr-testeos01.in2p3.fr   1095      8                          /data02        default.0 CLERMONT::RDC7002       booted             rw      nodrain  offline      no smartctl 
 --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
 nanxrd17.in2p3.fr        1095      3                          /data01            spare      NANTES::H002       booted          empty      nodrain   online      no smartctl 
 nanxrd16.in2p3.fr        1095      5                          /data02            spare      NANTES::H002       booted          empty      drained   online      no smartctl
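By the way, to check which MGM is currently the master, something like the following should work (I am assuming here that the eos ns summary reports the master status):

eos -b ns | grep -i master     # look for the master flag in the namespace summary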

Thanks
JM

Hi JM,

Can you also paste the output of eos space ls? I suspect the groupsize limit has been reached.

Cheers,
Elvin

Hello Elvin,
You are right: at some point there were non-null values for groupsize/groupmod, and I admit I reset them to 0 because it was not clear what role they played. Now they are 0/0:

space ls
200622 14:04:58 19132 secgsi_ParseCAlist: nothing to parse
secgsi: unknown CA: cannot verify server certificate
┌──────────┬────────────────┬────────────┬────────────┬──────┬─────────┬───────────────┬──────────────┬─────────────┬─────────────┬──────┬──────────┬───────────┬───────────┬──────┬────────┬───────────┬──────┬────────┬───────────┐
│type      │            name│   groupsize│    groupmod│ N(fs)│ N(fs-rw)│ sum(usedbytes)│ sum(capacity)│ capacity(rw)│ nom.capacity│ quota│ balancing│  threshold│  converter│   ntx│  active│        wfe│   ntx│  active│ intergroup│
└──────────┴────────────────┴────────────┴────────────┴──────┴─────────┴───────────────┴──────────────┴─────────────┴─────────────┴──────┴──────────┴───────────┴───────────┴──────┴────────┴───────────┴──────┴────────┴───────────┘
 spaceview           default            0            0      7         6        52.30 GB       10.99 TB      10.00 TB           0 B    off        off          20          on      2        0         off      1        0         off 
 spaceview             spare            0            0      1         0        35.92 MB      999.50 GB           0 B           0 B    off        off          20         off      2        0         off      1        0         off 

Now, back to the original question:
What do you do when a filesystem becomes unavailable and you have files stored in EOS in RAID6 mode with a chunk on that filesystem? Do you add an empty filesystem to the group and trigger some kind of repair operation?
Could you also explain the role of the groupsize/groupmod variables?
Thank you

JM

Hi JM,

The groupmod is the maximum number of groups you can have in a space. This should be larger than or equal to the number of file systems on your largest (in terms of disks) machine. Say you have a machine with 196 disks: then this value needs to be >= 196, otherwise you won’t be able to attach all of them to a group, since there is already a disk from that machine in every group.

The groupsize is the maximum number of nodes you can have in a scheduling group. This should always be > 4 for replica layouts (and bigger than the number of RAIN stripes + 1, depending on the RAIN layout that you use), and at the upper end it should be smaller than or equal to the number of nodes (machines) you have in your cluster.

When one file system becomes unavailable, you put it in drain mode and the drain engine will take care of recreating the missing stripe of each affected RAIN file on a new disk. Note that you do need to have one spare file system in that group for this to work.
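For reference, a rough sketch of that sequence (the fsid values are placeholders, not taken from your listing):

eos fs mv <spare-fsid> default.0                  # add the spare filesystem to the affected group
eos fs config <broken-fsid> configstatus=drain    # start draining the broken filesystem
eos fs status <broken-fsid>                       # follow the drain progress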

Cheers,
Elvin

Thank you Elvin,
Not sure I have understood 100%. Let me take an example, if you do not mind. Say I have bought 6 servers, each with 8 × 1 TB disks, and I want to create a single space with RAIN 4+2 capability.
I would create 7 groups, each group having 6 disks (1 fs per disk), one on each server. Right?
If I want to keep spare disks, do I have to put them in another group named “spare”?
I create the space “default” with the 7 groups default.x.
In this case, what would be the values for groupsize/groupmod?
groupsize > 6 stripes, i.e. 7 at a minimum?
groupmod = the number of disks per server (taking the largest), i.e. 8 in our example?
Is there a minimum EOS version for the recreation of missing stripes to work? It did not work well when I tried…
Thanks
JM

Hi JM,

In your particular example, since you have nodes with 8 disks, you should have at least 8 groups so that you are able to use all the disks.

For a more visual example, please have a look at the top of page 5 of this paper:
https://iopscience.iop.org/article/10.1088/1742-6596/608/1/012009/pdf

If you only have 6 servers (nodes), then you can put at most 6 file systems in a group, which is not good for RAIN(4+2): if a disk fails you cannot drain or recover it, and you also won’t be able to write new data. Therefore, 7 servers is the minimum I would advise for such a setup. So for a setup with 7 servers with 8 disks each, the values would be:
groupsize >= 7
groupmod >= 8
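For example, assuming the space is called default, these could be set with something like:

eos space define default 7 8     # groupsize=7, groupmod=8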

Any 4.7 release should have the draining working properly for any RAIN layout.

Cheers,
Elvin

Hi @esindril @barbet,

I am now hitting this problem as well:

[root@lyoeosmgm1 ~]# eos -b fs add `uuidgen` lyostorage18.in2p3.fr:1095 /disk1/eos default.1
error: scheduling group default.1 is full
error: no group available for file system

I never had this before; I am now on eos-server-4.8.10-1.el7.

[root@lyoeosmgm1 ~]# eos space ls
┌──────────┬────────────────┬────────────┬────────────┬──────┬─────────┬───────────────┬──────────────┬─────────────┬─────────────┬──────┬──────────┬───────────┬───────────┬──────┬────────┬───────────┬──────┬────────┬───────────┐
│type      │            name│   groupsize│    groupmod│ N(fs)│ N(fs-rw)│ sum(usedbytes)│ sum(capacity)│ capacity(rw)│ nom.capacity│ quota│ balancing│  threshold│  converter│   ntx│  active│        wfe│   ntx│  active│ intergroup│
└──────────┴────────────────┴────────────┴────────────┴──────┴─────────┴───────────────┴──────────────┴─────────────┴─────────────┴──────┴──────────┴───────────┴───────────┴──────┴────────┴───────────┴──────┴────────┴───────────┘
 spaceview           default            0           24      2         2         2.22 GB       60.00 TB      60.00 TB           0 B     on        off          20         off      4        0         off      1        0         off 

How do I change groupsize and groupmod?

Cheers,
Denis

Hi Denis,

You should be able to do that with the space define command: http://eos-docs.web.cern.ch/eos-docs/clicommands/space.html

specifically

space define <space-name> [<groupsize> [<groupmod>]] : define how many filesystems can end up in one scheduling group <groupsize> [ default=0 ]
  => <groupsize>=0 means that no groups are built within a space, otherwise it should be the maximum number of nodes in a scheduling group
  => <groupmod> maximum number of groups in the space, which should be at least equal to the maximum number of filesystems per node
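So, as a rough sketch for your case (the 16/24 values are just an example, pick whatever matches your hardware):

eos group ls                      # check how many filesystems each group already holds
eos space define default 16 24    # raise groupsize/groupmod for the default space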

Thanks a lot Crystal,
I had missed this command.
Cheers,
Denis