Namespace inconsistency and risk of data loss

anfeng · March 10, 2024, 6:47pm

Hi,

We are facing an inconsistency in our EOS storage. We have four nodes (let’s call them A, B, C and D), each running MGM, MQ, FST and QDB, forming an HA cluster. The following are some of the problems we are experiencing:

file reading

On node A, we can see our files normally

➜  ~ eos ls -alh /eos/lhcb/
Secsss (getKeyTab): Unable to open /etc/eos.keytab; permission denied

Unable to open keytab file.
drwxr-xr-x   1 root     root           1.23 T Nov  7 15:59 .
drwxrwxr-x   1 root     root           1.43 T Nov  7 00:49 ..
drwxr-xr-+   1 laf      lhcb          14.80 G Feb 18 03:57 laf
drwxr-xr-x   1 qinning  lhcb         700.20 M Jan 18 16:25 qinning
drwxr-xr-x   1 yinghua  lhcb           1.22 T Jan 18 18:48 yinghua

➜  ~ eos ls -alh /eos/lhcb/laf
Secsss (getKeyTab): Unable to open /etc/eos.keytab; permission denied

Unable to open keytab file.
drwxr-xr-+   1 laf      lhcb          14.80 G Feb 18 03:57 .
drwxr-xr-x   1 root     root           1.23 T Nov  7 15:59 ..
drwxr-sr-+   1 laf      lhcb          56.28 M Dec 11 04:56 Analysis
drwxr-sr-+   1 laf      lhcb          60.93 M Dec 20 17:07 AnalysisDev_v41r15
drwxr-sr-+   1 laf      lhcb          34.74 M Dec 11 07:00 davinci_run3_test
drwxr-sr-+   1 laf      lhcb           8.59 G Dec  5 16:30 layout_test
drwxr-xr-+   1 laf      lhcb           3.70 G Nov  9 01:26 ssmb_log
drwxr-sr-+   1 laf      lhcb          62.67 M Dec 29 01:22 wangjq_debug
drwxr-sr-+   1 laf      lhcb           2.29 G Jan 22 08:29 zfit_test

However, on node B, C or D, we can only see empty top-level directories with incorrect metadata

➜  ~ eos ls -alh /eos/lhcb/
Secsss (getKeyTab): Unable to open /etc/eos.keytab; permission denied

Unable to open keytab file.
drwxr-xr-x   1 root     root                0 Mar 11 00:47 .
drwxr-xr-x   1 root     root          20.48 k Jan  1  1970 ..
drwxr-xr-x   1 root     root                0 Jan  1  1970 laf
drwxr-xr-x   1 root     root                0 Jan  1  1970 qinning
drwxr-xr-x   1 root     root                0 Jan  1  1970 yinghua

➜  ~ eos ls -alh /eos/lhcb/laf
Secsss (getKeyTab): Unable to open /etc/eos.keytab; permission denied

Unable to open keytab file.
drwxr-xr-x   1 root     root                0 Jan  1  1970 .
drwxr-xr-x   1 root     root                0 Mar 11 00:47 ..
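The incorrect metadata shows up as a 1970 date in the listing. A minimal Python sketch for flagging such entries in captured output, assuming the `eos ls -alh` column layout shown above (`epoch_dated_entries` is a made-up helper name, not part of EOS):

```python
import re

# Assumed layout: mode, link count, owner, group, size (with optional unit),
# month, day, time-or-year, name. Names containing spaces are kept whole.
LS_LINE = re.compile(
    r"^(?P<mode>[dl-]\S+)\s+\d+\s+(?P<owner>\S+)\s+(?P<group>\S+)\s+"
    r"(?P<size>.*?)\s+(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<when>\S+)\s+(?P<name>.+)$"
)


def epoch_dated_entries(listing):
    """Return the names whose date column shows the year 1970."""
    names = []
    for line in listing.splitlines():
        match = LS_LINE.match(line.strip())
        # Non-listing lines (e.g. the keytab warning) simply do not match.
        if match and match.group("when") == "1970":
            names.append(match.group("name"))
    return names


sample = """\
Unable to open keytab file.
drwxr-xr-x   1 root     root                0 Mar 11 00:47 .
drwxr-xr-x   1 root     root          20.48 k Jan  1  1970 ..
drwxr-xr-x   1 root     root                0 Jan  1  1970 laf
"""
print(epoch_dated_entries(sample))
```

Running the same function on a listing from node A and on one from node B gives a quick way to see which entries lost their timestamps.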

file operations

When node A is the master MGM, I can move files into and out of EOS normally on node A, but the changes cannot be seen on the other nodes, as described in the section above (they only show empty directories).

When node B, C or D is the master MGM, on that node I can add new directories, change their owner or move files into them. These changes are synced among B, C and D, but cannot be seen on node A (which only shows the original files). The namespace seems to be split somehow.

The QDB status looks good

➜  ~ sudo eos daemon config qdb qdb info
[putenv] EOS_USE_MQ_ON_QDB=1
[putenv] EOS_XROOTD=/opt/eos/xrootd/
[putenv] GEO_TAG=local
[putenv] INSTANCE_NAME=eosdev
[putenv] LD_LIBRARY_PATH=/opt/eos/xrootd//lib64:/opt/eos/grpc/lib64
[putenv] LD_PRELOAD=/usr/lib64/libjemalloc.so
[putenv] QDB_CLUSTER_ID=eosdev
[putenv] QDB_HOST=hepfarm40.hep.tsinghua.edu.cn
[putenv] QDB_NODE=hepfarm40.hep.tsinghua.edu.cn:7777
[putenv] QDB_NODES=hepfarm40.hep.tsinghua.edu.cn:7777
[putenv] QDB_PATH=/var/lib/qdb
[putenv] QDB_PORT=7777
[putenv] SERVER_HOST=hepfarm40.hep.tsinghua.edu.cn
 1) TERM 28
 2) LOG-START 0
 3) LOG-SIZE 17674580
 4) LEADER hepfarm40.hep.tsinghua.edu.cn:7777
 5) CLUSTER-ID eosdev
 6) COMMIT-INDEX 17674579
 7) LAST-APPLIED 17674579
 8) BLOCKED-WRITES 0
 9) LAST-STATE-CHANGE 2058205 (23 days, 19 hours, 43 minutes, 25 seconds)
10) ----------
11) MYSELF hepfarm40.hep.tsinghua.edu.cn:7777
12) VERSION 5.2.14.1
13) STATUS LEADER
14) NODE-HEALTH GREEN
15) JOURNAL-FSYNC-POLICY sync-important-updates
16) ----------
17) MEMBERSHIP-EPOCH 72376
18) NODES hepfarm41.hep.tsinghua.edu.cn:7777,hepfarm40.hep.tsinghua.edu.cn:7777,hepfarm30.hep.tsinghua.edu.cn:7777,hepfarm21.hep.tsinghua.edu.cn:7777
19) OBSERVERS
20) QUORUM-SIZE 3
21) ----------
22) REPLICA hepfarm21.hep.tsinghua.edu.cn:7777 | ONLINE | UP-TO-DATE | LOG-SIZE 17674580 | VERSION 5.2.14.1
23) REPLICA hepfarm30.hep.tsinghua.edu.cn:7777 | ONLINE | UP-TO-DATE | LOG-SIZE 17674580 | VERSION 5.2.17.1
24) REPLICA hepfarm41.hep.tsinghua.edu.cn:7777 | ONLINE | UP-TO-DATE | LOG-SIZE 17674580 | VERSION 5.2.17.1
info: run 'export REDISCLI_AUTH=`cat /etc/eos.keytab`; redis-cli -p `cat /var/run/eos/xrd.cf.qdb|grep xrd.port | cut -d ' ' -f 2` <<< raft-info' retc=0
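As a cross-check of the raft-info output, here is a small Python sketch that confirms every replica is ONLINE and UP-TO-DATE and reports whether the members run different versions. It only parses text in the format printed above; `summarize_raft_info` is a hypothetical helper, not an EOS or QuarkDB API.

```python
import re


def summarize_raft_info(text):
    """Collect per-member versions and any replica that is not healthy."""
    versions = {}
    not_ok = []
    myself = None
    for line in text.splitlines():
        # Drop the leading "NN) " counter printed in front of each field.
        body = re.sub(r"^\s*\d+\)\s*", "", line).strip()
        if body.startswith("MYSELF "):
            myself = body.split()[1]
        elif body.startswith("VERSION ") and myself:
            versions[myself] = body.split()[1]
        elif body.startswith("REPLICA "):
            parts = [p.strip() for p in body[len("REPLICA "):].split("|")]
            if len(parts) < 3:
                continue
            host, status, sync = parts[0], parts[1], parts[2]
            for extra in parts[3:]:
                if extra.startswith("VERSION "):
                    versions[host] = extra.split()[1]
            if status != "ONLINE" or sync != "UP-TO-DATE":
                not_ok.append(host)
    return {
        "versions": versions,
        "not_ok": not_ok,
        "mixed_versions": len(set(versions.values())) > 1,
    }


sample = """\
11) MYSELF hepfarm40.hep.tsinghua.edu.cn:7777
12) VERSION 5.2.14.1
22) REPLICA hepfarm21.hep.tsinghua.edu.cn:7777 | ONLINE | UP-TO-DATE | LOG-SIZE 17674580 | VERSION 5.2.14.1
23) REPLICA hepfarm30.hep.tsinghua.edu.cn:7777 | ONLINE | UP-TO-DATE | LOG-SIZE 17674580 | VERSION 5.2.17.1
24) REPLICA hepfarm41.hep.tsinghua.edu.cn:7777 | ONLINE | UP-TO-DATE | LOG-SIZE 17674580 | VERSION 5.2.17.1
"""
summary = summarize_raft_info(sample)
print(summary)
```

On the output above this reports no unhealthy replicas but two distinct QDB versions (5.2.14.1 and 5.2.17.1).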

These inconsistencies did not arise until recent weeks. I am not sure if it is caused by the upgrade.
The EOS_SERVER_VERSION of node A is 5.2.14, while on node B, C or D the EOS_SERVER_VERSION is 5.2.17 or 5.2.18. Since node A is the only node on which we can read our files, we are keeping its version unchanged to avoid losing our data, and we would appreciate any advice on resolving this issue.

esindril · March 20, 2024, 10:31am

Hi Anfeng,

The inconsistencies between the different MGM daemons are more or less expected since we do not yet provide a full HA model for the MGM daemons. The way we currently run at CERN is that there is only one MGM process that is also the leader/master MGM and this is the only point of truth when it comes to the namespace information.

Therefore, what I would recommend is to follow a similar setup and have just one MGM running - for the time being. We do plan to have an HA setup also for the MGM but this requires a few more steps like dropping the current MQ daemon and actually looking at the corner cases when it comes to synchronizing info between the MGM follower daemons.

What is surprising in your setup is that things work as expected for nodes B, C, D but not for A. Could you send me the output of the following commands for each of the MGM daemons on all 4 nodes?
eos ns
eos access ls

To make sure all the information is properly stored in QuarkDB, you can use the eos-ns-inspect scan command to check that the contents of a certain reference directory match your expectations. Then, I would move to a single-MGM deployment and things should be consistent. At this point, I don’t have an explanation for why node A behaves differently.

Cheers,
Elvin

anfeng · March 23, 2024, 4:50pm

Hi Elvin,

Many thanks for your clarification on the issue! The command outputs are as follows:

On node A,

➜  ~ sudo eos ns
# ------------------------------------------------------------------------------------
# Namespace Statistics
# ------------------------------------------------------------------------------------
ALL      Files                            325372 [booted] (0s)
ALL      Directories                      8765
ALL      Total boot time                  0 s
ALL      Contention                       write: 0.00 % read:0.00 %
# ------------------------------------------------------------------------------------
ALL      Replication                      is_master=false master_id=hepfarm41.hep.tsinghua.edu.cn:1094
# ------------------------------------------------------------------------------------
ALL      files created since boot         67
ALL      container created since boot     78
# ------------------------------------------------------------------------------------
ALL      current file id                  326061
ALL      current container id             16180
# ------------------------------------------------------------------------------------
ALL      eosxd caps                       0 c: 0 cc: 0 cic: 0 ic: 0
ALL      eosxd clients                    0
ALL      eosxd active clients             0
ALL      eosxd locked clients             0
# ------------------------------------------------------------------------------------
ALL      File cache max num               0
ALL      File cache occupancy             0
ALL      In-flight FileMD                 0
ALL      Container cache max num          0
ALL      Container cache occupancy        1
ALL      In-flight ContainerMD            0
# ------------------------------------------------------------------------------------
ALL      eosViewRWMutex status            available (0s)
ALL      eosViewRWMutex peak-latency      0ms (last) 0ms (1 min) 0ms (2 min) 0ms (5 min)
ALL      eosViewRWMutex locked for 0.00% of the penultimate second
# ------------------------------------------------------------------------------------
ALL      QClient overall RTT              0ms (min)  0ms (avg)  105ms (max)
ALL      QClient recent peak RTT          0ms (1 min) 0ms (2 min) 0ms (5 min)
# ------------------------------------------------------------------------------------
ALL      memory virtual                   7.56 GB
ALL      memory resident                  1.82 GB
ALL      memory share                     74.15 MB
ALL      memory growths                   4.42 GB
ALL      threads                          469
ALL      fds                              491
ALL      uptime                           3258372
# ------------------------------------------------------------------------------------
ALL      drain info                       pool=drain          min=10  max=100  size=10   queue_sz=0
ALL      fsck info                        pool=fsck           min=2   max=20   size=2    queue_sz=0
ALL      converter info                   pool=converter      min=64  max=100  size=64   queue_sz=0
ALL      balancer info                    pool=balance        min=10  max=100  size=10   queue_sz=0 space=default
# ------------------------------------------------------------------------------------
ALL      tracker info                     tracker=fsck size=0
# ------------------------------------------------------------------------------------
┌────────┬───────┬────────┬───────┬──────┬─────────┬────────────────┐
│     uid│threads│sessions│  limit│stalls│stalltime│          status│
└────────┴───────┴────────┴───────┴──────┴─────────┴────────────────┘
        0       1        1 65.54 K      0         1          user-OK

➜  ~ sudo eos access ls
# ....................................................................................
# Redirection Rules ...
# ....................................................................................
[ 01 ]                         ENOENT:* => hepfarm41.hep.tsinghua.edu.cn
[ 02 ]                              w:* => hepfarm41.hep.tsinghua.edu.cn
➜  ~

On node B,

➜  ~ sudo eos ns
# ------------------------------------------------------------------------------------
# Namespace Statistics
# ------------------------------------------------------------------------------------
ALL      Files                            325372 [booted] (0s)
ALL      Directories                      8765
ALL      Total boot time                  1 s
ALL      Contention                       write: 0.00 % read:0.00 %
# ------------------------------------------------------------------------------------
ALL      Replication                      is_master=false master_id=hepfarm41.hep.tsinghua.edu.cn:1094
# ------------------------------------------------------------------------------------
ALL      files created since boot         6
ALL      container created since boot     6
# ------------------------------------------------------------------------------------
ALL      current file id                  326061
ALL      current container id             16180
# ------------------------------------------------------------------------------------
ALL      eosxd caps                       0 c: 0 cc: 0 cic: 0 ic: 0
ALL      eosxd clients                    0
ALL      eosxd active clients             0
ALL      eosxd locked clients             0
# ------------------------------------------------------------------------------------
ALL      File cache max num               0
ALL      File cache occupancy             0
ALL      In-flight FileMD                 0
ALL      Container cache max num          0
ALL      Container cache occupancy        1
ALL      In-flight ContainerMD            0
# ------------------------------------------------------------------------------------
ALL      eosViewRWMutex status            available (0s)
ALL      eosViewRWMutex peak-latency      0ms (last) 0ms (1 min) 0ms (2 min) 0ms (5 min)
ALL      eosViewRWMutex locked for 0.00% of the penultimate second
# ------------------------------------------------------------------------------------
ALL      QClient overall RTT              0ms (min)  0ms (avg)  18ms (max)
ALL      QClient recent peak RTT          0ms (1 min) 0ms (2 min) 0ms (5 min)
# ------------------------------------------------------------------------------------
ALL      memory virtual                   6.06 GB
ALL      memory resident                  1.21 GB
ALL      memory share                     72.67 MB
ALL      memory growths                   2.90 GB
ALL      threads                          523
ALL      fds                              399
ALL      uptime                           1355438
# ------------------------------------------------------------------------------------
ALL      drain info                       pool=drain          min=10  max=100  size=10   queue_sz=0
ALL      fsck info                        pool=fsck           min=2   max=20   size=2    queue_sz=0
ALL      converter info                   pool=converter      min=144 max=144  size=144  queue_sz=0
ALL      balancer info                    pool=balance        min=10  max=100  size=10   queue_sz=0 space=default
# ------------------------------------------------------------------------------------

# ------------------------------------------------------------------------------------
┌────────┬───────┬────────┬───────┬──────┬─────────┬────────────────┐
│     uid│threads│sessions│  limit│stalls│stalltime│          status│
└────────┴───────┴────────┴───────┴──────┴─────────┴────────────────┘
        0       1        1 65.54 K      0         1          user-OK

➜  ~ sudo eos access ls
# ....................................................................................
# Redirection Rules ...
# ....................................................................................
[ 01 ]                         ENOENT:* => hepfarm41.hep.tsinghua.edu.cn
[ 02 ]                              w:* => hepfarm41.hep.tsinghua.edu.cn
➜  ~

On node C,

➜  ~ sudo eos ns
# ------------------------------------------------------------------------------------
# Namespace Statistics
# ------------------------------------------------------------------------------------
ALL      Files                            325372 [booted] (0s)
ALL      Directories                      8765
ALL      Total boot time                  0 s
ALL      Contention                       write: 0.00 % read:0.00 %
# ------------------------------------------------------------------------------------
ALL      Replication                      is_master=false master_id=hepfarm41.hep.tsinghua.edu.cn:1094
# ------------------------------------------------------------------------------------
ALL      files created since boot         6
ALL      container created since boot     6
# ------------------------------------------------------------------------------------
ALL      current file id                  326061
ALL      current container id             16180
# ------------------------------------------------------------------------------------
ALL      eosxd caps                       0 c: 0 cc: 0 cic: 0 ic: 0
ALL      eosxd clients                    0
ALL      eosxd active clients             0
ALL      eosxd locked clients             0
# ------------------------------------------------------------------------------------
ALL      File cache max num               0
ALL      File cache occupancy             0
ALL      In-flight FileMD                 0
ALL      Container cache max num          0
ALL      Container cache occupancy        1
ALL      In-flight ContainerMD            0
# ------------------------------------------------------------------------------------
ALL      eosViewRWMutex status            available (0s)
ALL      eosViewRWMutex peak-latency      0ms (last) 0ms (1 min) 0ms (2 min) 0ms (5 min)
ALL      eosViewRWMutex locked for 0.00% of the penultimate second
# ------------------------------------------------------------------------------------
ALL      QClient overall RTT              0ms (min)  0ms (avg)  7ms (max)
ALL      QClient recent peak RTT          0ms (1 min) 0ms (2 min) 0ms (5 min)
# ------------------------------------------------------------------------------------
ALL      memory virtual                   6.46 GB
ALL      memory resident                  425.21 MB
ALL      memory share                     47.32 MB
ALL      memory growths                   3.29 GB
ALL      threads                          571
ALL      fds                              399
ALL      uptime                           1109080
# ------------------------------------------------------------------------------------
ALL      drain info                       pool=drain          min=10  max=100  size=10   queue_sz=0
ALL      fsck info                        pool=fsck           min=2   max=20   size=2    queue_sz=0
ALL      converter info                   pool=converter      min=192 max=192  size=192  queue_sz=0
ALL      balancer info                    pool=balance        min=10  max=100  size=10   queue_sz=0 space=default
# ------------------------------------------------------------------------------------

# ------------------------------------------------------------------------------------
┌────────┬───────┬────────┬───────┬──────┬─────────┬────────────────┐
│     uid│threads│sessions│  limit│stalls│stalltime│          status│
└────────┴───────┴────────┴───────┴──────┴─────────┴────────────────┘
        0       1        1 65.54 K      0         1          user-OK

➜  ~ sudo eos access ls
# ....................................................................................
# Redirection Rules ...
# ....................................................................................
[ 01 ]                         ENOENT:* => hepfarm41.hep.tsinghua.edu.cn
[ 02 ]                              w:* => hepfarm41.hep.tsinghua.edu.cn
➜  ~

On node D (whose hostname is hepfarm41.hep.tsinghua.edu.cn),

➜  ~ sudo eos ns
# ------------------------------------------------------------------------------------
# Namespace Statistics
# ------------------------------------------------------------------------------------
ALL      Files                            325374 [booted] (0s)
ALL      Directories                      8765
ALL      Total boot time                  0 s
ALL      Contention                       write: 0.00 % read:0.00 %
# ------------------------------------------------------------------------------------
ALL      Replication                      is_master=true master_id=hepfarm41.hep.tsinghua.edu.cn:1094
# ------------------------------------------------------------------------------------
ALL      files created since boot         4
ALL      container created since boot     6
# ------------------------------------------------------------------------------------
ALL      current file id                  326059
ALL      current container id             16180
# ------------------------------------------------------------------------------------
ALL      eosxd caps                       2 c: 2 cc: 2 cic: 2 ic: 1
ALL      eosxd clients                    0
ALL      eosxd active clients             0
ALL      eosxd locked clients             0
# ------------------------------------------------------------------------------------
ALL      File cache max num               40000000
ALL      File cache occupancy             588
ALL      In-flight FileMD                 0
ALL      Container cache max num          5000000
ALL      Container cache occupancy        67
ALL      In-flight ContainerMD            0
# ------------------------------------------------------------------------------------
ALL      eosViewRWMutex status            available (0s)
ALL      eosViewRWMutex peak-latency      0ms (last) 0ms (1 min) 0ms (2 min) 0ms (5 min)
ALL      eosViewRWMutex locked for 0.10% of the penultimate second
# ------------------------------------------------------------------------------------
ALL      QClient overall RTT              0ms (min)  0ms (avg)  14ms (max)
ALL      QClient recent peak RTT          0ms (1 min) 0ms (2 min) 0ms (5 min)
# ------------------------------------------------------------------------------------
ALL      memory virtual                   7.22 GB
ALL      memory resident                  524.56 MB
ALL      memory share                     30.39 MB
ALL      memory growths                   4.05 GB
ALL      threads                          602
ALL      fds                              487
ALL      uptime                           1355306
# ------------------------------------------------------------------------------------
ALL      drain info                       pool=drain          min=10  max=100  size=10   queue_sz=0
ALL      fsck info                        pool=fsck           min=2   max=20   size=2    queue_sz=0
ALL      converter info                   pool=converter      min=192 max=192  size=192  queue_sz=0
ALL      balancer info                    pool=balance        min=10  max=100  size=10   queue_sz=0 space=default
# ------------------------------------------------------------------------------------
ALL      tracker info                     tracker=fsck size=0
# ------------------------------------------------------------------------------------
┌────────┬───────┬────────┬───────┬──────┬─────────┬────────────────┐
│     uid│threads│sessions│  limit│stalls│stalltime│          status│
└────────┴───────┴────────┴───────┴──────┴─────────┴────────────────┘
        0       1        1 65.54 K      0         1          user-OK

➜  ~ sudo eos access ls
➜  ~
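Since the four `eos ns` outputs are long, here is a minimal Python sketch for comparing selected fields across nodes. It assumes the `ALL  <key>  <value>` layout printed above, with key and value separated by at least two spaces; `parse_eos_ns` and `differing_keys` are hypothetical helper names, not EOS tools.

```python
import re


def parse_eos_ns(text):
    """Turn the 'ALL  <key>  <value>' lines of `eos ns` into a dict."""
    stats = {}
    for line in text.splitlines():
        if not line.startswith("ALL"):
            continue
        rest = line[3:].strip()
        # Key and value are separated by a run of two or more spaces.
        parts = re.split(r"\s{2,}", rest, maxsplit=1)
        if len(parts) == 2:
            stats[parts[0]] = parts[1].strip()
    return stats


def differing_keys(per_node, keys):
    """Return {key: {node: value}} for keys whose values are not all equal."""
    diff = {}
    for key in keys:
        values = {node: stats.get(key) for node, stats in per_node.items()}
        if len(set(values.values())) > 1:
            diff[key] = values
    return diff


node_a = """\
ALL      Files                            325372 [booted] (0s)
ALL      Directories                      8765
ALL      current file id                  326061
"""
node_d = """\
ALL      Files                            325374 [booted] (0s)
ALL      Directories                      8765
ALL      current file id                  326059
"""
per_node = {"A": parse_eos_ns(node_a), "D": parse_eos_ns(node_d)}
diff = differing_keys(per_node, ["Files", "Directories", "current file id"])
print(diff)
```

With the values above, `Files` and `current file id` differ between node A and node D, while `Directories` is identical.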

Surprisingly, the outputs of eos-ns-inspect scan are the same for different qdb members. For example, for qdb member hepfarm21.hep.tsinghua.edu.cn:7777 (which is node A), the output is

➜  ~ sudo eos-ns-inspect scan --members hepfarm21.hep.tsinghua.edu.cn:7777 --path /eos/ --password-file /etc/eos.keytab
[QCLIENT - INFO - getNext:57] Received redirection to hepfarm40.hep.tsinghua.edu.cn:7777
cid=16109 ctime=1708024661.326694744 flags=0 gid=0 mode=40755 mtime=0.0 name=eos parent_id=1 path=/eos/ stime=0.0 tree_size=20480 uid=0
cid=16114 ctime=1708024661.389783555 flags=0 gid=0 mode=40755 mtime=0.0 name=dev parent_id=16109 path=/eos/dev/ stime=0.0 tree_size=20480 uid=0
cid=16115 ctime=1708024661.389903953 flags=0 gid=0 mode=40755 mtime=0.0 name=proc parent_id=16114 path=/eos/dev/proc/ stime=0.0 tree_size=20480 uid=0
atime=1708027150.712828920 ctime=1708027150.712828707 fid=326043 flags=0 gid=0 layout_id=0 link_name= locations= mtime=1708027150.712829229 name=master path=/eos/dev/proc/master pid=16115 size=4096 stime=0.0 uid=0 unlink_locations= xs=
atime=1708027150.710055042 ctime=1708027150.710054833 fid=326041 flags=0 gid=0 layout_id=0 link_name= locations= mtime=1708027150.710055380 name=quota path=/eos/dev/proc/quota pid=16115 size=4096 stime=0.0 uid=0 unlink_locations= xattr.sys.proc=mgm.cmd=quota&mgm.subcmd=lsuser&mgm.format=fuse xs=
atime=1708027150.712697921 ctime=1708027150.712697575 fid=326042 flags=0 gid=0 layout_id=0 link_name= locations= mtime=1708027150.712698374 name=reconnect path=/eos/dev/proc/reconnect pid=16115 size=4096 stime=0.0 uid=0 unlink_locations= xs=
atime=1708027150.709967053 ctime=1708027150.709966662 fid=326040 flags=0 gid=0 layout_id=0 link_name= locations= mtime=1708027150.709967515 name=who path=/eos/dev/proc/who pid=16115 size=4096 stime=0.0 uid=0 unlink_locations= xattr.sys.proc=mgm.cmd=who&mgm.format=fuse xs=
atime=1708027150.707150928 ctime=1708027150.707149781 fid=326039 flags=0 gid=0 layout_id=0 link_name= locations= mtime=1708027150.707152448 name=whoami path=/eos/dev/proc/whoami pid=16115 size=4096 stime=0.0 uid=0 unlink_locations= xattr.sys.proc=mgm.cmd=whoami&mgm.format=fuse xs=
cid=16158 ctime=1708027150.695804580 flags=0 gid=2 mode=40770 mtime=0.0 name=archive parent_id=16115 path=/eos/dev/proc/archive/ stime=0.0 tree_size=0 uid=2
cid=16159 ctime=1708027150.698147171 flags=0 gid=2 mode=40755 mtime=0.0 name=clone parent_id=16115 path=/eos/dev/proc/clone/ stime=0.0 tree_size=0 uid=2
cid=16156 ctime=1708027150.693008202 flags=0 gid=2 mode=40770 mtime=0.0 name=conversion parent_id=16115 path=/eos/dev/proc/conversion/ stime=0.0 tree_size=0 uid=2
cid=16157 ctime=1708027150.695624315 flags=0 gid=0 mode=40770 mtime=0.0 name=devices parent_id=16115 path=/eos/dev/proc/devices/ stime=0.0 tree_size=0 uid=0
cid=16116 ctime=1708024661.389971358 flags=1 gid=0 mode=40755 mtime=0.0 name=recycle parent_id=16115 path=/eos/dev/proc/recycle/ stime=0.0 tree_size=0 uid=0
cid=16163 ctime=1708027150.700888867 flags=0 gid=0 mode=40755 mtime=0.0 name=tape-rest-api parent_id=16115 path=/eos/dev/proc/tape-rest-api/ stime=0.0 tree_size=0 uid=0
cid=16164 ctime=1708027150.700914227 flags=0 gid=0 mode=40755 mtime=0.0 name=bulkrequests parent_id=16163 path=/eos/dev/proc/tape-rest-api/bulkrequests/ stime=0.0 tree_size=0 uid=0
cid=16165 ctime=1708027150.700934288 flags=0 gid=2 mode=40700 mtime=0.0 name=evict parent_id=16164 path=/eos/dev/proc/tape-rest-api/bulkrequests/evict/ stime=0.0 tree_size=0 uid=2
cid=16166 ctime=1708027150.703913151 flags=0 gid=2 mode=40700 mtime=0.0 name=stage parent_id=16164 path=/eos/dev/proc/tape-rest-api/bulkrequests/stage/ stime=0.0 tree_size=0 uid=2
cid=16162 ctime=1708027150.700733507 flags=0 gid=0 mode=40700 mtime=0.0 name=token parent_id=16115 path=/eos/dev/proc/token/ stime=0.0 tree_size=0 uid=0
cid=16161 ctime=1708027150.698501079 flags=0 gid=0 mode=40700 mtime=0.0 name=tracker parent_id=16115 path=/eos/dev/proc/tracker/ stime=0.0 tree_size=0 uid=2
cid=16160 ctime=1708027150.698428202 flags=0 gid=0 mode=40700 mtime=0.0 name=workflow parent_id=16115 path=/eos/dev/proc/workflow/ stime=0.0 tree_size=0 uid=2
cid=16110 ctime=1710089290.646543836 flags=0 gid=0 mode=40755 mtime=1710090580.323991953 name=lhcb parent_id=16109 path=/eos/lhcb/ stime=0.0 tree_size=0 uid=0 xattr.sys.forced.blocksize=4k xattr.sys.forced.checksum=adler xattr.sys.forced.layout=replica xattr.sys.forced.nstripes=2 xattr.sys.forced.space=default
cid=16117 ctime=1708024661.399660340 flags=1 gid=0 mode=40755 mtime=0.0 name=laf parent_id=16110 path=/eos/lhcb/laf/ stime=0.0 tree_size=0 uid=0
cid=16113 ctime=1708024661.376943929 flags=1 gid=0 mode=40755 mtime=0.0 name=qinning parent_id=16110 path=/eos/lhcb/qinning/ stime=0.0 tree_size=0 uid=0
cid=16112 ctime=1708024661.356247242 flags=1 gid=0 mode=40755 mtime=0.0 name=yinghua parent_id=16110 path=/eos/lhcb/yinghua/ stime=0.0 tree_size=0 uid=0

Changing the --members value in the above command yields the same output, with only empty top-level directories shown.
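To pick out the empty containers from a scan like the one above, a minimal Python sketch follows. It assumes each record is a single line of whitespace-separated `key=value` tokens, as in the output shown, so names or paths containing spaces would not be handled; `parse_scan_line` and `empty_containers` are hypothetical helpers, not part of eos-ns-inspect.

```python
def parse_scan_line(line):
    """Split one scan record into a dict, cutting each token at its first '='."""
    record = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:
            record[key] = value
    return record


def empty_containers(scan_output):
    """Return the paths of container records (cid=...) with tree_size=0."""
    paths = []
    for line in scan_output.splitlines():
        record = parse_scan_line(line)
        if "cid" in record and record.get("tree_size") == "0":
            paths.append(record["path"])
    return paths


file_line = (
    "atime=1708027150.710055042 ctime=1708027150.710054833 fid=326041 flags=0 "
    "gid=0 layout_id=0 link_name= locations= mtime=1708027150.710055380 "
    "name=quota path=/eos/dev/proc/quota pid=16115 size=4096 stime=0.0 uid=0 "
    "unlink_locations= xattr.sys.proc=mgm.cmd=quota&mgm.subcmd=lsuser&mgm.format=fuse xs="
)
sample = "\n".join([
    "[QCLIENT - INFO - getNext:57] Received redirection to hepfarm40.hep.tsinghua.edu.cn:7777",
    "cid=16109 ctime=1708024661.326694744 flags=0 gid=0 mode=40755 mtime=0.0 "
    "name=eos parent_id=1 path=/eos/ stime=0.0 tree_size=20480 uid=0",
    file_line,
    "cid=16117 ctime=1708024661.399660340 flags=1 gid=0 mode=40755 mtime=0.0 "
    "name=laf parent_id=16110 path=/eos/lhcb/laf/ stime=0.0 tree_size=0 uid=0",
])
print(empty_containers(sample))
```

Splitting at the first `=` only keeps values such as `xattr.sys.proc=mgm.cmd=quota&...` intact.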

I have one further question. Does this issue mean that the “state” of an EOS instance is stored not only in QDB, but also (in part) somewhere in the MGM? Before this issue I thought that as long as the QDBs are synced, all information should be the same.

esindril · March 25, 2024, 2:19pm

Hi Anfeng,

No, there is no actual state that is saved in the MGM and not in QDB, unless there was some issue with the communication between the MGM and the QDB. So, at this point, does the information on node D, which is currently the MGM master, match the contents of QDB?
I can’t easily figure that out from the above output.

To simplify things and make sure everything is working as expected please do the following on node D:
eos mkdir /eos/opstest/
eos ls -lrta /eos/opstest/
eos-ns-inspect scan --path /eos/ --maxdepth 1

I would expect to see the opstest directory in both cases.

Cheers,
Elvin

anfeng · March 25, 2024, 4:35pm

Hi Elvin,

Thanks for the clarification. Indeed there might be some issue with the communication between the MGM and the QDB.

The process of executing the proposed commands is as follows:

➜  ~ sudo eos mkdir /eos/opstest
➜  ~ sudo eos ls -lrta /eos/opstest/
drwxr-xr-x   1 root     root                0 Mar 26 00:16 .
drwxr-xr-x   1 root     root            20480 Mar 26 00:16 ..
➜  ~ sudo eos-ns-inspect scan --path /eos/ --maxdepth 1
--members is required
Run with --help for more information.
➜  ~ sudo eos-ns-inspect scan --members hepfarm41.hep.tsinghua.edu.cn:7777 --path /eos/ --maxdepth 1 --password-file /etc/eos.keytab
[QCLIENT - INFO - getNext:57] Received redirection to hepfarm40.hep.tsinghua.edu.cn:7777
cid=16109 ctime=1708024661.326694744 flags=0 gid=0 mode=40755 mtime=1711383408.44324158 name=eos parent_id=1 path=/eos/ stime=0.0 tree_size=20480 uid=0
➜  ~ sudo eos-ns-inspect scan --members hepfarm40.hep.tsinghua.edu.cn:7777 --path /eos/ --maxdepth 1 --password-file /etc/eos.keytab
cid=16109 ctime=1708024661.326694744 flags=0 gid=0 mode=40755 mtime=1711383408.44324158 name=eos parent_id=1 path=/eos/ stime=0.0 tree_size=20480 uid=0
➜  ~

Note that I needed to provide the --members and --password-file parameters.

The current MGM master is node D (hepfarm41.hep.tsinghua.edu.cn) while the current QDB leader is node C (hepfarm40.hep.tsinghua.edu.cn):

➜  ~ sudo eos daemon config qdb qdb info
[putenv] EOS_USE_MQ_ON_QDB=1
[putenv] EOS_XROOTD=/opt/eos/xrootd/
[putenv] GEO_TAG=local
[putenv] INSTANCE_NAME=eosdev
[putenv] LD_LIBRARY_PATH=/opt/eos/xrootd//lib64:/opt/eos/grpc/lib64
[putenv] LD_PRELOAD=/usr/lib64/libjemalloc.so
[putenv] QDB_CLUSTER_ID=eosdev
[putenv] QDB_HOST=hepfarm40.hep.tsinghua.edu.cn
[putenv] QDB_NODE=hepfarm40.hep.tsinghua.edu.cn:7777
[putenv] QDB_NODES=hepfarm40.hep.tsinghua.edu.cn:7777
[putenv] QDB_PATH=/var/lib/qdb
[putenv] QDB_PORT=7777
[putenv] SERVER_HOST=hepfarm40.hep.tsinghua.edu.cn
 1) TERM 28
 2) LOG-START 0
 3) LOG-SIZE 19734487
 4) LEADER hepfarm40.hep.tsinghua.edu.cn:7777
 5) CLUSTER-ID eosdev
 6) COMMIT-INDEX 19734486
 7) LAST-APPLIED 19734486
 8) BLOCKED-WRITES 0
 9) LAST-STATE-CHANGE 3345882 (1 months, 8 days, 17 hours, 24 minutes, 42 seconds)
10) ----------
11) MYSELF hepfarm40.hep.tsinghua.edu.cn:7777
12) VERSION 5.2.14.1
13) STATUS LEADER
14) NODE-HEALTH GREEN
15) JOURNAL-FSYNC-POLICY sync-important-updates
16) ----------
17) MEMBERSHIP-EPOCH 72376
18) NODES hepfarm41.hep.tsinghua.edu.cn:7777,hepfarm40.hep.tsinghua.edu.cn:7777,hepfarm30.hep.tsinghua.edu.cn:7777,hepfarm21.hep.tsinghua.edu.cn:7777
19) OBSERVERS
20) QUORUM-SIZE 3
21) ----------
22) REPLICA hepfarm21.hep.tsinghua.edu.cn:7777 | ONLINE | UP-TO-DATE | LOG-SIZE 19734487 | VERSION 5.2.14.1
23) REPLICA hepfarm30.hep.tsinghua.edu.cn:7777 | ONLINE | UP-TO-DATE | LOG-SIZE 19734487 | VERSION 5.2.17.1
24) REPLICA hepfarm41.hep.tsinghua.edu.cn:7777 | ONLINE | UP-TO-DATE | LOG-SIZE 19734487 | VERSION 5.2.17.1
info: run 'export REDISCLI_AUTH=`cat /etc/eos.keytab`; redis-cli -p `cat /var/run/eos/xrd.cf.qdb|grep xrd.port | cut -d ' ' -f 2` <<< raft-info' retc=0

The output of eos-ns-inspect scan indeed does not match the eos ls output. When I run

sudo eos-ns-inspect scan --members hepfarm40.hep.tsinghua.edu.cn:7777 --path /eos/ --maxdepth 1 --password-file /etc/eos.keytab

on node D (hepfarm41.hep.tsinghua.edu.cn), the following lines are appended to the QDB log /var/log/eos/qdb/xrdlog.qdb on the QDB leader, node C (hepfarm40.hep.tsinghua.edu.cn):

[1711384205105] INFO: New link from hepfarm41.hep.tsinghua.edu.cn [83b6bf07-6ca0-4411-a0f4-fb8ce462833e]
[1711384205118] INFO: Shutting down link from hepfarm41.hep.tsinghua.edu.cn [83b6bf07-6ca0-4411-a0f4-fb8ce462833e]
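To read these two log lines more easily, here is a minimal Python sketch that converts the bracketed numbers to UTC times and computes how long the link stayed open. It assumes the bracketed value is a Unix timestamp in milliseconds, which is inferred from its magnitude rather than taken from QuarkDB documentation; `parse_qdb_log_line` is a hypothetical helper.

```python
import re
from datetime import datetime, timezone

# Assumed format: "[<epoch milliseconds>] <LEVEL>: <message>"
LOG_LINE = re.compile(r"^\[(\d+)\]\s+(\w+):\s+(.*)$")


def parse_qdb_log_line(line):
    """Return (millis, utc_datetime, level, message), or None if no match."""
    match = LOG_LINE.match(line.strip())
    if not match:
        return None
    millis = int(match.group(1))
    when = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
    return millis, when, match.group(2), match.group(3)


opened = parse_qdb_log_line(
    "[1711384205105] INFO: New link from hepfarm41.hep.tsinghua.edu.cn "
    "[83b6bf07-6ca0-4411-a0f4-fb8ce462833e]"
)
closed = parse_qdb_log_line(
    "[1711384205118] INFO: Shutting down link from hepfarm41.hep.tsinghua.edu.cn "
    "[83b6bf07-6ca0-4411-a0f4-fb8ce462833e]"
)
duration_ms = closed[0] - opened[0]
print(opened[1].isoformat(), duration_ms)
```

Under that assumption, the link from node D was opened on 2024-03-25 at 16:30:05 UTC and shut down 13 ms later.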

Best regards,
Anfeng
