Hello,
I upgraded the Helm chart from 0.1.2 to 0.1.6. I am using the image gitlab-registry.cern.ch/dss/eos/eos-all:5.0.31
With 0.1.6, eos status reports a critical health state:
[root@eos-mgm-0 /]# eos status
inserting eos-fst-0.eos-fst.eos.svc.kermes-dev.local:1095 0.00 0.00
inserting eos-fst-1.eos-fst.eos.svc.kermes-dev.local:1095 0.00 0.00
inserting eos-fst-2.eos-fst.eos.svc.kermes-dev.local:1095 0.00 0.00
instance: eos-kermes-dev
health: CRIT crit-contention:1 crit-groups:1
nodes: fst 3 online on
versions: mgm 1 5.0.31-1
qdb 3 5.0.3.5.0.31
fst 3 5.0.31-1
services:
eos-mgm-0.eos-mgm.eos.svc.kermes-dev.local:1094 (active)
namespace [booted] [0 s]
qdb [GREEN]
storage: data: default (8.57 PB total / 2.74 PB used 5.83 PB free / 5.83 PB avail )
meta-data: 5 files 15 directories
groups: 3 default on
filesystems: 1 default booted empty
2 default booted rw
scheduler: 80% (fill limit)
clients: 7 clients
auth: 4 sss (XRoot)
io: IN OUT
fuse : 0 clients (0 active) caps 0 locked 0
v:
t:
h:
And these CRIT messages appear in the logs:
[root@eos-mgm-0 /]# grep -ir crit /var/log/eos
/var/log/eos/mgm/xrdlog.mgm:230324 22:44:06 time=1679697846.196750 func=SetupGlobalConfig level=CRIT logid=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx unit=mgm@eos-mgm-0.eos-mgm.eos.svc.kermes-dev.local:1094 tid=00007f7b82bfc700 source=XrdMgmOfsConfigure:2459 tident=<single-exec> sec= uid=0 gid=0 name= geo="" msg="cannot add global config queue" qpath="/config/eos-kermes-dev/mgm/"
/var/log/eos/mgm/xrdlog.mgm:230324 22:44:06 time=1679697846.196784 func=SetupGlobalConfig level=CRIT logid=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx unit=mgm@eos-mgm-0.eos-mgm.eos.svc.kermes-dev.local:1094 tid=00007f7b82bfc700 source=XrdMgmOfsConfigure:2467 tident=<single-exec> sec= uid=0 gid=0 name= geo="" msg="cannot add global config queue" qpath="/config/eos-kermes-dev/all/"
/var/log/eos/mgm/xrdlog.mgm:230324 22:44:06 time=1679697846.196795 func=SetupGlobalConfig level=CRIT logid=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx unit=mgm@eos-mgm-0.eos-mgm.eos.svc.kermes-dev.local:1094 tid=00007f7b82bfc700 source=XrdMgmOfsConfigure:2475 tident=<single-exec> sec= uid=0 gid=0 name= geo="" msg="cannot add global config queue" qpath="/config/eos-kermes-dev/fst/"
/var/log/eos/mgm/Clients.log:230324 22:44:06 CRIT [00000/00000] ::SetupGlobalConfig msg="cannot add global config queue" qpath="/config/eos-kermes-dev/mgm/"
/var/log/eos/mgm/Clients.log:230324 22:44:06 CRIT [00000/00000] ::SetupGlobalConfig msg="cannot add global config queue" qpath="/config/eos-kermes-dev/all/"
/var/log/eos/mgm/Clients.log:230324 22:44:06 CRIT [00000/00000] ::SetupGlobalConfig msg="cannot add global config queue" qpath="/config/eos-kermes-dev/fst/"
Do these messages indicate a QDB problem? @ebocchi
Helm chart 0.1.5 also seems to work fine.
I noticed that the persistence options changed format. I am using:
fst:
  replicaCount: 3
  persistence:
    enabled: true
    storageClass: "-"
    accessModes:
      - ReadWriteOnce
    size: 1E
qdb:
  clusterID: "eos-kermes-dev"
  podAssignment:
    enablePodAntiAffinity: true
  persistence:
    enabled: true
and the PVCs are bound and mounted correctly.
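
In case it helps with triage, here is a rough sketch for condensing the CRIT lines above into one count per queue path. It assumes the logs sit under /var/log/eos, as in the grep above. The function name summarize_crit is just a helper I made up for this post, not an EOS tool.

```shell
# Hypothetical helper (not part of EOS): count CRIT log entries per qpath.
# $1 is the log directory; defaults to /var/log/eos as used in the grep above.
summarize_crit() {
  log_dir="${1:-/var/log/eos}"
  # -r: recurse into subdirectories, -h: drop the file name prefix
  grep -rh 'CRIT' "$log_dir" \
    | grep -o 'qpath="[^"]*"' \
    | sort \
    | uniq -c
}
```

Running summarize_crit with no argument on the MGM pod should print one line per distinct qpath, each prefixed with the number of CRIT entries that mention it.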