EOS instance name is not set on FSTs

Hello,

We’ve set up a test 3-node EOS QuarkDB/MGM/FST cluster. In /etc/sysconfig/eos_env we have: EOS_INSTANCE_NAME=eos-tier1-dev

When eos@fst starts, the instance name is initially set to this value, but it is then reset to “eosdev”, after which the process aborts. Please see the output from the FST xrd.log pasted below.

Many thanks,

George

210225 14:17:55 25007 Starting on Linux 3.10.0-1160.11.1.el7.x86_64
Copr. 2004-2012 Stanford University, xrd version v4.11.3
++++++ xrootd fst@host-172-16-103-55.nubes.stfc.ac.uk initialization started.
Config using configuration file /etc/xrd.cf.fst
=====> xrd.network keepalive
=====> xrd.port 1095
Config maximum number of connections restricted to 65000
Copr. 2012 Stanford University, xrootd protocol 4.0.0 version v4.11.3
++++++ xrootd protocol initialization started.
=====> xrootd.fslib -2 libXrdEosFst.so
=====> xrootd.async off nosf
=====> xrootd.redirect host-172-16-103-55.nubes.stfc.ac.uk:1094 chksum
=====> xrootd.seclib libXrdSec.so
=====> all.export / nolock
Config exporting /
Plugin loaded
++++++ Authentication system initialization started.
Plugin loaded
=====> sec.protocol unix
Plugin loaded
=====> sec.protocol sss -c /etc/eos.keytab -s /etc/eos.keytab
=====> sec.protbind * only sss
Config 3 authentication directives processed in /etc/xrd.cf.fst
------ Authentication system initialization completed.
++++++ Protection system initialization started.
Config warning: Security level is set to none; request protection disabled!
Config Local protection level: none
Config Remote protection level: none
------ Protection system initialization completed.
Config Routing for 172.16.103.55: local pub4 prv4
Config Route all4: 172.16.103.55 Dest=[::172.16.103.55]:1095
Plugin loaded
++++++ © 2010 CERN/IT-DSS FstOfs (Object Storage File System) 4.7.7
++++++ File system initialization started.
=====> ofs.persist off
=====> ofs.osslib libEosFstOss.so
=====> ofs.tpc pgm /usr/bin/xrdcp
Plugin No such file or directory loading osslib libEosFstOss-4.so
Config Falling back to using libEosFstOss.so
Plugin loaded
210225 14:17:55 time=1614262675.895514 func=Configure level=INFO logid=4136ce68-7774-11eb-82a6-facaad167418 unit=fstoss@localhost tid=00007f2fa57d4740 source=XrdFstOss:170 tident= sec= uid=0 gid=0 name= geo="" preread depth=0, queue_size=0 and bytes=0
Config effective /etc/xrd.cf.fst ofs configuration:
all.role server
ofs.maxdelay 60
ofs.persist off hold 600
ofs.trace 0
ofs.osslib libEosFstOss.so
------ File system server initialization completed.
=====> fstofs enforces SSS authentication for XROOT clients
=====> fstofs.qdbcluster : host-172-16-112-243.nubes.stfc.ac.uk:9999 host-172-16-112-247.nubes.stfc.ac.uk:9999 host-172-16-103-55.nubes.stfc.ac.uk:9999
=====> fstofs.qdbpassword length : 281
=====> fstofs.autoboot : true
=====> fstofs.broker : root://localhost:1097//eos/host-172-16-103-55.nubes.stfc.ac.uk:1095/fst
=====> eoscp-log : /var/log/eos/fst/eoscp.log
=====> fstofs.defaultreceiverqueue : /eos/*/mgm
=====> fstofs.authdir : /var/eos/auth/
210225 14:17:55 time=1614262675.916192 func=Storage level=INFO logid=FstOfsStorage unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2fa57d4740 source=Storage:110 tident= sec= uid=0 gid=0 name= geo="" starting scrubbing thread
210225 14:17:55 time=1614262675.916272 func=Storage level=INFO logid=FstOfsStorage unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2fa57d4740 source=Storage:121 tident= sec= uid=0 gid=0 name= geo="" starting trim thread
210225 14:17:55 time=1614262675.916312 func=Storage level=INFO logid=FstOfsStorage unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2fa57d4740 source=Storage:131 tident= sec= uid=0 gid=0 name= geo="" starting deletion thread
210225 14:17:55 time=1614262675.916325 func=Scrub level=INFO logid=FstOfsStorage unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f884ee700 source=Scrub:39 tident= sec= uid=0 gid=0 name= geo="" msg=“create scrubbing pattern …”
210225 14:17:55 time=1614262675.916354 func=Storage level=INFO logid=FstOfsStorage unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2fa57d4740 source=Storage:141 tident= sec= uid=0 gid=0 name= geo="" starting report thread
210225 14:17:55 time=1614262675.916386 func=Storage level=INFO logid=FstOfsStorage unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2fa57d4740 source=Storage:151 tident= sec= uid=0 gid=0 name= geo="" starting error report thread
210225 14:17:55 time=1614262675.916413 func=Storage level=INFO logid=FstOfsStorage unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2fa57d4740 source=Storage:161 tident= sec= uid=0 gid=0 name= geo="" starting verification thread
210225 14:17:55 time=1614262675.916447 func=Storage level=INFO logid=FstOfsStorage unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2fa57d4740 source=Storage:171 tident= sec= uid=0 gid=0 name= geo="" starting filesystem communication thread
210225 14:17:55 time=1614262675.916480 func=Storage level=INFO logid=FstOfsStorage unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2fa57d4740 source=Storage:181 tident= sec= uid=0 gid=0 name= geo="" starting daemon supervisor thread
210225 14:17:55 time=1614262675.916510 func=Storage level=INFO logid=FstOfsStorage unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2fa57d4740 source=Storage:191 tident= sec= uid=0 gid=0 name= geo="" starting filesystem publishing thread
210225 14:17:55 time=1614262675.916539 func=Storage level=INFO logid=FstOfsStorage unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2fa57d4740 source=Storage:202 tident= sec= uid=0 gid=0 name= geo="" starting filesystem balancer thread
210225 14:17:55 time=1614262675.916569 func=Storage level=INFO logid=FstOfsStorage unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2fa57d4740 source=Storage:212 tident= sec= uid=0 gid=0 name= geo="" starting filesystem transaction cleaner thread
210225 14:17:55 time=1614262675.916600 func=Storage level=INFO logid=FstOfsStorage unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2fa57d4740 source=Storage:222 tident= sec= uid=0 gid=0 name= geo="" starting mgm synchronization thread
210225 14:17:55 time=1614262675.916629 func=Storage level=INFO logid=FstOfsStorage unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2fa57d4740 source=Storage:233 tident= sec= uid=0 gid=0 name= geo="" starting /var/ partition monitor thread …
210225 14:17:55 time=1614262675.916658 func=Storage level=INFO logid=FstOfsStorage unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2fa57d4740 source=Storage:243 tident= sec= uid=0 gid=0 name= geo="" enabling net/io load monitor
210225 14:17:55 time=1614262675.916689 func=Storage level=INFO logid=FstOfsStorage unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2fa57d4740 source=Storage:245 tident= sec= uid=0 gid=0 name= geo="" enabling local disk S.M.A.R.T attribute monitor
210225 14:17:55 time=1614262675.916752 func=getFstNodeConfigQueue level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f882ec700 source=Config:44 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“waiting for config queue in Remover …”
210225 14:17:55 time=1614262675.917046 func=Monitor level=INFO logid=413a1956-7774-11eb-9489-facaad167418 unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f821f6700 source=MonitorVarPartition:81 tident= sec= uid=0 gid=0 name= geo="" msg=“fst partition monitor activated”
=====> fstofs.metalogdir : /var/eos/md/
210225 14:17:55 time=1614262675.917161 func=Supervisor level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f82dfb700 source=Supervisor:39 tident= sec=(null) uid=99 gid=99 name=- geo="" Supervisor activated …
210225 14:17:55 time=1614262675.917402 func=Publish level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f82cfa700 source=Publish:416 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“publisher activated”
210225 14:17:55 time=1614262675.917476 func=AddBroker level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2fa57d4740 source=XrdMqClient:173 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“add broker” url=“root://localhost:1097//eos/host-172-16-103-55.nubes.stfc.ac.uk:1095/fst?xmqclient.advisory.status=0&xmqclient.advisory.query=0&xmqclient.advisory.flushbacklog=0”
210225 14:17:55 time=1614262675.917732 func=Cleaner level=INFO logid=FstOfsStorage unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f823f8700 source=Cleaner:36 tident= sec= uid=0 gid=0 name= geo="" msg=“start cleaner”
210225 14:17:55 time=1614262675.918656 func=getFstNodeConfigQueue level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f823f8700 source=Config:44 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“waiting for config queue in Cleaner …”
210225 14:17:55 time=1614262675.918683 func=Communicator level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:389 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“starting communicator thread”
210225 14:17:55 time=1614262675.919141 func=Balancer level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f824f9700 source=Balancer:290 tident= sec=(null) uid=99 gid=99 name=- geo="" Start Balancer …
210225 14:17:55 time=1614262675.919179 func=getFstNodeConfigQueue level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f824f9700 source=Config:44 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“waiting for config queue in Balancer …”
210225 14:17:55 time=1614262675.919209 func=StartNotifyCurrentThread level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=XrdMqSharedObject:1875 tident= sec=(null) uid=99 gid=99 name=- geo="" Starting notification
210225 14:17:55 time=1614262675.919538 func=SomListener level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f817f3700 source=XrdMqSharedObject:2021 tident= sec=(null) uid=99 gid=99 name=- geo="" mgm=“starting SOM listener”
210225 14:17:55 time=1614262675.919601 func=Communicator level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:441 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“shared object notification” type=0 subject="/eos/host-172-16-103-55.nubes.stfc.ac.uk:1095/fst/gw/txqueue/txq"
210225 14:17:55 time=1614262675.920749 func=Subscribe level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2fa57d4740 source=XrdMqClient:620 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“successfully subscribed to broker” url=“root://localhost:1097//eos/host-172-16-103-55.nubes.stfc.ac.uk:1095/fst?xmqclient.advisory.status=0&xmqclient.advisory.query=0&xmqclient.advisory.flushbacklog=0”

mq messaging: starting thread

210225 14:17:55 time=1614262675.920886 func=RequestBroadcasts level=NOTE logid=41367a44-7774-11eb-82a6-facaad167418 unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2fa57d4740 source=XrdFstOfs:1842 tident= sec= uid=0 gid=0 name= geo="" msg=“requesting broadcasts”
210225 14:17:55 time=1614262675.921774 func=Communicator level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:441 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“shared object notification” type=0 subject="/host-172-16-103-55.nubes.stfc.ac.uk:1095"
210225 14:17:55 time=1614262675.921808 func=Communicator level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:465 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“no action on subject creation” qpath="
/host-172-16-103-55.nubes.stfc.ac.uk:1095" own_id="/eos/host-172-16-103-55.nubes.stfc.ac.uk:1095/fst"
210225 14:17:55 time=1614262675.921823 func=Communicator level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:441 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“shared object notification” type=0 subject="/host-172-16-103-55.nubes.stfc.ac.uk:1095/fst/gw/txqueue/txq"
210225 14:17:55 time=1614262675.921836 func=Communicator level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:441 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“shared object notification” type=0 subject="/eos/host-172-16-103-55.nubes.stfc.ac.uk:1095/fst/
"
210225 14:17:55 time=1614262675.922699 func=Configure level=NOTE logid=41367a44-7774-11eb-82a6-facaad167418 unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2fa57d4740 source=XrdFstOfs:858 tident= sec= uid=0 gid=0 name= geo="" FST_HOST=host-172-16-103-55.nubes.stfc.ac.uk FST_PORT=1095 FST_HTTP_PORT=8001 VERSION=4.7.7 RELEASE=1 KEYTABADLER=b7c74b39
210225 14:17:55 time=1614262675.922893 func=Scrub level=INFO logid=FstOfsStorage unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f884ee700 source=Scrub:48 tident= sec= uid=0 gid=0 name= geo="" msg=“start scrubbing”
Config warning: asynchronous I/O has been disabled!
Config warning: sendfile I/O has been disabled!
Config warning: ‘xrootd.prepare logdir’ not specified; prepare tracking disabled.
------ xrootd protocol initialization completed.
------ xrootd fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 initialization completed.
210225 14:17:55 time=1614262675.924877 func=RequestBroadcasts level=NOTE logid=41367a44-7774-11eb-82a6-facaad167418 unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f80ff2700 source=XrdFstOfs:1842 tident= sec= uid=0 gid=0 name= geo="" msg=“requesting broadcasts”
210225 14:17:55 time=1614262675.939782 func=Publish level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f82cfa700 source=Publish:426 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“publish networkspeed=1.00 GB/s”
210225 14:17:55 time=1614262675.939823 func=getFstNodeConfigQueue level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f82cfa700 source=Config:44 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“waiting for config queue in Publish …”
210225 14:17:56 time=1614262676.922879 func=Run level=NOTE logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f537fe700 source=HttpServer:98 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“starting http server” mode=“thread-per-connection”
210225 14:17:56 time=1614262676.923166 func=Run level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f537fe700 source=HttpServer:162 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“start of micro httpd succeeded [port=8001]”
210225 14:17:57 time=1614262677.917043 func=getFstNodeConfigQueue level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f882ec700 source=Config:44 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“waiting for config queue in Remover …”
210225 14:17:57 time=1614262677.918851 func=getFstNodeConfigQueue level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f823f8700 source=Config:44 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“waiting for config queue in Cleaner …”
210225 14:17:57 time=1614262677.919309 func=getFstNodeConfigQueue level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f824f9700 source=Config:44 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“waiting for config queue in Balancer …”
210225 14:17:57 time=1614262677.928819 func=Communicator level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:441 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“shared object notification” type=0 subject="/config/eos-tier1-dev/node/host-172-16-103-55.nubes.stfc.ac.uk:1095"
210225 14:17:57 time=1614262677.929005 func=set level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=InstanceName:39 tident= sec=(null) uid=99 gid=99 name=- geo="" Setting global instance name => eos-tier1-dev
210225 14:17:57 time=1614262677.929093 func=Communicator level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:461 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“storing config queue name” qpath="/config/eos-tier1-dev/node/host-172-16-103-55.nubes.stfc.ac.uk:1095"
210225 14:17:57 time=1614262677.929193 func=Communicator level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:441 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“shared object notification” type=2 subject="/config/eos-tier1-dev/node/host-172-16-103-55.nubes.stfc.ac.uk:1095;debug.level"
210225 14:17:57 time=1614262677.929294 func=ProcessFstConfigChange level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:163 tident= sec=(null) uid=99 gid=99 name=- geo="" FST configuration change - key=debug.level, value=info
210225 14:17:57 time=1614262677.929341 func=Communicator level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:441 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“shared object notification” type=2 subject="/config/eos-tier1-dev/node/host-172-16-103-55.nubes.stfc.ac.uk:1095;gw.ntx"
210225 14:17:57 time=1614262677.929395 func=ProcessFstConfigChange level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:163 tident= sec=(null) uid=99 gid=99 name=- geo="" FST configuration change - key=gw.ntx, value=10
210225 14:17:57 time=1614262677.929427 func=ProcessFstConfigChange level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:235 tident= sec=(null) uid=99 gid=99 name=- geo="" cmd=set gw.ntx=10
210225 14:17:57 time=1614262677.929470 func=Communicator level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:441 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“shared object notification” type=2 subject="/config/eos-tier1-dev/node/host-172-16-103-55.nubes.stfc.ac.uk:1095;gw.rate"
210225 14:17:57 time=1614262677.929536 func=ProcessFstConfigChange level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:163 tident= sec=(null) uid=99 gid=99 name=- geo="" FST configuration change - key=gw.rate, value=120
210225 14:17:57 time=1614262677.929573 func=Process level=INFO logid=FstOfsMessaging unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f80ff2700 source=Messaging:78 tident= sec= uid=0 gid=0 name= geo="" msg=“no pairs in message body” body=“mqsh.cmd=bcreply&mqsh.subject=/eos/host-172-16-103-55.nubes.stfc.ac.uk:1095/fst/gw/txqueue/txq&mqsh.type=queue&mqsh.pairs=”
210225 14:17:57 time=1614262677.929633 func=ProcessFstConfigChange level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:227 tident= sec=(null) uid=99 gid=99 name=- geo="" cmd=set gw.rate=120
210225 14:17:57 time=1614262677.929677 func=Communicator level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:441 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“shared object notification” type=2 subject="/config/eos-tier1-dev/node/host-172-16-103-55.nubes.stfc.ac.uk:1095;manager"
210225 14:17:57 time=1614262677.929771 func=ProcessFstConfigChange level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:163 tident= sec=(null) uid=99 gid=99 name=- geo="" FST configuration change - key=manager, value:1094=host-172-16-103-55.nubes.stfc.ac.uk
210225 14:17:57 time=1614262677.929805 func=ProcessFstConfigChange level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:172 tident= sec=(null) uid=99 gid=99 name=- geo="" manager:1094=host-172-16-103-55.nubes.stfc.ac.uk
210225 14:17:57 time=1614262677.929866 func=Communicator level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:441 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“shared object notification” type=2 subject="/config/eos-tier1-dev/node/host-172-16-103-55.nubes.stfc.ac.uk:1095;symkey"
210225 14:17:57 time=1614262677.929919 func=ProcessFstConfigChange level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:163 tident= sec=(null) uid=99 gid=99 name=- geo="" FST configuration change - key=symkey, value=tbobnU9k377G5YZ6xGcCE0+asbc=
210225 14:17:57 time=1614262677.929951 func=ProcessFstConfigChange level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:166 tident= sec=(null) uid=99 gid=99 name=- geo="" symkey=tbobnU9k377G5YZ6xGcCE0+asbc=
210225 14:17:57 time=1614262677.930128 func=Communicator level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:441 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“shared object notification” type=2 subject="/config/eos-tier1-dev/node/host-172-16-103-55.nubes.stfc.ac.uk:1095;txgw"
210225 14:17:57 time=1614262677.930184 func=ProcessFstConfigChange level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:163 tident= sec=(null) uid=99 gid=99 name=- geo="" FST configuration change - key=txgw, value=off
210225 14:17:57 time=1614262677.930215 func=ProcessFstConfigChange level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:208 tident= sec=(null) uid=99 gid=99 name=- geo="" txgw=off
210225 14:17:57 time=1614262677.930285 func=ProcessFstConfigChange level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:213 tident= sec=(null) uid=99 gid=99 name=- geo="" Stopping transfer multiplexer
210225 14:17:57 time=1614262677.930316 func=Communicator level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=Communicator:441 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“shared object notification” type=0 subject="/config/eosdev/node/host-172-16-103-55.nubes.stfc.ac.uk:1095"
210225 14:17:57 time=1614262677.930377 func=set level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007f2f835fc700 source=InstanceName:39 tident= sec=(null) uid=99 gid=99 name=- geo="" Setting global instance name => eosdev
assertion violation in static void eos::common::InstanceName::set(const string&) at /root/rpmbuild/BUILD/eos-4.7.7-1/common/InstanceName.cc:40, condition is not true: mInstanceName.empty()

Hi George,

Could you check if you have the following env variable defined in /etc/sysconfig/eos_env? You should remove it as it enables some experimental functionality which is not yet finished:
EOS_USE_MQ_ON_QDB=1

Cheers,
Elvin

Hi Elvin,

Thanks for this. We don’t have this env variable. These are the contents of /etc/sysconfig/eos_env (from one of the nodes):

EOS_BROKER_URL=root://localhost:1097//eos/
EOS_GEOTAG="rack1"
EOS_INSTANCE_NAME=eos-tier1-dev
EOS_MGM_ALIAS=host-172-16-103-55.nubes.stfc.ac.uk
EOS_QUARKDB_HOSTPORT=host-172-16-103-55:9999
EOS_QUARKDB_PASSWD=/etc/eos.keytab
EOS_USE_QDB_MASTER=1
XRD_ROLES="mq mgm fst quarkdb"

Actually, we started with “eosdev” as the instance name, in which case eos@fst did start (we registered filesystems, groups, etc.). At some point we changed it to eos-tier1-dev, and the above error appeared. Has something not been cleaned up from before?

George
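A quick way to double-check an environment file like the one above for a particular variable is sketched below. This is only an illustration: `parse_env` is a hypothetical helper, and the sample text is a shortened copy of the listing rather than the real file.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Tolerate an optional leading "export ".
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"')
    return env


# Shortened sample taken from the listing above (not the full file).
sample = """\
EOS_GEOTAG="rack1"
EOS_INSTANCE_NAME=eos-tier1-dev
EOS_USE_QDB_MASTER=1
"""

env = parse_env(sample)
print("EOS_USE_MQ_ON_QDB defined:", "EOS_USE_MQ_ON_QDB" in env)
print("EOS_INSTANCE_NAME:", env.get("EOS_INSTANCE_NAME"))
```

To check a real node, the file contents can be read and passed to `parse_env` in place of `sample`.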

Hi,

Sorry for the hassle. Do you have any thoughts on this? I removed EOS_MGM_ALIAS from /etc/sysconfig/eos_env, but it made no difference.

Not sure if this is relevant at all, but when I enter the EOS environment, I see that the instance name is different.

---------------------------------------------------------------------------

EOS Copyright © 2011-2019 CERN/Switzerland

This program comes with ABSOLUTELY NO WARRANTY; for details type `license’.

This is free software, and you are welcome to redistribute it

under certain conditions; type `license’ for details.

---------------------------------------------------------------------------

EOS_INSTANCE=testinstance
EOS_SERVER_VERSION=4.7.7 EOS_SERVER_RELEASE=1
EOS_CLIENT_VERSION=4.7.7 EOS_CLIENT_RELEASE=1
EOS Console [root://localhost] |/>

What does this mean? Do I need to recreate the space or something?

Thanks again,

George

Hi George,

The name of the instance should contain only letters and numbers and always start with eos. Therefore, as a first step, I would suggest using a different name for the instance, like eostestdev1.

Cheers,
Elvin
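The naming rule above can be sketched as a small check. This is one assumed reading of the rule ("eos" followed only by ASCII letters and digits), not the validation EOS itself performs; `is_valid_instance_name` is a hypothetical helper.

```python
import re

# Assumed interpretation: the name starts with "eos" and is followed
# only by ASCII letters and digits (no dashes, dots or underscores).
_INSTANCE_RE = re.compile(r"eos[A-Za-z0-9]*")


def is_valid_instance_name(name):
    """Return True if the name matches the assumed naming rule."""
    return _INSTANCE_RE.fullmatch(name) is not None


# Names that appear in this thread.
for candidate in ("eos-tier1-dev", "eostier1dev", "eostestdev1", "testinstance"):
    print(candidate, is_valid_instance_name(candidate))
```

Under this reading, eos-tier1-dev fails because of the dashes, while eostier1dev and eostestdev1 pass.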

Thanks Elvin. I changed the instance name to eostier1dev, updated the sss keytabs and restarted the EOS services. I also set the “mgmofs.instance” directive to this value. The MGM sets the instance name:

=====> mgmofs.instance : eostier1dev
210226 13:20:02 time=1614345602.814813 func=set level=INFO logid=static… unit=mgm@host-172-16-103-55.nubes.stfc.ac.uk:1094 tid=00007fbb3bf0f740 source=InstanceName:39 tident= sec=(null) uid=99 gid=99 name=- geo="" Setting global instance name => eostier1dev

I tried to restart the FST (on the master MGM node), and again it appears to set the name to eostier1dev and then somehow reverts to eosdev, after which it cannot start (see below).

Why does the FST still “remember” the eosdev value? How can I clean it up and force the new one? Is it because the space/filesystems/groups were created with the old value and need to be recreated?

226 13:24:22 time=1614345862.213587 func=Communicator level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007efdc67f3700 source=Communicator:441 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“shared object notification” type=0 subject="/config/eostier1dev/node/host-172-16-103-55.nubes.stfc.ac.uk:1095"
210226 13:24:22 time=1614345862.213764 func=set level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007efdc67f3700 source=InstanceName:39 tident= sec=(null) uid=99 gid=99 name=- geo="" Setting global instance name => eostier1dev
210226 13:24:22 time=1614345862.213867 func=Communicator level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007efdc67f3700 source=Communicator:461 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“storing config queue name” qpath="/config/eostier1dev/node/host-172-16-103-55.nubes.stfc.ac.uk:1095"
210226 13:24:22 time=1614345862.213949 func=Communicator level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007efdc67f3700 source=Communicator:441 tident= sec=(null) uid=99 gid=99 name=- geo="" msg=“shared object notification” type=0 subject="/config/eosdev/node/host-172-16-103-55.nubes.stfc.ac.uk:1095"
210226 13:24:22 time=1614345862.214008 func=set level=INFO logid=static… unit=fst@host-172-16-103-55.nubes.stfc.ac.uk:1095 tid=00007efdc67f3700 source=InstanceName:39 tident= sec=(null) uid=99 gid=99 name=- geo="" Setting global instance name => eosdev
assertion violation in static void eos::common::InstanceName::set(const string&) at /root/rpmbuild/BUILD/eos-4.7.7-1/common/InstanceName.cc:40, condition is not true: mInstanceName.empty()
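The pattern in the log above, two "Setting global instance name =>" lines with different values followed by the assertion, can be spotted with a short script. This is a minimal sketch: `instance_names_in_log` is a hypothetical helper, and the sample lines are abbreviated versions of the log lines, not verbatim copies.

```python
import re

# Matches the message printed by InstanceName in the FST log.
_SET_RE = re.compile(r"Setting global instance name => (\S+)")


def instance_names_in_log(lines):
    """Return the distinct instance names, in order of first appearance."""
    seen = []
    for line in lines:
        match = _SET_RE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Abbreviated sample lines modelled on the log excerpt above.
sample = [
    "func=set level=INFO source=InstanceName:39 Setting global instance name => eostier1dev",
    'func=Communicator level=INFO msg="storing config queue name"',
    "func=set level=INFO source=InstanceName:39 Setting global instance name => eosdev",
]

names = instance_names_in_log(sample)
if len(names) > 1:
    print("conflicting instance names:", ", ".join(names))
```

To scan a real log, an open file object for the FST xrd.log can be passed in place of `sample`, since the function only iterates over lines.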

Hi George,

Given that you want to start fresh, I would also recommend cleaning up the QuarkDB backend. You can simply connect with the redis client and do a del * - this will wipe out everything in the instance, so do this only if you are fine with that.

Cheers,
Elvin

Hi Elvin,

Many thanks for the suggestion, it did work. Actually, I wasn’t sure exactly how to connect with the redis client and run “del *”, so instead I deleted and recreated the /var/lib/quarkdb/node-1 directories.

Best,

George