Hello,
I am stuck with a two-node MGM+MQ and FST setup, similar to the one described on the "EOS admin configuration" page of the EOS CITRINE documentation. The MGM+MQ server seems to run, but the FST server fails to connect to it.
In the following logs, grid05 is the MGM+MQ node and nfs10 is the FST node.
Each node can connect to the other node's port 1094, 1095 or 1097 when EOS is up:
[grid05 ~]$ nc -z nfs10.aligrid.hiroshima-u.ac.jp 1095; echo $?
0
[nfs10 ~]$ nc -z grid05.aligrid.hiroshima-u.ac.jp 1094; echo $?
0
[nfs10 ~]$ nc -z grid05.aligrid.hiroshima-u.ac.jp 1097; echo $?
0
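The same check can be repeated over several ports in one go. This is only a minimal sketch, assuming the same `nc` as above (with `-z` and `-w` support); `check_ports` is a hypothetical helper name, not an EOS tool. Note that it only tests whether the TCP port accepts a connection, not whether authentication succeeds.

```shell
# Hypothetical helper: report whether each listed TCP port on a host
# accepts a connection. Any nc failure (including a missing nc binary)
# is reported as "closed".
check_ports() {
  host="$1"; shift
  for port in "$@"; do
    if nc -z -w 3 "$host" "$port" 2>/dev/null; then
      echo "$host:$port open"
    else
      echo "$host:$port closed"
    fi
  done
}

# Example (run on nfs10):
#   check_ports grid05.aligrid.hiroshima-u.ac.jp 1094 1097
```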
The content of /etc/sysconfig/eos_dev is based on the example in the configuration page:
DAEMON_COREFILE_LIMIT=unlimited
XRD_ROLES="fst"
LD_PRELOAD=/usr/lib64/libjemalloc.so.1
EOS_BROKER_URL=root://grid05.aligrid.hiroshima-u.ac.jp:1097//eos/
# Not mentioned in https://eos-docs.web.cern.ch/quickstart/admin/configure.html#setup-fst
# At least EOS_MGM_ALIAS and EOS_GEOTAG seem mandatory
EOS_INSTANCE_NAME=eosalice
EOS_GEOTAG="::EOS"
EOS_MGM_ALIAS=grid05.aligrid.hiroshima-u.ac.jp
EOS_MAIL_CC="***@***" # a mailing list address, actually
EOS_NOTIFY="mail -s `date +%s`-`hostname`-eos-notify $EOS_MAIL_CC"
EOS_NS_ACCOUNTING=1
EOS_SYNCTIME_ACCOUNTING=1
EOS_USE_SHARED_MUTEX=1
#EOS_FST_NO_SSS_ENFORCEMENT=1
EOS_HTTP_THREADPOOL="epoll"
EOS_HTTP_THREADPOOL_SIZE=16
EOS_HTTP_CONNECTION_MEMORY_LIMIT=4194304
The content of /etc/xrd.cf.fst is unchanged from the original.
The content of /var/log/eos/xrdlog.fst:
210402 11:42:43 9531 Starting on Linux 3.10.0-1160.21.1.el7.x86_64
Copr. 2004-2012 Stanford University, xrd version v4.12.5
++++++ xrootd fst@nfs10.aligrid.hiroshima-u.ac.jp initialization started.
Config using configuration file /etc/xrd.cf.fst
=====> xrd.network keepalive
=====> xrd.port 1095
Config maximum number of connections restricted to 65000
Copr. 2012 Stanford University, xrootd protocol 4.0.0 version v4.12.5
++++++ xrootd protocol initialization started.
=====> xrootd.fslib -2 libXrdEosFst.so
=====> xrootd.async off nosf
=====> xrootd.redirect grid05.aligrid.hiroshima-u.ac.jp:1094 chksum
=====> xrootd.seclib libXrdSec.so
=====> all.export / nolock
Config exporting /
Plugin loaded
++++++ Authentication system initialization started.
Plugin loaded
=====> sec.protocol unix
Plugin loaded
=====> sec.protocol sss -c /etc/eos.keytab -s /etc/eos.keytab
=====> sec.protbind * only unix sss
Config 3 authentication directives processed in /etc/xrd.cf.fst
------ Authentication system initialization completed.
++++++ Protection system initialization started.
Config warning: Security level is set to none; request protection disabled!
Config Local protection level: none
Config Remote protection level: none
------ Protection system initialization completed.
Config Routing for nfs10.aligrid.hiroshima-u.ac.jp: local pub4 prv4
Config Route all4: nfs10.aligrid.hiroshima-u.ac.jp Dest=[::133.41.115.112]:1095
Plugin loaded
++++++ (c) 2010 CERN/IT-DSS FstOfs (Object Storage File System) 4.8.31
++++++ File system initialization started.
=====> ofs.persist off
=====> ofs.osslib libEosFstOss.so
=====> ofs.tpc pgm /usr/bin/xrdcp
Plugin No such file or directory loading osslib libEosFstOss-4.so
Config Falling back to using libEosFstOss.so
Plugin loaded
210402 11:42:44 time=1617331364.036562 func=Configure level=INFO logid=19e127f0-935d-11eb-8081-90e2ba9a1550 unit=fstoss@localhost tid=00007f78bde91780 source=XrdFstOss:170 tident= sec= uid=0 gid=0 name= geo="" preread depth=0, queue_size=0 and bytes=0
Config effective /etc/xrd.cf.fst ofs configuration:
all.role server
ofs.maxdelay 60
ofs.persist off hold 600
ofs.trace 0
ofs.osslib libEosFstOss.so
------ File system server initialization completed.
=====> fstofs enforces SSS authentication for XROOT clients
=====> fstofs.autoboot : true
=====> fstofs.broker : root://grid05.aligrid.hiroshima-u.ac.jp:1097//eos/nfs10.aligrid.hiroshima-u.ac.jp:1095/fst
=====> eoscp-log : /var/log/eos/fst/eoscp.log
=====> fstofs.defaultreceiverqueue : /eos/*/mgm
=====> fstofs.authdir : /var/eos/auth/
210402 11:42:44 time=1617331364.073819 func=Storage level=INFO logid=FstOfsStorage unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78bde91780 source=Storage:110 tident= sec= uid=0 gid=0 name= geo="" starting scrubbing thread
210402 11:42:44 time=1617331364.073946 func=Storage level=INFO logid=FstOfsStorage unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78bde91780 source=Storage:121 tident= sec= uid=0 gid=0 name= geo="" starting trim thread
210402 11:42:44 time=1617331364.073998 func=Storage level=INFO logid=FstOfsStorage unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78bde91780 source=Storage:131 tident= sec= uid=0 gid=0 name= geo="" starting deletion thread
210402 11:42:44 time=1617331364.074081 func=Storage level=INFO logid=FstOfsStorage unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78bde91780 source=Storage:141 tident= sec= uid=0 gid=0 name= geo="" starting report thread
210402 11:42:44 time=1617331364.074962 func=Scrub level=INFO logid=FstOfsStorage unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a3bff700 source=Scrub:39 tident= sec= uid=0 gid=0 name= geo="" msg="create scrubbing pattern ..."
210402 11:42:44 time=1617331364.075034 func=Storage level=INFO logid=FstOfsStorage unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78bde91780 source=Storage:151 tident= sec= uid=0 gid=0 name= geo="" starting error report thread
210402 11:42:44 time=1617331364.075916 func=getFstNodeConfigQueue level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a39fd700 source=Config:44 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="waiting for config queue in Remover ..."
210402 11:42:44 time=1617331364.076634 func=Scrub level=INFO logid=FstOfsStorage unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a3bff700 source=Scrub:48 tident= sec= uid=0 gid=0 name= geo="" msg="start scrubbing"
210402 11:42:44 time=1617331364.076681 func=Storage level=INFO logid=FstOfsStorage unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78bde91780 source=Storage:161 tident= sec= uid=0 gid=0 name= geo="" starting verification thread
210402 11:42:44 time=1617331364.077307 func=Storage level=INFO logid=FstOfsStorage unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78bde91780 source=Storage:171 tident= sec= uid=0 gid=0 name= geo="" starting filesystem communication thread
210402 11:42:44 time=1617331364.077858 func=Storage level=INFO logid=FstOfsStorage unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78bde91780 source=Storage:182 tident= sec= uid=0 gid=0 name= geo="" starting daemon supervisor thread
210402 11:42:44 time=1617331364.077925 func=Storage level=INFO logid=FstOfsStorage unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78bde91780 source=Storage:192 tident= sec= uid=0 gid=0 name= geo="" starting filesystem publishing thread
210402 11:42:44 time=1617331364.078032 func=Storage level=INFO logid=FstOfsStorage unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78bde91780 source=Storage:196 tident= sec= uid=0 gid=0 name= geo="" starting filesystem balancer thread
210402 11:42:44 time=1617331364.078674 func=Communicator level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a1fff700 source=Communicator:400 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="starting communicator thread"
210402 11:42:44 time=1617331364.079065 func=Supervisor level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a2ffc700 source=Supervisor:39 tident= sec=(null) uid=99 gid=99 name=- geo="" Supervisor activated ...
210402 11:42:44 time=1617331364.079228 func=Storage level=INFO logid=FstOfsStorage unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78bde91780 source=Storage:207 tident= sec= uid=0 gid=0 name= geo="" starting mgm synchronization thread
210402 11:42:44 time=1617331364.079286 func=StartNotifyCurrentThread level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a1fff700 source=XrdMqSharedObject:1883 tident= sec=(null) uid=99 gid=99 name=- geo="" Starting notification
210402 11:42:44 time=1617331364.079926 func=Publish level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a17fe700 source=Publish:416 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="publisher activated"
210402 11:42:44 time=1617331364.080715 func=Balancer level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a2efb700 source=Balancer:290 tident= sec=(null) uid=99 gid=99 name=- geo="" Start Balancer ...
210402 11:42:44 time=1617331364.080789 func=Storage level=INFO logid=FstOfsStorage unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78bde91780 source=Storage:218 tident= sec= uid=0 gid=0 name= geo="" starting /var/ partition monitor thread ...
210402 11:42:44 time=1617331364.080871 func=getFstNodeConfigQueue level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a2efb700 source=Config:44 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="waiting for config queue in Balancer ..."
210402 11:42:44 time=1617331364.081595 func=Storage level=INFO logid=FstOfsStorage unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78bde91780 source=Storage:228 tident= sec= uid=0 gid=0 name= geo="" enabling net/io load monitor
210402 11:42:44 time=1617331364.081684 func=Storage level=INFO logid=FstOfsStorage unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78bde91780 source=Storage:230 tident= sec= uid=0 gid=0 name= geo="" enabling local disk S.M.A.R.T attribute monitor
210402 11:42:44 time=1617331364.085755 func=Monitor level=INFO logid=19e83fc2-935d-11eb-9a60-90e2ba9a1550 unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a0ffd700 source=MonitorVarPartition:81 tident= sec= uid=0 gid=0 name= geo="" msg="fst partition monitor activated"
=====> fstofs.metalogdir : /var/eos/md/
210402 11:42:44 time=1617331364.087989 func=AddBroker level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78bde91780 source=XrdMqClient:173 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="add broker" url="root://grid05.aligrid.hiroshima-u.ac.jp:1097//eos/nfs10.aligrid.hiroshima-u.ac.jp:1095/fst?xmqclient.advisory.status=0&xmqclient.advisory.query=0&xmqclient.advisory.flushbacklog=0"
210402 11:42:44 time=1617331364.088738 func=SomListener level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f789e3ff700 source=XrdMqSharedObject:2029 tident= sec=(null) uid=99 gid=99 name=- geo="" mgm="starting SOM listener"
210402 11:42:44 time=1617331364.089671 func=Communicator level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a1fff700 source=Communicator:452 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="shared object notification" type=0 subject="/eos/nfs10.aligrid.hiroshima-u.ac.jp:1095/fst/gw/txqueue/txq"
210402 11:42:44 time=1617331364.107569 func=GetNetSpeed level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a17fe700 source=Publish:97 tident= sec=(null) uid=99 gid=99 name=- geo="" ethtool:networkspeed=10.00 GB/s
210402 11:42:44 time=1617331364.107819 func=Publish level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a17fe700 source=Publish:426 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="publish networkspeed=10.00 GB/s"
210402 11:42:44 time=1617331364.107845 func=getFstNodeConfigQueue level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a17fe700 source=Config:44 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="waiting for config queue in Publish ..."
210402 11:42:44 time=1617331364.111174 func=Subscribe level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78bde91780 source=XrdMqClient:621 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="failed to subscribe to broker" url="root://grid05.aligrid.hiroshima-u.ac.jp:1097//eos/nfs10.aligrid.hiroshima-u.ac.jp:1095/fst?xmqclient.advisory.status=0&xmqclient.advisory.query=0&xmqclient.advisory.flushbacklog=0"
###### mq messaging: starting thread
210402 11:42:44 time=1617331364.111370 func=RequestBroadcasts level=NOTE logid=19e0bce8-935d-11eb-8081-90e2ba9a1550 unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78bde91780 source=XrdFstOfs:1810 tident= sec= uid=0 gid=0 name= geo="" msg="requesting broadcasts"
210402 11:42:44 time=1617331364.111524 func=Communicator level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a1fff700 source=Communicator:452 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="shared object notification" type=0 subject="*/nfs10.aligrid.hiroshima-u.ac.jp:1095"
210402 11:42:44 time=1617331364.111581 func=Communicator level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a1fff700 source=Communicator:476 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="no action on subject creation" qpath="*/nfs10.aligrid.hiroshima-u.ac.jp:1095" own_id="/eos/nfs10.aligrid.hiroshima-u.ac.jp:1095/fst"
210402 11:42:44 time=1617331364.114830 func=Communicator level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a1fff700 source=Communicator:452 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="shared object notification" type=0 subject="*/nfs10.aligrid.hiroshima-u.ac.jp:1095/fst/gw/txqueue/txq"
210402 11:42:44 time=1617331364.114955 func=Communicator level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a1fff700 source=Communicator:452 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="shared object notification" type=0 subject="/eos/nfs10.aligrid.hiroshima-u.ac.jp:1095/fst/*"
210402 11:42:44 time=1617331364.121722 func=RefreshBrokersEndpoints level=ERROR logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f789d3ff700 source=XrdMqClient:495 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="failed to contact broker" url="root://grid05.aligrid.hiroshima-u.ac.jp:1097//eos/nfs10.aligrid.hiroshima-u.ac.jp:1095/fst_mq_test?xmqclient.advisory.flushbacklog=0&xmqclient.advisory.query=0&xmqclient.advisory.status=0"
210402 11:42:44 time=1617331364.124077 func=Configure level=NOTE logid=19e0bce8-935d-11eb-8081-90e2ba9a1550 unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78bde91780 source=XrdFstOfs:870 tident= sec= uid=0 gid=0 name= geo="" FST_HOST=nfs10.aligrid.hiroshima-u.ac.jp FST_PORT=1095 FST_HTTP_PORT=8001 VERSION=4.8.31 RELEASE=1 KEYTABADLER=ba732327
Config warning: asynchronous I/O has been disabled!
Config warning: sendfile I/O has been disabled!
Config warning: 'xrootd.prepare logdir' not specified; prepare tracking disabled.
------ xrootd protocol initialization completed.
------ xrootd fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 initialization completed.
210402 11:42:45 time=1617331365.088079 func=SendMessage level=ERROR logid=19e0140a-935d-11eb-8081-90e2ba9a1550 unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a33ff700 source=XrdMqClient:269 tident= sec= uid=0 gid=0 name= geo="" msg="failed to send message" dst="root://grid05.aligrid.hiroshima-u.ac.jp:1097//eos/nfs10.aligrid.hiroshima-u.ac.jp:1095/fst?xmqclient.advisory.status=0&xmqclient.advisory.query=0&xmqclient.advisory.flushbacklog=0" msg="/eos/*/errorreport?xrdmqmessage.header=1a8032a0-935d-11eb-a535-90e2ba9a1550^^/eos/nfs10.aligrid.hiroshima-u.ac.jp:1095/fst^^^/eos/*/errorreport^errorreport^1617331365^78302000^0^0^0^0^^^^0^0^&xrdmqmessage.body=210402 11:42:44 time=1617331364.121722 func=RefreshBrokersEndpoints level=ERROR logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f789d3ff700 source=XrdMqClient:495 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="failed to contact broker" url="root://grid05.aligrid.hiroshima-u.ac.jp:1097//eos/nfs10.aligrid.hiroshima-u.ac.jp:1095/fst_mq_test?xmqclient.advisory.flushbacklog=0#and#xmqclient.advisory.query=0#and#xmqclient.advisory.status=0"&xrdmqmessage.mon=1"
210402 11:42:45 9576 FstOfs_SendMessage: Unable to ; success
210402 11:42:45 time=1617331365.098053 func=RefreshBrokersEndpoints level=ERROR logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a33ff700 source=XrdMqClient:495 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="failed to contact broker" url="root://grid05.aligrid.hiroshima-u.ac.jp:1097//eos/nfs10.aligrid.hiroshima-u.ac.jp:1095/fst_mq_test?xmqclient.advisory.flushbacklog=0&xmqclient.advisory.query=0&xmqclient.advisory.status=0"
210402 11:42:45 time=1617331365.098119 func=ErrorReport level=ERROR logid=FstOfsStorage unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a33ff700 source=ErrorReport:91 tident= sec= uid=0 gid=0 name= geo="" msg="cannot send errorreport broadcast"
210402 11:42:45 time=1617331365.124989 func=Run level=NOTE logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f789a3ff700 source=HttpServer:113 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="starting http server" mode="epoll" threads=16
210402 11:42:45 time=1617331365.126792 func=Run level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f789a3ff700 source=HttpServer:157 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="start of micro httpd succeeded [port=8001]"
210402 11:42:46 time=1617331366.077562 func=getFstNodeConfigQueue level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a39fd700 source=Config:44 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="waiting for config queue in Remover ..."
210402 11:42:46 time=1617331366.081858 func=getFstNodeConfigQueue level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a2efb700 source=Config:44 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="waiting for config queue in Balancer ..."
210402 11:42:46 time=1617331366.107979 func=getFstNodeConfigQueue level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a17fe700 source=Config:44 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="waiting for config queue in Publish ..."
210402 11:42:46 time=1617331366.131734 func=RefreshBrokersEndpoints level=ERROR logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f789d3ff700 source=XrdMqClient:495 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="failed to contact broker" url="root://grid05.aligrid.hiroshima-u.ac.jp:1097//eos/nfs10.aligrid.hiroshima-u.ac.jp:1095/fst_mq_test?xmqclient.advisory.flushbacklog=0&xmqclient.advisory.query=0&xmqclient.advisory.status=0"
210402 11:42:48 time=1617331368.077740 func=getFstNodeConfigQueue level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a39fd700 source=Config:44 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="waiting for config queue in Remover ..."
210402 11:42:48 time=1617331368.082040 func=getFstNodeConfigQueue level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a2efb700 source=Config:44 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="waiting for config queue in Balancer ..."
210402 11:42:48 time=1617331368.108143 func=getFstNodeConfigQueue level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a17fe700 source=Config:44 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="waiting for config queue in Publish ..."
210402 11:42:48 time=1617331368.141637 func=RefreshBrokersEndpoints level=ERROR logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f789d3ff700 source=XrdMqClient:495 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="failed to contact broker" url="root://grid05.aligrid.hiroshima-u.ac.jp:1097//eos/nfs10.aligrid.hiroshima-u.ac.jp:1095/fst_mq_test?xmqclient.advisory.flushbacklog=0&xmqclient.advisory.query=0&xmqclient.advisory.status=0"
210402 11:42:49 time=1617331369.082583 func=MgmSyncer level=INFO logid=FstOfsStorage unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a2dfa700 source=MgmSyncer:63 tident= sec= uid=0 gid=0 name= geo="" msg="waiting to know manager"
210402 11:42:50 time=1617331370.077898 func=getFstNodeConfigQueue level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a39fd700 source=Config:44 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="waiting for config queue in Remover ..."
210402 11:42:50 time=1617331370.082205 func=getFstNodeConfigQueue level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a2efb700 source=Config:44 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="waiting for config queue in Balancer ..."
210402 11:42:50 time=1617331370.108315 func=getFstNodeConfigQueue level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a17fe700 source=Config:44 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="waiting for config queue in Publish ..."
210402 11:42:50 time=1617331370.151491 func=RefreshBrokersEndpoints level=ERROR logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f789d3ff700 source=XrdMqClient:495 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="failed to contact broker" url="root://grid05.aligrid.hiroshima-u.ac.jp:1097//eos/nfs10.aligrid.hiroshima-u.ac.jp:1095/fst_mq_test?xmqclient.advisory.flushbacklog=0&xmqclient.advisory.query=0&xmqclient.advisory.status=0"
210402 11:42:52 time=1617331372.078055 func=getFstNodeConfigQueue level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a39fd700 source=Config:44 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="waiting for config queue in Remover ..."
210402 11:42:52 time=1617331372.082374 func=getFstNodeConfigQueue level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a2efb700 source=Config:44 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="waiting for config queue in Balancer ..."
210402 11:42:52 time=1617331372.108487 func=getFstNodeConfigQueue level=INFO logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f78a17fe700 source=Config:44 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="waiting for config queue in Publish ..."
210402 11:42:52 time=1617331372.161358 func=RefreshBrokersEndpoints level=ERROR logid=static.............................. unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f789d3ff700 source=XrdMqClient:495 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="failed to contact broker" url="root://grid05.aligrid.hiroshima-u.ac.jp:1097//eos/nfs10.aligrid.hiroshima-u.ac.jp:1095/fst_mq_test?xmqclient.advisory.flushbacklog=0&xmqclient.advisory.query=0&xmqclient.advisory.status=0"
210402 11:42:53 time=1617331373.266713 func=RequestBroadcasts level=NOTE logid=19e0bce8-935d-11eb-8081-90e2ba9a1550 unit=fst@nfs10.aligrid.hiroshima-u.ac.jp:1095 tid=00007f789d3ff700 source=XrdFstOfs:1810 tident= sec= uid=0 gid=0 name= geo="" msg="requesting broadcasts"
@@@@@@ 00:00:00 op=shutdown msg="shutdown timedout after 0 seconds, signal=15
@@@@@@ 00:00:00 op=shutdown status=forced-complete
On the MGM+MQ node, there are plenty of logs in /var/log/eos/mgm, so I am not sure which ones are relevant, but I can of course provide them if requested.
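One way to narrow a large log down is to count its ERROR-level lines per function and source location. The following is a rough sketch, assuming the `key=value` layout of the FST log shown above; `summarise_errors` is a hypothetical helper name, and whether the MGM logs follow exactly the same layout is an assumption.

```shell
# Hypothetical helper: group ERROR-level lines of an EOS-style log by the
# first func= and first source= field on each line, most frequent first.
# Only the first occurrence per line is used, because some lines embed a
# second log record inside their message body.
summarise_errors() {
  awk '/ level=ERROR / {
         f = ""; s = ""
         for (i = 1; i <= NF; i++) {
           if (f == "" && $i ~ /^func=/)   f = substr($i, 6)
           if (s == "" && $i ~ /^source=/) s = substr($i, 8)
         }
         count[f " " s]++
       }
       END { for (k in count) print count[k], k }' "$1" | sort -rn
}

# Example:
#   summarise_errors /var/log/eos/xrdlog.fst
```

Each output line has the form "count func source", which makes it easier to see which error dominates before reading individual messages.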
Where should I look in this situation? Any suggestion will be highly appreciated.
Best,
Masanori