Dear Experts!
Sorry to bother you with what may be a newbie question, but I couldn’t find a solution on my own:
I’m running EOS 5.2.4 on my testbed cluster.
I have run into a problem: reading or writing a file on the EOS instance fails with an error.
[telecast@vm4-ui1 ~]$ gfal-cat https://vm-eos.jinr.ru:8443/eos/user/t/telecast/file
gfal-cat error: 112 (Host is down) - Result Could not connect to server after 1 attempts
[telecast@vm4-ui1 ~]$ gfal-copy file https://vm-eos.jinr.ru:8443/eos/user/t/telecast/file_2
Copying file:///home/telecast/file [FAILED] after 0s
gfal-copy error: 6 (No such device or address) - TRANSFER ERROR: Copy failed (streamed). Last attempt: Could not connect to server (destination)
[telecast@vm4-ui1 ~]$ XrdSecPROTOCOL=krb5 /opt/eos/xrootd/bin/xrdcp file root://vm-eos.jinr.ru//eos/user/t/telecast/file_222
[0B/0B][100%][==================================================][0B/s]
Run: [ERROR] Server responded with an error: [3018] Unable to create file - (O_EXCL) /eos/user/t/telecast/file_222; File exists (destination)
[telecast@vm4-ui1 ~]$ curl -k -L -X PUT -H "Authorization: Bearer $(cat /tmp/bt_u333)" --upload-file file https://vm-eos.jinr.ru:8443/eos/user/t/telecast/file_111
curl: (7) NSS: client certificate not found (nickname not specified)
EOS Console [root://localhost] |/eos/user/t/telecast/> ls -la
drwxrwxr-+ 1 telecast telecast 48 Feb 2 13:22 .
drwxrwxr-x 1 root root 48 Aug 30 17:47 ..
-rw-rw-r-- 2 telecast telecast 24 Feb 2 12:37 file
-rw-r--r-- 2 telecast telecast 24 Feb 2 13:22 file2
-rw-r--r-- 0 telecast telecast 0 Feb 2 13:14 file_111
-rw-r--r-- 0 telecast telecast 0 Feb 2 13:16 file_222
-rw-r--r-- 0 telecast telecast 0 Feb 2 13:16 file_2222
However, I can delete files and list them:
[telecast@vm4-ui1 ~]$ XrdSecPROTOCOL=krb5 /opt/eos/xrootd/bin/xrdfs root://vm-eos.jinr.ru ls /eos/user/t/telecast/
/eos/user/t/telecast/file
/eos/user/t/telecast/file_2
[telecast@vm4-ui1 ~]$ gfal-rm https://vm-eos.jinr.ru:8443/eos/user/t/telecast/file_2
https://vm-eos.jinr.ru:8443/eos/user/t/telecast/file_2 DELETED
[telecast@vm4-ui1 ~]$ gfal-ls https://vm-eos.jinr.ru:8443/eos/user/t/telecast/
file
[telecast@vm4-ui1 ~]$ XrdSecPROTOCOL=krb5 /opt/eos/xrootd/bin/xrdfs root://vm-eos.jinr.ru:1094 prepare -s /eos/user/t/telecast/file
04469f5ddddc:af2db81c.65bba2cd:10:1706800878
[telecast@vm4-ui1 ~]$ gfal-ls https://vm-eos.jinr.ru:8443/eos/user/t/telecast/
file
[telecast@vm4-ui1 ~]$ XrdSecPROTOCOL=krb5 /opt/eos/xrootd/bin/xrdfs root://vm-eos.jinr.ru:1094 ls /eos/user/t/telecast/
/eos/user/t/telecast/file
Also, every operation (write/read/delete) succeeds when I access the directory through a FUSE mount.
All filesystems in the cluster are in RW mode and report no errors.
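Since listing and deleting only involve the MGM, while reads and writes are redirected to an FST, one check that might narrow this down is whether the client host can reach the FST ports directly. The sketch below is only an illustration: the function names `port_open` and `check_fst_ports` are hypothetical, the ports 1095 (`xrd.port`) and 8444 (`XrdHttp`) are taken from the FST config in this post, and the host names in the usage comment are assumed to be the FST nodes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a TCP connection to host:port can be opened.
# Relies on bash's /dev/tcp pseudo-device and coreutils' timeout.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# For each host given, probe the two FST ports from the config in this post:
# 1095 (xrd.port) and 8444 (XrdHttp).
check_fst_ports() {
    for host in "$@"; do
        for port in 1095 8444; do
            if port_open "$host" "$port"; then
                echo "$host:$port open"
            else
                echo "$host:$port unreachable"
            fi
        done
    done
}

# Usage from the client (vm4-ui1); host names are assumptions based on the
# tident entries in the vid list:
#   check_fst_ports vm1-eos-mgm1.jinr.ru vm2-eos-mgm2.jinr.ru vm3-eos-fst0.jinr.ru
```

An "unreachable" result would point towards a firewall or routing issue between the client and the FSTs rather than towards the MGM configuration, although it would not prove it.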
Some info about my environment:
My cluster consists of 3 VMs:
VM1:mgm,sync,qdb,mq,fst
VM2:mgm,sync,qdb,mq,fst
VM3:fst,qdb
MGM config
cat /etc/xrd.cf.mgm | grep -Ev "^#|^[[:space:]]$"
XrdSecDEBUG=6
xrootd.fslib libXrdEosMgm.so
xrootd.seclib libXrdSec.so
xrootd.async off nosf
xrootd.chksum adler32
xrd.sched mint 16 maxt 256 idle 300
all.export / nolock
all.role manager
oss.fdlimit 16384 32768
sec.protocol unix
sec.protocol sss -c /etc/eos.client.keytab -s /etc/eos.keytab
sec.protocol krb5 /etc/krb5.keytab host/vm1-eos-mgm1.jinr.ru@JINR.RU
sec.protbind localhost.localdomain unix sss
sec.protbind localhost unix sss
sec.protbind * only krb5 sss unix
mgmofs.fs /
mgmofs.targetport 1095
mgmofs.trace all debug
mgmofs.broker root://vm-eos.jinr.ru:1097//eos/
mgmofs.instance eosdev
mgmofs.metalog /var/eos/md
mgmofs.txdir /var/eos/tx
mgmofs.authdir /var/eos/auth
mgmofs.archivedir /var/eos/archive
mgmofs.qosdir /var/eos/qos
mgmofs.reportstorepath /var/eos/report
mgmofs.autoloadconfig default
mgmofs.qoscfg /var/eos/qos/qos.conf
mgmofs.cfgtype quarkdb
mgmofs.alias vm-eos.jinr.ru
mgmofs.fstgw vm1-eos-mgm1.jinr.ru:3001
mgmofs.nslib /usr/lib64/libEosNsQuarkdb.so
mgmofs.qdbcluster vm1-eos-db1.jinr.ru:7777 vm2-eos-db2.jinr.ru:7777 vm3-eos-db3.jinr.ru:7777
mgmofs.qdbpassword_file /etc/xrootd/eos.keytab
mgmofs.centraldrain true
xrd.protocol XrdHttp:8443 /usr/lib64/libXrdHttp.so
http.cadir /etc/grid-security/certificates/
http.cert /etc/grid-security/daemon/hostcert.pem
http.key /etc/grid-security/daemon/hostkey.pem
http.gridmap /etc/grid-security/grid-mapfile
http.secxtractor libXrdVoms.so
http.exthandler xrdtpc /usr/lib64/libXrdHttpTPC.so
http.exthandler EosMgmHttp /usr/lib64/libEosMgmHttp.so eos::mgm::http::redirect-to-https=0
mgmofs.macaroonslib /usr/lib64/libXrdMacaroons.so /usr/lib64/libXrdAccSciTokens.so
scitokens.trace all
macaroons.secretkey /etc/eos.macaroon.secret
all.sitename vm-eos
ofs.tpc redirect delegated vm-eos.jinr.ru:1094
FST config
[root@vm1-eos-mgm1 mgm]# cat /etc/xrd.cf.fst | grep -Ev "^#|^[[:space:]]$"
set MGM=$EOS_MGM_ALIAS
xrootd.fslib -2 libXrdEosFst.so
xrootd.async off nosf
xrd.network keepalive
xrootd.redirect $(MGM):1094 chksum
xrootd.seclib libXrdSec.so
sec.protocol unix
sec.protocol sss -c /etc/eos.keytab -s /etc/eos.keytab
sec.protbind * only unix sss
all.export / nolock
all.trace none
all.manager localhost 2131
xrd.port 1095
ofs.persist off
ofs.osslib libEosFstOss.so
ofs.tpc pgm /opt/eos/xrootd/bin/xrdcp
fstofs.broker root://localhost:1097//eos/
fstofs.autoboot true
fstofs.quotainterval 10
fstofs.metalog /var/eos/md/
fstofs.qdbcluster vm1-eos-db1.jinr.ru:7777 vm2-eos-db2.jinr.ru:7777 vm3-eos-db3.jinr.ru:7777
fstofs.qdbpassword_file /etc/eos.keytab-qdb
xrd.tls /etc/grid-security/daemon/hostcert.pem /etc/grid-security/daemon/hostkey.pem
xrd.tlsca certdir /etc/grid-security/certificates/
xrd.protocol XrdHttp:8444 libXrdHttp.so
http.exthandler EosFstHttp /usr/lib64/libEosFstHttp.so none
http.exthandler xrdtpc libXrdHttpTPC.so
http.trace all
eos vid ls
geotag:"default" => "RU::JINR::LITVM"
gsi:"":gid => root
gsi:"":uid => root
hostmatch:"protocol=* pattern=t2*.jinr.ru
https:"":gid => root
https:"":uid => root
krb5:"":gid => root
krb5:"":uid => root
krb5:"eexprt":gid => root
krb5:"eexprt":uid => root
publicaccesslevel: => 1024
sss:"":gid => root
sss:"":uid => root
sss:"daemon":gid => root
sss:"daemon":uid => root
sss:"eos_exporter":gid => root
sss:"eos_exporter":uid => root
sudoer => uids(daemon,eexprt)
tident:"*@t2*.jinr.ru":gid => root
tident:"*@t2*.jinr.ru":uid => root
tident:"*@vm1-eos-mgm1.jinr.ru":gid => root
tident:"*@vm1-eos-mgm1.jinr.ru":uid => root
tident:"*@vm2-eos-mgm2.jinr.ru":gid => root
tident:"*@vm2-eos-mgm2.jinr.ru":uid => root
tident:"*@vm223-1.jinr.ru":gid => root
tident:"*@vm223-1.jinr.ru":uid => root
tident:"*@vm3-eos-fst0.jinr.ru":gid => root
tident:"*@vm3-eos-fst0.jinr.ru":uid => root
tident:"*@vm4-ui1.jinr.ru":gid => root
tident:"*@vm4-ui1.jinr.ru":uid => root
tokensudo => always
Attributes
[root@vm1-eos-mgm1 mgm]# eos acl -l /eos/user/t/telecast
g:telecast:rwcmxq
[root@vm1-eos-mgm1 mgm]# eos attr ls /eos/user/t/telecast
sys.acl="g:1002:rwcmxq"
sys.eos.btime="1681994080.542939441"
sys.forced.blocksize="4k"
sys.forced.checksum="adler"
sys.forced.layout="replica"
sys.forced.nstripes="2"
sys.forced.space="default"
sys.mask="770"
sys.owner.auth="*"
user.acl=""
Software
[root@vm1-eos-mgm1 mgm]# rpm -qa | grep eos
eos-client-5.2.4-1.el7.cern.x86_64
eos-grpc-gateway-0.1-1.el7.x86_64
eos-ns-inspect-5.2.4-1.el7.cern.x86_64
eos-quarkdb-5.2.4-1.el7.cern.x86_64
eos-libmicrohttpd-0.9.38-eos.el7.cern.x86_64
eos-xrootd-5.6.4-1.el7.cern.x86_64
eos-server-5.2.4-1.el7.cern.x86_64
eos-folly-deps-2019.11.11.00-1.el7.cern.x86_64
eos-folly-2019.11.11.00-1.el7.cern.x86_64
eos-grpc-1.56.1-3.el7.x86_64
eos-grpc-devel-1.56.1-3.el7.x86_64
[root@vm1-eos-mgm1 mgm]# rpm -qa | grep xrootd
xrootd-scitokens-5.6.4-1.el7.x86_64
xrootd-client-libs-5.6.4-1.el7.x86_64
xrootd-voms-5.6.4-1.el7.x86_64
xrootd-libs-5.6.4-1.el7.x86_64
eos-xrootd-5.6.4-1.el7.cern.x86_64
xrootd-server-5.6.4-1.el7.x86_64
xrootd-server-libs-5.6.4-1.el7.x86_64
[root@vm1-eos-mgm1 mgm]# cat /etc/*release
NAME="Scientific Linux"
VERSION="7.9 (Nitrogen)"
[root@vm1-eos-mgm1 mgm]# uname -a
Linux vm1-eos-mgm1.jinr.ru 3.10.0-1160.108.1.el7.x86_64 #1 SMP Wed Jan 24 08:37:16 CST 2024 x86_64 x86_64 x86_64 GNU/Linux
I see errors in the logs, but I don’t understand how they relate to my issue:
240202 13:43:20 time=1706870600.357315 func=Emsg level=ERROR logid=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx unit=mgm@vm1-eos-mgm1.jinr.ru:1094 tid=00007fde3d93f700 source=XrdMgmOfs:856 tident=<single-exec> sec= uid=0 gid=0 name= geo="" Unable to set attribute /eos/dev/proc/recycle/uid:1002/2024/02/02/0/#:#eos#:#user#:#t#:#telecast#:#file_N.00000000001029db; Operation not permitted
240202 13:43:20 time=1706870600.357352 func=_rem level=ERROR logid=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx unit=mgm@vm1-eos-mgm1.jinr.ru:1094 tid=00007fde3d93f700 source=Rm:415 tident=<single-exec> sec=https uid=1002 gid=1002 name=telecast geo="RU::JINR::LITVM" msg="failed to set attribute on recycle path" path=/eos/dev/proc/recycle/uid:1002/2024/02/02/0/#:#eos#:#user#:#t#:#telecast#:#file_N.00000000001029db
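To make long lines like these easier to compare, a small filter can pull out just a few of the `key=value` fields. This is only a sketch: the function name `eos_log_fields` is hypothetical, and the set of fields it keeps (`func`, `level`, `sec`, `uid`, `name`) is simply my choice based on the two lines above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read EOS log lines on stdin and print only the
# func=, level=, sec=, uid= and name= fields of each line, in original order.
eos_log_fields() {
    awk '{
        out = ""
        for (i = 1; i <= NF; i++) {
            # Anchor at the start of the field so "logid=" or "unit=" do not match.
            if ($i ~ /^(func|level|sec|uid|name)=/) {
                out = (out == "" ? $i : out " " $i)
            }
        }
        print out
    }'
}

# Usage (the log path is an assumption; adjust to where xrdlog.mgm lives):
#   grep 'level=ERROR' /var/log/eos/mgm/xrdlog.mgm | eos_log_fields
```

For the two lines above, this reduces them to the function, level, and identity, which shows that the second error was logged for `sec=https uid=1002 name=telecast` in `func=_rem`, so it appears to come from a delete rather than from a read or write.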
full logs (xrdlog.mgm) of failed operation.
Any help would be greatly appreciated.
Thanks in advance.