Kolkata::EOS: XRootD transfer timed out and was killed after 300s

Dear All,

We recently upgraded EOS for the ApMon instance ALICE::Kolkata::EOS from Aquamarine to Citrine.

According to the [MonALISA page of SE and EOS](http://alimonitor.cern.ch/stats?page=SE/table), the test reports:
++++++
The following command has timed out and was killed after 300s: [/home/monalisa/xrootd/bin/xrdcp, --tpc, only, --force, --path, --posc, --nopbar, -ODeos.app=transfer-3rd, root:// eosalice.cern.ch:1094 //04/36435/05422528-a162-11e8-b5f9-a310691b2def?authz=-----BEGIN SEALED CIPHER-----
N2AYET+3iqIECPdLLxgdLPnrjsOT4jUVtKqqaHmCjc-TEic8eLB5b1nRkMam6+yMSAOpSZ-wmOLL
3B7gY5qTy7dV1x9xMbLsAVa+1dEwWcvHxH1Kt3R-hIoMIm2KmXKYoXU4BmHaxN-mA0kTnPx4P5AV
KGAW8LxQrDu0LAAo6sY=


++++++
To check xrdcp, we copied a file from EOS to the local machine (/dev/null) with xrdcp, but the copy hangs for a long time and eventually times out. The file itself is present in Kolkata EOS, as the `eos ls` and `eos fileinfo` output further down shows.


[root@eos ~]# xrdcp -v -d2 -f root://eos.tier2-kol.res.in:1094//01/65264/b793d378-d17b-11e7-8ef0-fb9b42643b0c /dev/null
[2019-01-30 20:54:07.426166 +0530][Debug ][Utility ] CopyProcess: 2 jobs to prepare
[2019-01-30 20:54:07.426494 +0530][Debug ][Poller ] Available pollers: built-in
[2019-01-30 20:54:07.426519 +0530][Debug ][Poller ] Attempting to create a poller according to preference: built-in
[2019-01-30 20:54:07.426535 +0530][Debug ][Poller ] Creating poller: built-in
[2019-01-30 20:54:07.426558 +0530][Debug ][Poller ] Creating and starting the built-in poller...
[2019-01-30 20:54:07.426778 +0530][Debug ][Poller ] Using 1 poller threads
[2019-01-30 20:54:07.426806 +0530][Debug ][TaskMgr ] Starting the task manager...
[2019-01-30 20:54:07.426873 +0530][Debug ][TaskMgr ] Task manager started
[2019-01-30 20:54:07.426894 +0530][Debug ][JobMgr ] Starting the job manager...
[2019-01-30 20:54:07.427007 +0530][Debug ][JobMgr ] Job manager started, 3 workers
[2019-01-30 20:54:07.427035 +0530][Debug ][TaskMgr ] Registering task: "FileTimer task" to be run at: [2019-01-30 20:54:07 +0530]
[2019-01-30 20:54:07.427066 +0530][Debug ][File ] 5 0 0 1548849935
[2019-01-30 20:54:07.427170 +0530][Debug ][Utility ] Creating a classic copy job, from root://eos.tier2-kol.res.in:1094//01/65264/b793d378-d17b-11e7-8ef0-fb9b42643b0c to file://localhost/dev/null
[2019-01-30 20:54:07.427240 +0530][Debug ][Utility ] Monitor library name not set. No monitoring
[2019-01-30 20:54:07.427388 +0530][Debug ][Utility ] Opening root://eos.tier2-kol.res.in:1094//01/65264/b793d378-d17b-11e7-8ef0-fb9b42643b0c for reading
[2019-01-30 20:54:07.427470 +0530][Debug ][File ] [0x23409b0@root://eos.tier2-kol.res.in:1094//01/65264/b793d378-d17b-11e7-8ef0-fb9b42643b0c] Sending an open command
[2019-01-30 20:54:07.427564 +0530][Debug ][PostMaster ] Creating new channel to: eos.tier2-kol.res.in:1094 1 stream(s)
[2019-01-30 20:54:07.427619 +0530][Debug ][PostMaster ] [eos.tier2-kol.res.in:1094 #0] Stream parameters: Network Stack: IPAuto, Connection Window: 120, ConnectionRetry: 5, Stream Error Window: 1800
[2019-01-30 20:54:07.427673 +0530][Debug ][TaskMgr ] Registering task: "TickGeneratorTask for: eos.tier2-kol.res.in:1094" to be run at: [2019-01-30 20:54:22 +0530]
[2019-01-30 20:54:07.427854 +0530][Debug ][PostMaster ] [eos.tier2-kol.res.in:1094] Found 1 address(es): [::ffff:144.16.112.17]:1094
[2019-01-30 20:54:07.427911 +0530][Debug ][AsyncSock ] [eos.tier2-kol.res.in:1094 #0.0] Attempting connection to [::ffff:144.16.112.17]:1094
[2019-01-30 20:54:07.428011 +0530][Debug ][Poller ] Adding socket 0x2346390 to the poller
[2019-01-30 20:54:07.428155 +0530][Debug ][AsyncSock ] [eos.tier2-kol.res.in:1094 #0.0] Async connection call returned
[2019-01-30 20:54:07.428241 +0530][Debug ][XRootDTransport ] [eos.tier2-kol.res.in:1094 #0.0] Sending out the initial hand shake + kXR_protocol
[2019-01-30 20:54:07.428450 +0530][Debug ][XRootDTransport ] [eos.tier2-kol.res.in:1094 #0.0] Got the server hand shake response (type: manager [], protocol version 310)
[2019-01-30 20:54:07.428508 +0530][Debug ][XRootDTransport ] [eos.tier2-kol.res.in:1094 #0.0] kXR_protocol successful (type: manager [], protocol version 310)
[2019-01-30 20:54:07.428625 +0530][Debug ][XRootDTransport ] [eos.tier2-kol.res.in:1094 #0.0] Sending out kXR_login request, username: root, cgi: ?xrd.cc=in&xrd.tz=6&xrd.appname=xrdcp&xrd.info=&xrd.hostname=eos.tier2-kol.res.in&xrd.rn=v4.8.5, dual-stack: false, private IPv4: false, private IPv6: false
[2019-01-30 20:54:07.428788 +0530][Debug ][XRootDTransport ] [eos.tier2-kol.res.in:1094 #0.0] Logged in, session: 7e06000015560000980000007e060000
[2019-01-30 20:54:07.428812 +0530][Debug ][XRootDTransport ] [eos.tier2-kol.res.in:1094 #0.0] Authentication is required: &P=sss,0.13:/etc/eos.keytab&P=unix
[2019-01-30 20:54:07.428827 +0530][Debug ][XRootDTransport ] [eos.tier2-kol.res.in:1094 #0.0] Sending authentication data
[2019-01-30 20:54:07.430306 +0530][Debug ][XRootDTransport ] [eos.tier2-kol.res.in:1094 #0.0] Trying to authenticate using sss
[2019-01-30 20:54:07.430682 +0530][Debug ][XRootDTransport ] [eos.tier2-kol.res.in:1094 #0.0] Authenticated with sss.
[2019-01-30 20:54:07.430719 +0530][Debug ][PostMaster ] [eos.tier2-kol.res.in:1094 #0] Stream 0 connected.
[2019-01-30 20:54:07.437806 +0530][Debug ][TaskMgr ] Registering task: "WaitTask for: 0x0x2340b20" to be run at: [2019-01-30 20:55:47 +0530]
[2019-01-30 20:55:47.437124 +0530][Debug ][TaskMgr ] Done with task: "WaitTask for: 0x0x2340b20"
[2019-01-30 20:55:47.437816 +0530][Debug ][TaskMgr ] Registering task: "WaitTask for: 0x0x2340b20" to be run at: [2019-01-30 20:57:27 +0530]
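
To repeat this read test without waiting on a hung copy, the command can be wrapped in a deadline. The following is only a minimal sketch, assuming GNU coreutils `timeout` is installed; the helper name `run_with_deadline` and the 60-second limit in the usage comment are our own choices for illustration, not part of xrdcp or EOS.

```shell
# run_with_deadline SECONDS CMD [ARGS...]
# Runs CMD, kills it if it is still running after SECONDS, and prints
# a one-line verdict. Relies on GNU coreutils `timeout`, which exits
# with status 124 when the deadline is reached.
run_with_deadline() {
    limit="$1"
    shift
    rc=0
    timeout "$limit" "$@" || rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "TIMEOUT after ${limit}s: $*"
    elif [ "$rc" -ne 0 ]; then
        echo "FAILED (rc=$rc): $*"
    else
        echo "OK: $*"
    fi
    return "$rc"
}

# Usage against the file from the transcript above (not run here):
#   run_with_deadline 60 xrdcp -f \
#     root://eos.tier2-kol.res.in:1094//01/65264/b793d378-d17b-11e7-8ef0-fb9b42643b0c /dev/null
```

A `TIMEOUT` verdict only confirms the hang seen in the debug log; it does not say where the open request is stuck.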

[root@eos ~]# eos ls -al /eos/kolkataalice/grid/01/65264/b793d378-d17b-11e7-8ef0-fb9b42643b0c
-rw-rw-r-- 6 10367 1395 155204001 Nov 25 2017 b793d378-d17b-11e7-8ef0-fb9b42643b0c
[root@eos ~]#
[root@eos ~]# eos fileinfo /eos/kolkataalice/grid/01/65264/b793d378-d17b-11e7-8ef0-fb9b42643b0c
File: '/eos/kolkataalice/grid/01/65264/b793d378-d17b-11e7-8ef0-fb9b42643b0c' Flags: 0664
Size: 155204001
Modify: Sat Nov 25 06:28:16 2017 Timestamp: 1511571496.852547000
Change: Sat Nov 25 06:28:15 2017 Timestamp: 1511571495.904413952
CUid: 10367 CGid: 1395 Fxid: 0026ecdc Fid: 2551004 Pid: 4087 Pxid: 00000ff7
XStype: adler XS: 90 3e 1c 1c ETAGs: "684779921997824:903e1c1c"
raid6 Stripes: 6 Blocksize: 1M LayoutId: 20640542
#Rep: 6
┌───┬──────┬────────────────────────┬────────────────┬────────────────┬──────────┬──────────────┬────────────┬────────┬────────────────────────┐
│no.│ fs-id│                    host│      schedgroup│            path│      boot│  configstatus│ drainstatus│  active│                  geotag│
└───┴──────┴────────────────────────┴────────────────┴────────────────┴──────────┴──────────────┴────────────┴────────┴────────────────────────┘
   0      6   eos01.tier2-kol.res.in        default.2          /edata3     booted             rw      nodrain   online              Kolkata-EOS
   1     18   eos02.tier2-kol.res.in        default.2          /edata3     booted             rw      nodrain   online              Kolkata-EOS
   2     30   eos03.tier2-kol.res.in        default.2          /edata3     booted             rw      nodrain   online              Kolkata-EOS
   3      5   eos01.tier2-kol.res.in        default.2          /edata2     booted             rw      nodrain   online              Kolkata-EOS
   4     17   eos02.tier2-kol.res.in        default.2          /edata2     booted             rw      nodrain   online              Kolkata-EOS
   5     29   eos03.tier2-kol.res.in        default.2          /edata2     booted             rw      nodrain   online              Kolkata-EOS


[root@eos ~]#

========
We checked the permissions of eos.keytab and the host certificate on the MGM; they look fine. We also checked xrd.cf.mgm and believe it is correct. The output is below:

[root@eos ~]# ls -l /etc/eos.keytab
-r-------- 1 daemon daemon 135 Jan 30 18:25 /etc/eos.keytab
[root@eos ~]# ls -l /etc/grid-security/daemon/host
hostcert.pem hostkey.pem
[root@eos ~]# ls -l /etc/grid-security/daemon/host*
-rw------- 1 daemon daemon 1432 Jan 29 10:55 /etc/grid-security/daemon/hostcert.pem
-r-------- 1 daemon daemon 1708 Jan 29 10:55 /etc/grid-security/daemon/hostkey.pem
[root@eos ~]#
[root@eos ~]# eos whoami
Virtual Identity: uid=0 (2,99,3,0) gid=0 (99,4,0) [authz:sss] sudo* host=localhost
[root@eos ~]# eos vid ls -U
https:"":uid => root
krb5:"":uid => root
sss:"":uid => root
tident:"*@eos":uid => root
unix:"":uid => 10367
[root@eos ~]# eos vid ls -a
auth=https
auth=krb5
auth=sss
[root@eos ~]# grep -E "sec.(protocol|protbind)" /etc/xrd.cf.mgm
sec.protocol unix
sec.protocol sss -c /etc/eos.keytab -s /etc/eos.keytab
#sec.protocol krb5 -exptkn:/var/eos/auth/krb5# host/@CERN.CH
#sec.protocol krb5 host/@CERN.CH
#sec.protocol gsi -crl:0 -cert:/etc/grid-security/daemon/hostcert.pem -key:/etc/grid-security/daemon/hostkey.pem -gridmap:/etc/grid-security/grid-mapfile -d:0 -gmapopt:2 -vomsat:1 -moninfo:1 -exppxy:/var/eos/auth/gsi#
#sec.protocol gsi -crl:0 -cert:/etc/grid-security/daemon/hostcert.pem -key:/etc/grid-security/daemon/hostkey.pem -gridmap:/etc/grid-security/grid-mapfile -d:0 -gmapopt:2 -vomsat:1 -moninfo:1
sec.protbind localhost.localdomain sss unix
sec.protbind localhost sss unix
sec.protbind * only sss unix
[root@eos ~]#
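
The ownership and mode checks above can also be scripted so they are easy to rerun after a change. This is a rough sketch, assuming GNU `stat` (for the `-c` format option); the helper name `check_owner_mode` is ours, and the expected values in the usage comment are simply what the `ls -l` listing above shows, not a statement of what EOS requires.

```shell
# check_owner_mode FILE OWNER MODE
# Compares FILE's owner name and octal permission bits against the
# expected OWNER and MODE. Uses GNU stat: %U = owner name, %a = octal mode.
check_owner_mode() {
    actual="$(stat -c '%U %a' "$1")" || return 2
    if [ "$actual" = "$2 $3" ]; then
        echo "OK: $1 is $actual"
    else
        echo "MISMATCH: $1 is $actual, expected $2 $3"
        return 1
    fi
}

# Usage with the values from our listing (not run here):
#   check_owner_mode /etc/eos.keytab daemon 400
#   check_owner_mode /etc/grid-security/daemon/hostcert.pem daemon 600
#   check_owner_mode /etc/grid-security/daemon/hostkey.pem daemon 400
```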

Kindly suggest how to resolve this.