XRootD TPC with EOS

Hello,

Can you please clarify if it is possible to directly enable TPC (with delegated credentials) in EOS and if so what is the typical example of the ofs.tpc directive that we need to include in the /etc/xrd.cf.mgm? (I am aware of the need to set EOS_FST_NO_SSS_ENFORCEMENT=1 in /etc/sysconfig/eos_env)

According to the EOS CITRINE documentation page “HTTP(XrdHttp) and XRootD TPC with delegated credentials”, the way to set up TPC transfers in EOS is through an “XRootD Proxy service that will act as a gateway for incoming TPC traffic”. Is this still the case? We are running version 4.8.37-1.

Many thanks,

George

Hi George,

EOS can do delegated xrootd TPC transfers and indeed it needs a XRootD proxy service in front that will handle the traffic. The EOS_FST_NO_SSS_ENFORCEMENT=1 is not needed in case you use the proxy service in front. This env can be used for instances that you control (both source and destination) but is not a recommended way for doing TPC with external sites.

All the steps to set this up are already laid out in the link that you pasted.

Cheers,
Elvin

Hi Elvin,

Thanks for your reply. Almost everything is set up except one last thing that I would like to double check with you.

The MGM correctly redirects the TPC to the XRootD proxy service (short hostname: cta-eos13), but then it looks like the daemon user on cta-eos13 (which is running the XRootD proxy service) is trying to open a file in the EOS namespace via unix auth, and it fails (please see below). Does this mean that we need to have a vid tident mapping of daemon@cta-eos13 for every local EOS user (e.g. dteam, atlas, etc.)?

eos vid set map -tident daemon@cta-eos13 vuid:1000 vgid:1000

210915 10:03:44 time=1631696624.847180 func=open level=INFO logid=d48e9532-1603-11ec-9485-1c34da4b6afc unit=mgm@cta-eos14.scd.rl.ac.uk:1094 tid=00007fcdf44f5700 source=XrdMgmOfsFile:498 tident=daemon.205568:379@cta-eos13 sec=unix uid=99 gid=99 name=daemon geo="" op=write trunc=512 path=/eos/antaresdev/dteam/tape/lcgcclient02.tar info=oss.asize=7157760
210915 10:03:44 time=1631696624.847386 func=open level=INFO logid=d48e9532-1603-11ec-9485-1c34da4b6afc unit=mgm@cta-eos14.scd.rl.ac.uk:1094 tid=00007fcdf44f5700 source=XrdMgmOfsFile:621 tident=daemon.205568:379@cta-eos13 sec=unix uid=99 gid=99 name=daemon geo="" msg="rewrote symlinks" sym-path=/eos/antaresdev/dteam/tape/lcgcclient02.tar realpath=/eos/antaresdev/dteam/tape/lcgcclient02.tar
210915 10:03:44 time=1631696624.847974 func=open level=INFO logid=d48e9532-1603-11ec-9485-1c34da4b6afc unit=mgm@cta-eos14.scd.rl.ac.uk:1094 tid=00007fcdf44f5700 source=XrdMgmOfsFile:1037 tident=daemon.205568:379@cta-eos13 sec=unix uid=99 gid=99 name=daemon geo="" acl=1 r=0 w=0 wo=0 egroup=0 shared=0 mutable=1 facl=0
210915 10:03:44 time=1631696624.848045 func=Emsg level=ERROR logid=d48e9532-1603-11ec-9485-1c34da4b6afc unit=mgm@cta-eos14.scd.rl.ac.uk:1094 tid=00007fcdf44f5700 source=XrdMgmOfsFile:3227 tident=daemon.205568:379@cta-eos13 sec=unix uid=99 gid=99 name=daemon geo="" Unable to open file /eos/antaresdev/dteam/tape/lcgcclient02.tar; Operation not permitted

For reference, the ACL of the EOS dir (which has 755 mode) I am trying to write to is

sys.acl="u:1000:rwx+dp,z:!u,u:0:+u"

and I have added daemon to the list of sudoers.

Hi George,

The proxy service should contact the MGM with the delegated proxy certificate (gsi) so there is something not right on the proxy configuration. You definitely don’t need to add such mappings manually but only rely on the identity in the certificate. Could you paste the xrootd-tpc.cfg file that you use?

Thanks,
Elvin

Also what is the command that you use on the client side to instruct it to use delegated credentials?

Please see below the contents of /etc/xrootd/xrdcp-tpc.sh and /etc/xrootd/xrootd-proxy.cfg.

The command I use on the client side is
xrdcp --tpc delegate only root://ceph-gw1.gridpp.rl.ac.uk//dteam:georgep/lcgcclient02.tar root://cta-eos14.scd.rl.ac.uk//eos/antaresdev/dteam/tape/lcgcclient02.tar


/usr/bin/xrdcp --server -f $1 root://$XRDXROOTD_PROXY/$2


all.export /eos/

all.adminpath /var/spool/xrootd
all.pidpath /var/run/xrootd

ofs.tpc autorm fcreds gsi =X509_USER_PROXY ttl 60 60 xfr 9 pgm /etc/xrootd/xrdcp-tpc.sh

ofs.osslib /usr/lib64/libXrdPss.so
ofs.ckslib * /usr/lib64/libXrdPss.so
pss.origin cta-eos14.scd.rl.ac.uk:1094

xrootd.seclib /usr/lib64/libXrdSec.so
sec.protparm gsi -vomsfun:/usr/lib64/libXrdSecgsiVOMS.so -vomsfunparms:certfmt=pem|grps=/atlas,/atlas/uk,/cms,/dteam|grpopt=usefirst|dbg

sec.protocol gsi -dlgpxy:request -exppxy:=creds -crl:try -cert:/etc/grid-security/daemon/gridftp-cert.pem -key:/etc/grid-security/daemon/gridftp-key.pem -gmapopt:null -d:1

sec.protbind * only gsi

Hi George,

Where did you get the -dlgpxy:request parameter from? I don’t see this in the documentation:
https://xrootd.slac.stanford.edu/doc/dev49/sec_config.htm#_Toc517294098

For reference, this is how our configuration of such a proxy looks:

ofs.osslib  libXrdPss.so
ofs.ckslib  * libXrdPss.so
xrootd.chksum  adler32
xrootd.seclib  libXrdSec.so
pss.origin  eosatlas.cern.ch:1094
all.export  /eos/
all.adminpath  /var/spool/xrootd
all.pidpath  /var/run/xrootd
sec.protocol  gsi -dlgpxy:1 -exppxy:=creds -crl:1 -moninfo:1 -cert:/etc/grid-security/daemon/gridftp-cert.pem -key:/etc/grid-security/daemon/gridftp-key.pem -gridmap:/etc/grid-security/grid-mapfile -d:1 -gmapopt:2
sec.protbind  * gsi
ofs.tpc  autorm fcreds gsi =X509_USER_PROXY ttl 60 60 xfr 9 pgm /usr/local/bin/xrootd-third-party-copy.sh

By the looks of it, delegation is not properly enabled on your proxy and that is why it’s using unix.
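
A quick way to check which protocol the proxy actually presents to the MGM is to pull the identity fields out of the MGM log. The following is only a minimal sketch: it assumes the space-separated key=value layout of the log lines pasted earlier in this thread, and the function name mgm_auth_fields is made up for illustration.

```shell
# Hypothetical helper: read MGM log lines on stdin and print only the
# tident=, sec=, uid= and gid= fields of each line, in their original order.
mgm_auth_fields() {
  awk '{
    out = ""
    for (i = 1; i <= NF; i++)
      if ($i ~ /^(tident|sec|uid|gid)=/)
        out = out (out == "" ? "" : " ") $i
    if (out != "") print out
  }'
}

# Example input: one of the log lines quoted earlier in this thread.
mgm_auth_fields <<'EOF'
210915 10:03:44 time=1631696624.847974 func=open level=INFO logid=d48e9532-1603-11ec-9485-1c34da4b6afc unit=mgm@cta-eos14.scd.rl.ac.uk:1094 tid=00007fcdf44f5700 source=XrdMgmOfsFile:1037 tident=daemon.205568:379@cta-eos13 sec=unix uid=99 gid=99 name=daemon geo="" acl=1 r=0 w=0 wo=0 egroup=0 shared=0 mutable=1 facl=0
EOF
# prints: tident=daemon.205568:379@cta-eos13 sec=unix uid=99 gid=99
```

If delegation works, one would expect sec=gsi and the mapped user's uid/gid on the lines coming from the proxy host, rather than sec=unix with uid=99.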

Hope it helps,
Elvin

Hi Elvin.

Thanks for the reference config. I am trying to use the XRootD 5 compliant values for the gsi params, so -dlgpxy:request is the equivalent of -dlgpxy:1 (Scalla Extension: Security). I will switch back to the numeric values.

I was wondering if I need to add something like this in /etc/xrd.cf.mgm

sec.protbind cta-eos13.scd.rl.ac.uk only gsi

to force gsi auth between the MGM and the proxy. My current MGM protbinds are

sec.protbind * only gsi
sec.protbind *.scd.rl.ac.uk sss unix
sec.protbind localhost.localdomain sss unix
sec.protbind localhost sss unix

Best,

George

Hi George,

I think you need to remove this line:

sec.protbind *.scd.rl.ac.uk sss unix

as this already matches your proxy host and only supports sss and unix.

Cheers,
Elvin

Hi Elvin,

I cannot remove

sec.protbind *.scd.rl.ac.uk sss unix

because this will break the EOS instance, where FSTs running on other *.scd.rl.ac.uk hosts rely on SSS to authenticate to the MGM.

I did add after the above line the following

sec.protbind cta-eos13.scd.rl.ac.uk gsi

and as a result (?) we got rid of the log lines containing “tident=daemon.205568:379@cta-eos13 sec=unix uid=99 gid=99…”

but unfortunately we still get an auth error on writing

210916 15:04:54 275408 ofs_Run: /etc/xrootd/xrdcp-tpc.sh ended with status 52
210916 15:04:54 275408 ofs_TPC: georgep.102632:25@lcgui05.gridpp.rl.ac.uk /eos/antaresdev/dteam/tape/lcgcclient02.tar [FATAL] Auth failed: (destination)

Best,

George

Hi Elvin,

Just to add that the following order of protbinds in the MGM config finally seemed to work

sec.protbind cta-eos13.scd.rl.ac.uk gsi
sec.protbind *.scd.rl.ac.uk sss
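
Why the order matters can be sketched with a small first-match lookup. This is only an illustration of what the behaviour in this thread suggests, namely that the rules are tried in the order they are declared and the first matching host pattern wins; it is not taken from the XRootD documentation, the function name first_protbind is made up, and the catch-all `*` rule is deliberately left out.

```shell
# Hypothetical helper: print the protocols of the first rule whose host
# pattern matches the given host. Each rule is given as "pattern protocols".
first_protbind() {
  host=$1
  shift
  for rule in "$@"; do
    pattern=${rule%% *}    # text before the first space
    protocols=${rule#* }   # text after the first space
    case $host in
      $pattern) printf '%s\n' "$protocols"; return 0 ;;
    esac
  done
  return 1                 # no rule matched
}

# Order that worked: the proxy host is listed before the wildcard.
first_protbind cta-eos13.scd.rl.ac.uk \
  "cta-eos13.scd.rl.ac.uk gsi" "*.scd.rl.ac.uk sss"
# prints: gsi

# Reversed order: the wildcard rule catches the proxy host first.
first_protbind cta-eos13.scd.rl.ac.uk \
  "*.scd.rl.ac.uk sss" "cta-eos13.scd.rl.ac.uk gsi"
# prints: sss
```

Under that assumption, any other *.scd.rl.ac.uk host (the FSTs, for example) still falls through to the sss rule.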

The file is written into EOS space (I can see it with eos file info). One thing I can’t understand is why the progress bar doesn’t appear (even though the transfer is completed)

xrdcp --tpc delegate only --verbose root://ceph-gw1.gridpp.rl.ac.uk//dteam:georgep/random_32MB root://cta-eos14.scd.rl.ac.uk//eos/antaresdev/dteam/random_32MB
[0B/32MB][  0%][>                                                 ][0B/s]

Thanks again,

George

Hi George,

This is an issue that was fixed in XRootD 5.3.0. The basic problem is that the xrootd proxy cannot relay any info about the progress of the TPC job to the client. You can find more details about the fix in the release notes:

There is also some specific documentation on this:
https://xrootd.slac.stanford.edu/doc/dev53/pss_config.htm#_Toc75537967

And the corresponding commits:

Therefore, this behavior will work as expected once the client/server/proxy are updated to XRootD 5.3.0.
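
To check a given node against that minimum, a version comparison along these lines could be used. It is a minimal sketch: it assumes GNU sort (for the -V version-sort option), the function name version_at_least is made up, and the version strings are typed in by hand rather than read from any XRootD tool.

```shell
# Hypothetical helper: succeed when version $1 is at least version $2.
# Relies on the -V (version sort) option of GNU sort.
version_at_least() {
  lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
  [ "$lowest" = "$2" ]
}

# Example with the versions mentioned in this thread.
for v in 4.12.8 5.3.0 5.3.1; do
  if version_at_least "$v" 5.3.0; then
    echo "$v: new enough"
  else
    echo "$v: older than 5.3.0"
  fi
done
```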

Cheers,
Elvin

Hi Elvin,

Many thanks for this info. Do all three (client, server and proxy) need to run XRootD 5.3.0?

Right now, we have XRootD 4.12.8 on the proxy and on the MGM (which runs EOS 4.8.37-1), assuming that the XRootD versions need to match. Is it possible to update the proxy but not the MGM? By the way, is EOS 5 available as a stable release?

Best,

George

Hi George,

You would need the client and the proxy to run XRootD5. Yes, we have EOS5 available, you can grab the rpms from this location:
https://storage-ci.web.cern.ch/storage-ci/eos/diopside/tag/testing/el-7/x86_64/

5.0.1 is the latest but we plan a new release soon.

Cheers,
Elvin

Hi Elvin,

Great, thanks for this. Roughly when is the new release going to be out?
Is it going to support XRootD 5.3.1?

Best,

George

Hi George,

We’ll probably have a new eos5 release next week and yes it will be based on the latest XRootD5 release.

Cheers,
Elvin

Hi Elvin,

Can you please let me know how you run the XRootD TPC proxy for the EOS disk instance at CERN: do you run it on separate, dedicated hardware (if so, how many nodes and with what NIC specs) or on the EOS nodes themselves?

We seem to be hitting a performance bottleneck with a single node (NIC 25Gb/s) running as a TPC proxy.

Thanks,

George

Hi George,

For example, for EOSATLAS we have roughly 15 machines with 25Gb/s each deployed, acting as gateways. The gateways run on separate VMs.

Cheers,
Elvin

Hi Elvin,

Thanks for the information. How do you balance load between your gateways? Are you using some sort of cmsd setup, or something else?

Cheers,
Tom