Possible metadata delay with gridftp + eos fusex

Hi all,

We’re setting up a GridFTP server for CMS PhEDEx transfers. Several transfers have failed with checksum-mismatch errors, and the PhEDEx tools do in fact see differing checksums.

However, when we transfer files manually, we get the correct checksums.
In fact, the checksums in EOS are always correct. We’re wondering whether a timing issue is at play, where the fusex mount does not yet see the fully written file, causing those errors.
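
One way to probe the timing hypothesis would be to checksum the same file through the mount a few times in a row and see whether the value changes between reads. The following is only a rough sketch: the function name checksum_is_stable and the CHECKSUM_CMD variable are made up for illustration, and cksum is just a stand-in for whatever algorithm the transfer tooling actually compares (for example xrdadler32, if that is what is in use).

```shell
#!/bin/sh
# Hypothetical probe: checksum the same path several times and report whether
# the value changes between reads. CHECKSUM_CMD is an assumption; cksum is only
# a stand-in for the algorithm the transfer tooling really uses.
CHECKSUM_CMD=${CHECKSUM_CMD:-cksum}

checksum_is_stable() {
    path=$1
    reads=${2:-3}   # number of reads to compare
    delay=${3:-1}   # seconds to wait between reads
    first=
    i=0
    while [ "$i" -lt "$reads" ]; do
        # Keep only the first output field, which is the checksum itself.
        current=$($CHECKSUM_CMD "$path" | awk '{print $1}')
        if [ -z "$first" ]; then
            first=$current
        elif [ "$current" != "$first" ]; then
            echo "changed: $first -> $current"
            return 1
        fi
        i=$((i + 1))
        if [ "$i" -lt "$reads" ]; then
            sleep "$delay"
        fi
    done
    echo "stable: $first"
    return 0
}
```

Called as checksum_is_stable /path/on/the/mount 5 2 right after a transfer finishes, a "changed" result would point towards the mount exposing the file before the write is complete, while a "stable" result that still differs from the checksum stored in EOS would point elsewhere.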

We have already checked that the zmq connection between the MGM and the fusex client is up, and it seems to be in order. Are there any other tunings to consider for the fuse mount, or should we use a different setup altogether, not fuse, but … ?

Best,
Erich

You should not run GridFTP on top of a FUSEX mount. There is a DSI plug-in for GridFTP, which is much better because the checksum check is integrated.

Hi @apeters,

thanks for the advice. We’ve successfully switched to the xrootd-dsi plugin, taken from here:
https://storage-ci.web.cern.ch/storage-ci/xrootd-dsi/tag/cc-7/x86_64/

The virtual mountpoint for our EOS instance seems correct with

export XROOTD_VMP="eos.grid.vbc.ac.at:1094:/eos/vbc=/eos/vbc"
export XROOTD_DSI_EOS=1         # enable ALL the EOS specifics

File transfers now succeed, and the checksum issue appears to be gone.
However, we noticed that all GridFTP transfers are now mapped to root, even though the spawned GridFTP processes are running with their appropriate uids (from the mapping through LCMAPS).

# in eos mgm log
# example 1
200720 16:40:02 time=1595256002.392300 func=IdMap                    level=INFO  logid=static.............................. unit=mgm@mgm-1.eos.grid.vbc.ac.at:1094 tid=00007f4fe71ea700 source=Mapping:993                    tident= sec=(null) uid=99 gid=99 name=- geo="" sec.prot=sss sec.name="root" sec.host="gridftp-1.grid.vbc.ac.at" sec.vorg="" sec.grps="root" sec.role="" sec.info="" sec.app="eos/gridftp" sec.tident="grid.cms.31198:390@gridftp-1.grid.vbc.ac.at" vid.uid=0 vid.gid=0

# example 2
200720 16:52:31 228429 XrootdXeq: grid.cms.32876:458@gridftp-1.grid.vbc.ac.at pvt IPv4 login as root
200720 16:52:31 time=1595256751.689879 func=IdMap                    level=INFO  logid=static.............................. unit=mgm@mgm-1.eos.grid.vbc.ac.at:1094 tid=00007f4fe39f6700 source=Mapping:993                    tident= sec=(null) uid=99 gid=99 name=- geo="" sec.prot=sss sec.name="root" sec.host="gridftp-1.grid.vbc.ac.at" sec.vorg="" sec.grps="root" sec.role="" sec.info="" sec.app="eos/gridftp" sec.tident="grid.cms.32876:458@gridftp-1.grid.vbc.ac.at" vid.uid=0 vid.gid=0

# in gridftp log
[30902] Mon Jul 20 16:36:04 2020 :: DN /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=ouruser/CN=123456/CN=Our User successfully authorized.
[30902] Mon Jul 20 16:36:04 2020 :: User grid.cms.prod successfully authorized.
[30902] Mon Jul 20 16:36:05 2020 :: Starting to transfer "/eos/vbc/experiments/cms/store/PhEDEx_LoadTest07/LoadTest07_Debug_ES_PIC/AT_Vienna/81/LoadTest07_PIC_D5_uXl4W6dhlONtzZiw_81".
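
The fields that matter in the MGM lines above are sec.prot, sec.name, vid.uid and vid.gid. Below is a minimal shell sketch for pulling just those out of an IdMap line, so that the mapping is easier to compare across transfers. The function name idmap_fields is hypothetical, and the sketch assumes a grep that supports -o.

```shell
#!/bin/sh
# Hypothetical helper: print selected key=value fields from one MGM IdMap log
# line. The field names are taken from the log lines quoted in this thread.
idmap_fields() {
    line=$1
    for key in sec.prot sec.name vid.uid vid.gid; do
        # Escape the dots so they match literally in the regular expression.
        pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
        # Print the first key=value token, up to the next space.
        printf '%s\n' "$line" | grep -o "${pattern}=[^ ]*" | head -n 1
    done
}
```

Feeding it the IdMap lines from the MGM log one at a time, for example from a loop over the output of grep func=IdMap, shows directly which protocol and name the MGM received (sss and "root" in the lines above) and which vid.uid it assigned.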

I’m wondering whether we are missing some vid mappings in EOS. Currently there is an /etc/eos.keytab on the host that has keys for “daemon/daemon” and “anybody/anygroup”.

We’ve also tried the vid map

tident:"*@gridftp-1.grid.vbc.ac.at":uid => root

But this does not work either.
We are now unsure what the next steps are and whether this is an issue on the EOS side or the GridFTP side.
Any further help is greatly appreciated,

Best,
Erich

Edit: added more MGM logs for clarification

Hi all,
We’ve found a solution: removing /etc/eos.keytab, thereby switching to unix mapping.
This gives us the desired result: the correct users and groups now show up in EOS.

200722 17:01:46 195238 XrootdXeq: grid.cms.19646:108@gridftp-1.grid.vbc.ac.at pvt IPv4 login as grid.cms.prod
[...]
200722 17:01:48 time=1595430108.087080 func=IdMap                    level=INFO  logid=static.............................. unit=mgm@mgm-1.eos.grid.vbc.ac.at:1094 tid=00007f4fd9df9700 source=Mapping:993                    tident= sec=(null) uid=99 gid=99 name=- geo="" sec.prot=unix sec.name="grid.cms.prod" sec.host="gridftp-1.grid.vbc.ac.at" sec.vorg="" sec.grps="role.grid.cms.prod" sec.role="" sec.info="" sec.app="eos/gridftp" sec.tident="grid.cms.19646:108@gridftp-1

Best,
Erich