Definitely do not set the env variables to 0. Can you send me your configuration files (/etc/xrd.cf.mgm, /etc/xrd.cf.fst and /etc/sysconfig/eos_env) and paste any systemd customizations that you are using? Also restart one FST and send me the logs, so that I can see exactly which port the HTTP plugin is binding to.
Then run the HTTP transfer once more and send me the trace of that transfer from both the MGM and the FST to which it gets redirected. It looks like your FST is crashing when it receives such a request.
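As a complement to the logs, the bound ports can also be read from the socket table. The helper below is only a sketch: `listening_ports` is a hypothetical name, and the process name to pass (for example `xrootd`) is an assumption, so use whatever name `ps` shows for the FST daemon on your node.

```shell
# Print the TCP ports on which a given process is listening.
# Reads `ss -tlnp` style output on stdin; $1 is the process name to match.
listening_ports() {
    awk -v name="$1" '
        index($0, "\"" name "\"") {
            # Field 4 is the local address; the port follows the last ":".
            n = split($4, parts, ":")
            print parts[n]
        }
    ' | sort -un
}

# Typical use on an FST node (root is needed to see process names):
#   ss -tlnp | listening_ports xrootd
```

Comparing this list before and after the FST restart should show whether the HTTP port is actually opened.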
I see from the logs that you have the EOS_FST_ASYNC_CLOSE functionality enabled. Unfortunately, this only works correctly in the latest EOS version, 5.0.29, which comes with a new XRootD version that fixes a bug related to this. Could you please update and retry your transfer? By the way, does a simple xrdcp work correctly against your instance?
I also saw a warning in the FST logs when the service is starting, namely: Config warning: HTTPS functionality was not configured.
I don’t think this is critical; it is more likely a consequence of using old configuration options. Nevertheless, please have a look at this post, where you can find a sample configuration, and maybe you can replace the http.ca/key directives with the new ones supported in XRootD 5: https://eos-community.web.cern.ch/t/scitokens-authorization-done-but-no-username-found/783/8?u=esindril
A more distilled config for an FST you can find below:
From your recommendation (the line in /etc/xrd.cf.fst):
{{{
xrd.tls /etc/grid-security/daemon/hostcert.pem /etc/grid-security/daemon/hostkey.pem
}}}
I realized that all FSTs must have a host certificate.
But in my case, only the 2 MGMs have certificates.
Do I understand correctly that the certificate on the FST is required only for http/webdav?
xrdcp (without TPC) works without it.
{{{
dvl-ui01:~ > date ; gfal-copy -f file:///etc/group root://dvl-eos.jinr.ru//eos/tests/cms/test-03
Wed Aug 3 13:58:03 MSK 2022
Copying file:///etc/group [DONE] after 0s
}}}
EOS has already been updated to 5.0.29 and the corresponding XRootD version:
{{{
dvl-eos-m01:~ # rpm -qa *eos* *xroot* | sort
eos-client-5.0.29-1.el7.cern.x86_64
eos-folly-2019.11.11.00-1.el7.cern.x86_64
eos-folly-deps-2019.11.11.00-1.el7.cern.x86_64
eos-fusex-5.0.29-1.el7.cern.x86_64
eos-fusex-core-5.0.29-1.el7.cern.x86_64
eos-fusex-selinux-5.0.29-1.el7.cern.x86_64
eos-grpc-1.41.0-1.el7.x86_64
eos-grpc-devel-1.41.0-1.el7.x86_64
eos-libmicrohttpd-0.9.38-eos.el7.cern.x86_64
eos-librichacl-1.12-14.el7.cern.x86_64
eos-ns-inspect-5.0.29-1.el7.cern.x86_64
eos-protobuf3-3.17.3-1.el7.cern.eos.x86_64
eos-quarkdb-5.0.29-1.el7.cern.x86_64
eos-richacl-1.12-14.el7.cern.x86_64
eos-rocksdb-6.2.4-1.el7.cern.x86_64
eos-server-5.0.29-1.el7.cern.x86_64
eos-xrootd-5.4.7-1.el7.cern.x86_64
xrootd-client-libs-5.4.3-1.el7.x86_64
xrootd-libs-5.4.3-1.el7.x86_64
xrootd-scitokens-5.4.3-1.el7.x86_64
xrootd-server-5.4.3-1.el7.x86_64
xrootd-server-libs-5.4.3-1.el7.x86_64
xrootd-voms-5.4.3-1.el7.x86_64
}}}
I will request certificates for FSTs but I’m afraid it’s not a fast process.
Yes, for HTTPS one needs certificates also on the FSTs. Also certificates are a requirement for any token based access, otherwise the tokens are sent in clear text over the wire. TLS support is available also for the XRootD protocol and again a requirement if you plan to use tokens.
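For reference, a minimal sketch of the TLS-related lines in /etc/xrd.cf.fst could look like the following. The `xrd.tls` line is the one quoted earlier in this thread; the `xrd.tlsca` line and its CA directory are an assumption (the usual grid CA location), so adjust it to wherever your CA certificates are installed.

```
# Host certificate and key used by the FST for TLS (from this thread)
xrd.tls /etc/grid-security/daemon/hostcert.pem /etc/grid-security/daemon/hostkey.pem
# Directory with trusted CA certificates (assumed path)
xrd.tlsca certdir /etc/grid-security/certificates/
```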
Let me know how it goes once you install the certificates also on the FSTs.
I turned off the FST nodes that have no certificates.
We also have 5 FSTs on each of the 2 MGM nodes.
Now gfal-copy from a local file to davs works.
I also tested with EOS_FST_ASYNC_CLOSE=1.
I can’t say for sure, but it seems that in this case the error occurs again.
We also need working davs with TPC.
I hope this will work too. I will test soon.
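To be sure which setting a given FST is actually running with, the value can be read back from the environment file. This is only a sketch: `async_close_state` is a hypothetical helper, and it assumes /etc/sysconfig/eos_env contains plain `KEY=VALUE` lines (optionally prefixed with `export`).

```shell
# Print the value of EOS_FST_ASYNC_CLOSE from an env file, or "unset".
# $1: path to the env file (defaults to /etc/sysconfig/eos_env).
async_close_state() {
    local file="${1:-/etc/sysconfig/eos_env}"
    local line value
    # Take the last non-commented assignment, as a later line overrides earlier ones.
    line=$(grep -E '^[[:space:]]*(export[[:space:]]+)?EOS_FST_ASYNC_CLOSE=' "$file" | tail -n 1)
    if [ -z "$line" ]; then
        echo "unset"
    else
        value="${line#*=}"
        # Strip optional surrounding double quotes.
        value="${value%\"}"
        value="${value#\"}"
        echo "$value"
    fi
}

# Example: async_close_state /etc/sysconfig/eos_env
```

Note that the file is only read at service start, so the running daemon reflects the value from its last restart.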
Thank you for the confirmation. This makes perfect sense, since the HTTP protocol is less versatile than the XRootD one and indeed the async functionality cannot work properly over HTTP. I will fix this for the next release by disabling async close by default for HTTP.
Indeed, there was a problem. The HTTP layer was not properly populating a field on which I was relying to detect that this was an http access. This is fixed now. I will tag 5.0.31. Thanks for the notification!
It turned out that there is one problem in HTTPS.
Using:
{{{
gfal-copy --copy-mode push …
}}}
copying does not work even from local to local EOS.
I have now tested the versions:
{{{
dvl-eos-m02:~ # rpm -qa eos-* xrootd-* | sort
eos-client-5.1.1-1.el7.cern.x86_64
eos-folly-2019.11.11.00-1.el7.cern.x86_64
eos-folly-deps-2019.11.11.00-1.el7.cern.x86_64
eos-fusex-5.1.1-1.el7.cern.x86_64
eos-fusex-core-5.1.1-1.el7.cern.x86_64
eos-fusex-selinux-5.1.1-1.el7.cern.x86_64
eos-grpc-1.41.0-1.el7.x86_64
eos-grpc-devel-1.41.0-1.el7.x86_64
eos-libmicrohttpd-0.9.38-eos.el7.cern.x86_64
eos-librichacl-1.12-14.el7.cern.x86_64
eos-ns-inspect-5.1.1-1.el7.cern.x86_64
eos-protobuf3-3.17.3-1.el7.cern.eos.x86_64
eos-quarkdb-5.1.1-1.el7.cern.x86_64
eos-richacl-1.12-14.el7.cern.x86_64
eos-rocksdb-6.2.4-1.el7.cern.x86_64
eos-server-5.1.1-1.el7.cern.x86_64
eos-xrootd-5.5.1-1.el7.cern.x86_64
xrootd-client-libs-5.5.0-1.el7.x86_64
xrootd-libs-5.5.0-1.el7.x86_64
xrootd-scitokens-5.5.0-1.el7.x86_64
xrootd-server-5.5.0-1.el7.x86_64
xrootd-server-libs-5.5.0-1.el7.x86_64
xrootd-voms-5.5.0-1.el7.x86_64
}}}
We want to work without using /etc/grid-security/grid-mapfile, only with VOMS. But copying in “push” mode doesn’t work that way;
I get an error:
{{{
Copy failed (3rd push). Last attempt: [gfal_http_third_party_copy] Transfer failure: Remote side failed with status code 403
}}}
With “pull” everything works.
The xrootd protocol works with both “push” and “pull”.
When using /etc/grid-security/grid-mapfile, https works with “push” as well.
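The push/pull comparison above can be repeated with a small loop after each configuration change. This is a sketch only: `tpc_check` is a hypothetical helper, the `GFAL_COPY` variable is an assumption added so the command can be swapped out, and the destination suffix is illustrative.

```shell
# Try a third-party copy in both modes and report which one succeeds.
# $1: source URL, $2: destination URL prefix (the mode is appended to it).
tpc_check() {
    local src="$1" dst="$2" mode
    for mode in pull push; do
        if "${GFAL_COPY:-gfal-copy}" --copy-mode "$mode" -f "$src" "${dst}-${mode}" >/dev/null 2>&1; then
            echo "$mode: OK"
        else
            echo "$mode: FAILED"
        fi
    done
}

# Example (destination name is hypothetical):
#   tpc_check davs://se-wbdv.jinr-t1.ru:2880//pnfs/jinr-t1.ru/data/cms/vvm-test-01/5GB-000 \
#             davs://dvl-eos.jinr.ru:8443//eos/tests/cms/tpc-test
```

Running it once with and once without the grid-mapfile entry gives a direct comparison of the two authorization paths.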
Could you please show me the contents of your certificate including the VOMS group information that you are using when doing the transfer?
Also could you please paste the eos vid rules that you are using to enforce the VOMS mapping?
I tried configuring our pre-production instance to skip the grid-map file for HTTPS requests and this works as expected for me in the sense that the vid mapping is respected.
The only modification I have done in the /etc/xrd.cf.mgm configuration is to remove/comment the following line:
Just uploading files works without problems.
TPC in pull mode also works without problems.
Only TPC in push mode fails without authorization via the grid-mapfile;
this mode works only with my certificate in the grid-mapfile.
{{{
dvl-ui01:~ > gfal-copy --copy-mode pull davs://se-wbdv.jinr-t1.ru:2880//pnfs/jinr-t1.ru/data/cms/vvm-test-01/5GB-000 davs://dvl-eos.jinr.ru:8443//eos/tests/cms/5GB-020
Copying davs://se-wbdv.jinr-t1.ru:2880//pnfs/jinr-t1.ru/data/cms/vvm-test-01/5GB-000 [DONE] after 57s
dvl-ui01:~ >
dvl-ui01:~ > gfal-copy --copy-mode push davs://se-wbdv.jinr-t1.ru:2880//pnfs/jinr-t1.ru/data/cms/vvm-test-01/5GB-000 davs://dvl-eos.jinr.ru:8443//eos/tests/cms/5GB-021
Copying davs://se-wbdv.jinr-t1.ru:2880//pnfs/jinr-t1.ru/data/cms/vvm-test-01/5GB-000 [FAILED] after 0s
gfal-copy error: 5 (Input/output error) - TRANSFER ERROR: Copy failed (3rd push). Last attempt: Transfer failure: rejected PUT: 403 FORBIDDEN\n
}}}