Sorry, I missed this one. It looks like you have enabled both libmicrohttpd and XrdHttp on the FST side, and by design the FST can only publish the port for one of them. What you need to do is set the following environment variable for the FST daemons so that it matches the XrdHttp port. The xrootd daemon binds XrdHttp to that port first, so libmicrohttpd then fails to bind to it and is effectively disabled. The benefit is that the correct port, namely 8443, is advertised to the MGM.
Therefore, please set:
[Service]
Environment=EOS_FST_HTTP_PORT=8443
in the FST systemd customization file for the corresponding daemon, e.g. /usr/lib/systemd/system/eos@fst1.service.d/custom.conf, and then restart the FST services.
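As a rough sketch of the step above, the drop-in can be written with a small shell helper. The function name `write_fst_dropin` and its optional base-directory argument are hypothetical and only there for illustration. The unit name, file path and variable are the ones from this post.

```shell
#!/bin/sh
# Sketch: write a systemd drop-in that pins EOS_FST_HTTP_PORT for one FST unit.
# Usage: write_fst_dropin <unit-name> <port> [base-dir]
# base-dir is a hypothetical parameter; it defaults to the directory used in
# this thread (/usr/lib/systemd/system).
write_fst_dropin() {
  unit="$1"
  port="$2"
  base="${3:-/usr/lib/systemd/system}"
  dir="$base/$unit.service.d"
  mkdir -p "$dir" || return 1
  # Two lines: the [Service] section header and the environment assignment.
  printf '[Service]\nEnvironment=EOS_FST_HTTP_PORT=%s\n' "$port" > "$dir/custom.conf"
}

# Typical use as root on the FST host (not executed here):
#   write_fst_dropin eos@fst1 8443
#   systemctl daemon-reload      # make systemd re-read the drop-in
#   systemctl restart eos@fst1
```

Running `systemctl daemon-reload` before the restart makes sure systemd has picked up the new drop-in file.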
Try again and things should look better.
We have both MGM and FST running on the same virtual machine. Can I set FST to port 8444 when MGM has port 8443?
We only run one FST per four partitions on a machine.
In this case, the port is defined in the file:
/usr/lib/systemd/system/eos@fst.service.d/custom.conf
or still in:
/usr/lib/systemd/system/eos@fst1.service.d/custom.conf
?
I created both with the same content.
Since in my case the file was stored on a machine other than the MGM,
reading exactly this file worked. But writing fails, probably because the new file is redirected to the MGM machine.
I also do not understand why the MGM and FST ports are defined in three places:
/etc/sysconfig/eos_env
/etc/xrd.cf.mgm, /etc/xrd.cf.fst
/usr/lib/systemd/system/eos@fst.service.d/custom.conf
Which definition takes precedence if ports are defined in all 3 places?
Yes, you can use any port as long as things are configured properly. My customization script was just an example; in my case I have an FST which is called fst1. You can of course have multiple FSTs running on the same machine, e.g. fst1, fst2, etc.
There is a historical reason for the multiple places where things are defined. Before there was any XrdHttp, the way to configure HTTP access was to use the libmicrohttpd implementation. This needs an env variable to contain the port that it should bind to. Then XrdHttp was developed, which comes from the XRootD framework, and its port configuration needs to be in /etc/xrd.cf.fst.
Now, if you run a full cluster on one machine, like I do for example, then putting the env variable in /etc/sysconfig/eos_env is not enough, since all the daemons will load this environment, and therefore you need the customization per daemon. For example, I want fst1 to run HTTP on port 8001 and I need fst2 to run on 8002, since no two daemons can bind to the same port. There is no other way to achieve this without the customization scripts.
The added trick as I mentioned earlier, is that in the case of FSTs you can only have one http implementation running: either libmicrohttpd or XrdHttp. You need XrdHttp running so you need to make sure that the http port in /etc/xrd.cf.fst and the EOS_FST_HTTP_PORT match so that only XrdHttp will successfully start when you start your FST.
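A quick way to check that the two settings agree could look like the sketch below. It assumes the XrdHttp port is declared in /etc/xrd.cf.fst with a line of the form `xrd.protocol XrdHttp:<port> libXrdHttp.so` and that the drop-in uses `Environment=EOS_FST_HTTP_PORT=<port>`. If your files are written differently, the patterns need adjusting. The function names are hypothetical.

```shell
#!/bin/sh
# Port from a line such as "xrd.protocol XrdHttp:8443 libXrdHttp.so"
# (assumed form of the directive; adapt the pattern to your config).
xrdhttp_port() {
  sed -n 's/^[[:space:]]*xrd\.protocol[[:space:]][[:space:]]*XrdHttp:\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Port from a drop-in line such as "Environment=EOS_FST_HTTP_PORT=8443"
# (an optional double quote after "Environment=" is tolerated).
env_http_port() {
  sed -n 's/^Environment="\{0,1\}EOS_FST_HTTP_PORT=\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Succeeds only when both ports are present and equal.
# Usage: fst_http_ports_match <xrd-config-file> <drop-in-file>
fst_http_ports_match() {
  xrd_port=$(xrdhttp_port "$1")
  env_port=$(env_http_port "$2")
  [ -n "$xrd_port" ] && [ "$xrd_port" = "$env_port" ]
}
```

For example, `fst_http_ports_match /etc/xrd.cf.fst /usr/lib/systemd/system/eos@fst1.service.d/custom.conf || echo "HTTP ports differ"` would flag a mismatch before the FST is restarted.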
Once you have this successfully running, then I believe also the transfers will work.
Definitely do not set the env variables to 0. Can you send me your configuration files (/etc/xrd.cf.mgm, /etc/xrd.cf.fst, /etc/sysconfig/eos_env) and paste any systemd customizations that you are using? Also restart one FST and send me the logs, so that I can understand exactly which port the HTTP plugin is binding to.
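Besides the logs, the listening sockets on the FST host give a quick hint about which port is actually bound. The helper below is only a sketch with a hypothetical name. It reads the output of `ss -ltn` on standard input and checks whether any listening socket uses the given port.

```shell
#!/bin/sh
# Reads `ss -ltn` output on stdin; succeeds if a LISTEN socket has a local
# address ending in ":<port>". Column 4 is "Local Address:Port" in that output.
port_is_listening() {
  awk -v port="$1" '
    $1 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
    END { exit(found ? 0 : 1) }
  '
}

# Usage on the FST host (not executed here):
#   ss -ltn | port_is_listening 8443 && echo "something is listening on 8443"
```

Running `ss -ltnp` as root additionally shows which process owns each socket, which helps to confirm that it is the FST's xrootd process holding the port.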
Then issue once more the transfer with HTTP and send me the trace of that transfer from both the MGM and the FST to which it gets redirected. It looks like your FST is crashing when getting such a request.
I see from the logs that you have the EOS_FST_ASYNC_CLOSE functionality enabled. Unfortunately, this only works correctly in the latest EOS version 5.0.29, which comes with a new XRootD version that fixes a bug related to this. Could you please update and retry your transfer? By the way, does a simple xrdcp work correctly against your instance?
I also saw a warning in the FST logs when the service is starting, namely: Config warning: HTTPS functionality was not configured.
I don’t think this is critical but more an issue of using old configuration options. Nevertheless, please have a look at this post where you can find a sample configuration and maybe you can replace the http.ca/key directives with the new ones supported in XRootD 5: https://eos-community.web.cern.ch/t/scitokens-authorization-done-but-no-username-found/783/8?u=esindril
A more distilled config for an FST you can find below:
From your recommendation (the line in /etc/xrd.cf.fst):
{{{
xrd.tls /etc/grid-security/daemon/hostcert.pem /etc/grid-security/daemon/hostkey.pem
}}}
I realized that all FSTs must have a host certificate.
But in my case, only the 2 MGMs have certificates.
Do I understand correctly that the certificate on the FST is required only for http/webdav?
xrdcp (without TPC) works without this.
{{{
dvl-ui01:~ > date ; gfal-copy -f file:///etc/group root://dvl-eos.jinr.ru//eos/tests/cms/test-03
Wed Aug 3 13:58:03 MSK 2022
Copying file:///etc/group [DONE] after 0s
}}}
EOS has already been updated to 5.0.29 and the corresponding xroot version:
{{{
dvl-eos-m01:~ # rpm -qa *eos* *xroot* | sort
eos-client-5.0.29-1.el7.cern.x86_64
eos-folly-2019.11.11.00-1.el7.cern.x86_64
eos-folly-deps-2019.11.11.00-1.el7.cern.x86_64
eos-fusex-5.0.29-1.el7.cern.x86_64
eos-fusex-core-5.0.29-1.el7.cern.x86_64
eos-fusex-selinux-5.0.29-1.el7.cern.x86_64
eos-grpc-1.41.0-1.el7.x86_64
eos-grpc-devel-1.41.0-1.el7.x86_64
eos-libmicrohttpd-0.9.38-eos.el7.cern.x86_64
eos-librichacl-1.12-14.el7.cern.x86_64
eos-ns-inspect-5.0.29-1.el7.cern.x86_64
eos-protobuf3-3.17.3-1.el7.cern.eos.x86_64
eos-quarkdb-5.0.29-1.el7.cern.x86_64
eos-richacl-1.12-14.el7.cern.x86_64
eos-rocksdb-6.2.4-1.el7.cern.x86_64
eos-server-5.0.29-1.el7.cern.x86_64
eos-xrootd-5.4.7-1.el7.cern.x86_64
xrootd-client-libs-5.4.3-1.el7.x86_64
xrootd-libs-5.4.3-1.el7.x86_64
xrootd-scitokens-5.4.3-1.el7.x86_64
xrootd-server-5.4.3-1.el7.x86_64
xrootd-server-libs-5.4.3-1.el7.x86_64
xrootd-voms-5.4.3-1.el7.x86_64
}}}
I will request certificates for FSTs but I’m afraid it’s not a fast process.
Yes, for HTTPS one needs certificates also on the FSTs. Also certificates are a requirement for any token based access, otherwise the tokens are sent in clear text over the wire. TLS support is available also for the XRootD protocol and again a requirement if you plan to use tokens.
Let me know how it goes once you install the certificates also on the FSTs.
I turned off the FST nodes without certificates.
We also have 5 FSTs on each of the 2 nodes with an MGM.
Now gfal-copy from a local file to davs works.
I tested with EOS_FST_ASYNC_CLOSE=1.
I can’t say for sure, but it seems that in this case the error occurs again.
We also need a working davs with TPC.
Hope this will work too. I will test soon.
Thank you for the confirmation. This makes perfect sense, since the HTTP protocol is less versatile than the XRootD one and indeed the async functionality cannot work properly over HTTP. I will fix this for the next release by disabling async close by default for HTTP.
Indeed, there was a problem. The HTTP layer was not properly populating a field on which I was relying to detect that this was an HTTP access. This is fixed now. I will tag 5.0.31. Thanks for the notification!