Sec.protbind config for EOS

Hello,

I am working with an EOS config that was originally intended for use in a private clustered environment, but now I am exposing it to the internet for use as a grid SE with VOMS-based authn.

I frequently see default config examples that look like this:

sec.protocol unix
sec.protocol sss -c /etc/eos.keytab -s /etc/eos.keytab

sec.protbind localhost.localdomain unix sss
sec.protbind localhost unix sss
sec.protbind * only sss unix

However, as I (possibly naively) read it, this seems insecure, especially given the unix protocol docs, which say "Warning: unix protocol does not provide any significant level of security and should only be used in instances where security violations do not matter." IIUC the unix protocol just asks the remote client which UID it is running as, which provides no assurance over a public network, and this config would allow any client to use the unix protocol.

Based on my reading of the protbind docs, the following would be much more secure, and it follows the guideline "Specify the most general hostpat first and the least general, last". (It also shows the GSI config I am trying for VOMS authn.)

sec.protocol unix
sec.protocol sss -c /etc/eos.keytab -s /etc/eos.keytab
sec.protocol gsi -crl:use -moninfo:1 -cert:/etc/grid-security/daemon/hostcert.pem -key:/etc/grid-security/daemon/hostkey.pem -gmapopt:nomap -vomsfun:default -d:1

sec.protbind * only gsi
sec.protbind *.eos.svc.cluster.local only unix sss
sec.protbind localhost.localdomain only unix sss
sec.protbind localhost only unix sss

eos.svc.cluster.local is the internal subdomain on the private cluster network, shared by all the EOS nodes.
The intention of this proposed configuration is to require everyone (external users) to use only GSI (VOMS), except for trusted systems (EOS) on the private network or localhost which must use only unix and sss.
Besides avoiding the seemingly intrinsic insecurity of the unix protocol, the other improvement is that even in the worst case, where the SSS secret is disclosed, external systems could not exploit it.

Am I on the right track? This looks similar to the config in EOS GSI and https configuration - #4 by georgep
My other question is whether

sec.protocol unix
sec.protocol sss -c /etc/eos.keytab -s /etc/eos.keytab
sec.protbind * only unix sss

is sufficient - and safe - on the FST nodes.

I cannot get that to work, but this works:

sec.protocol unix
sec.protocol sss -c /etc/eos.keytab -s /etc/eos.keytab
#sec.protocol gsi -crl:use -moninfo:1 -cert:/etc/grid-security/daemon/hostcert.pem -key:/etc/grid-security/daemon/hostkey.pem -gmapopt:nomap -vomsfun:default -d:1

#sec.protbind * only gsi
sec.protbind *.eos.svc.cluster.local unix sss
sec.protbind localhost.localdomain unix sss
sec.protbind localhost unix sss

But if I uncomment the gsi lines it does not work, which leads me to believe something is not matching the *.eos.svc.cluster.local pattern. I cannot pick out any particularly revealing error messages; the process /opt/eos/xrootd/bin/xrootd -n mgm -c /etc/xrd.cf.mgm -m -b -l /var/log/eos/init/xrdlog.mgm -Rdaemon just runs indefinitely without the MGM coming online for ‘eos’ commands:

# eos ns
error: MGM root://localhost not online/reachable

@georgep any ideas?

With some further testing I found that the local connection on the MGM to itself is forced to only use GSI:

$ XRD_LOGLEVEL=Dump eos ns

[2024-08-27 22:14:00.355010 +0000][Debug  ][XRootDTransport   ] [localhost:1094.0] Trying to authenticate using gsi
[2024-08-27 22:14:00.362842 +0000][Debug  ][XRootDTransport   ] [localhost:1094.0] Cannot get credentials for protocol gsi: Secgsi: ErrParseBuffer: error getting user proxies: kXGS_init
[2024-08-27 22:14:00.362862 +0000][Error  ][XRootDTransport   ] [localhost:1094.0] No protocols left to try
[2024-08-27 22:14:00.362874 +0000][Error  ][AsyncSock         ] [localhost:1094.0] Socket error while handshaking: [FATAL] Auth failed
[2024-08-27 22:14:00.362880 +0000][Debug  ][AsyncSock         ] [localhost:1094.0] Closing the socket
[2024-08-27 22:14:00.362886 +0000][Debug  ][Poller            ] <[::ffff:127.0.0.1]:56580><--><[::ffff:127.0.0.1]:1094> Removing socket from the poller

So the ‘localhost’ directive to allow unix and sss is not taking effect for some reason. I also tried binding various addresses such as [::ffff:127.0.0.1], [127.0.0.1], [*127.0.0.1], 127.0.0.1, ::1, etc. to unix and sss, but this still did not work.

/etc/hosts has

127.0.0.1	localhost
::1	localhost ip6-localhost ip6-loopback

so this should work…
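One way to check which name the loopback address resolves back to inside the container is a reverse lookup with Python's standard library. This is only a minimal sketch: the helper name reverse_names is made up for illustration, and whether the XRootD server resolves client addresses through the same resolver path is an assumption, not something confirmed in this thread.

```python
import socket

def reverse_names(addr):
    """Return (primary_name, aliases) for addr, or (None, []) if the lookup fails."""
    try:
        name, aliases, _ = socket.gethostbyaddr(addr)
        return name, aliases
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError
        return None, []

if __name__ == "__main__":
    # Addresses of interest here: the IPv4 and IPv6 loopback
    for addr in ("127.0.0.1", "::1"):
        print(addr, "->", reverse_names(addr))
```

If the primary name printed for 127.0.0.1 is not one of the names used in the protbind lines, that would be one thing to rule out before looking further.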

Can you try:

eos root://127.0.0.1 whoami

Thanks for the tip @apeters !
Here is regular and verbose output for the two configurations that work and do not work.

Both cases show ‘Host Name: 127.0.0.1’, and when whoami works it shows ‘host=localhost domain=localdomain’ as expected.
Strictly speaking, 127.0.0.1 is an IP address, not a host name, while the protbind documentation says it only applies to host name patterns. In any case, I already tried numerous ways of allowing 127.0.0.1, as mentioned above, but none of them worked.
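To make the host name versus IP address concern concrete, here is a rough sketch that checks which of the host patterns from the proposed MGM config a given client identifier would glob-match. It uses Python's fnmatch as a stand-in for hostpat matching, which is an assumption; it does not model how XRootD actually selects or prioritises protbind entries, and matching_patterns is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Host patterns taken from the proposed MGM configuration in this thread.
PATTERNS = [
    "*",
    "*.eos.svc.cluster.local",
    "localhost.localdomain",
    "localhost",
]

def matching_patterns(client_name, patterns=PATTERNS):
    """Return every pattern that glob-matches client_name, in listed order.

    Assumption: shell-style globbing approximates hostpat matching.
    """
    return [p for p in patterns if fnmatchcase(client_name, p)]

print(matching_patterns("localhost"))   # ['*', 'localhost']
print(matching_patterns("127.0.0.1"))   # ['*']
```

Under this approximation, a client that the server identifies only as 127.0.0.1 would fall through to the ‘*’ binding, which would be consistent with the GSI-only behaviour seen in the log above.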

Is it possible the code behaves differently than documented, and a protbind directive on ‘*’ can not be overridden by any other protbind statement?

Here is also the full verbose output of ‘eos ns’ with the non-working config. It shows ‘Host Name: localhost’, which for some reason differs from the ‘Host Name: 127.0.0.1’ shown by the eos whoami command in the same environment.

Is IPv6 involved in this?

No, all the EOS nodes have only IPv4 addresses. EOS is running in containers (the gitlab-registry.cern.ch/dss/eos/eos-all:5.2.22 image) but I’m not sure if that could affect this.

I also noticed this comment by @esindril: “In general, using the unix authentication is not a recommended setup unless you control the client machine.” This reinforces my interest in ensuring unix authn is only enabled for EOS nodes in the private network, not for all nodes by default.