oh, silly me yup, that did it and the eos@mgm started …
i still have some error in the log:
Apr 07 09:50:07 mgm.spacescience.ro systemd[1]: Starting EOS mgm...
Apr 07 09:50:07 mgm.spacescience.ro systemd[1]: Started EOS mgm.
Apr 07 09:50:07 mgm.spacescience.ro sh[801]: /opt/eos/xrootd/bin/xrootd
Apr 07 09:50:08 mgm.spacescience.ro sh[801]: Register objects provided by NsQuarkdbPlugin ...
Apr 07 09:50:08 mgm.spacescience.ro sh[801]: =====> XrdAliceTokenAcc: Public key in use is EXPORT PATH:/ VO:* ACCESS:ALLOW CERT:*
Apr 07 09:50:08 mgm.spacescience.ro sh[801]: =====> XrdAliceTokenAcc: Cannot load public key !
Apr 07 09:50:08 mgm.spacescience.ro sh[801]: =====> XrdAliceTokenAcc: Public key in use is RULE PATH:/ AUTHZ:delete|read|write|write-once| NOAUTHZ:| VO:*| CERT:I
Apr 07 09:50:08 mgm.spacescience.ro sh[801]: =====> XrdAliceTokenAcc: Cannot load public key !
Apr 07 09:50:08 mgm.spacescience.ro sh[801]: =====> XrdAliceTokenAcc: Public key in use is RULE PATH:/eos/alice/grid/ AUTHZ:| NOAUTHZ:delete|read|write|write-onc
Apr 07 09:50:08 mgm.spacescience.ro sh[801]: =====> XrdAliceTokenAcc: Cannot load public key !
Apr 07 09:50:08 mgm.spacescience.ro sh[801]: =====> XrdAliceTokenAcc: Public key in use is KEY VO:* PRIVKEY:/etc/grid-security/xrootd/privkey.pem PUBKEY:/etc/grid-se
Apr 07 09:50:08 mgm.spacescience.ro sh[801]: =====> XrdAliceTokenAcc: Cannot load public key !
but the actual logs in /var/log/eos/mgm/xrdlog.mgm show nothing interesting:
++++++ (c) 2008 CERN/IT-DM-SMD AliceTokenAcc (Alice Token Access Authorization) v 1.0
=====> alicetokenacc.noauthzhost: localhost
=====> alicetokenacc.noauthzhost: localhost.localdomain
=====> alicetokenacc.truncateprefix: /eos/alice/grid
=====> XrdAliceTokenAcc: No Authorizationfile set via environment variable 'TTOKENAUTHZ_AUTHORIZATIONFILE'
=====> XrdAliceTokenAcc: Using Authorizationfile '/etc/grid-security/xrootd/TkAuthz.Authorization'!
------ AliceTokenAcc initialization completed
also, i’m not sure what eos@master is supposed to do besides ExecStartPre=/bin/sh -c "/usr/sbin/eos_start_pre.sh eos-master"
i mean, all of this could be moved into the mgm service as pre and post steps
also, i found no explanation of what the instance name is… i tried a made-up name, but in the end i found, in a script hosted on our ALICE site, that it is supposed to be of the form eos<name>, which is not documented in /etc/sysconfig/eos_env.example
So, what check can confirm that the mgm is set up successfully?
Also, i see that there are some steps related to groups, but our servers will not have individual drives, and each server will have a varying number of volumes… what should i do at this point? btw, the fst (the only one at this point) is not yet
It would probably be helpful to post the /etc/xrd.cf.mgm file but looking at the logs I would assume /etc/grid-security/xrootd/privkey.pem is owned by xrootd user rather than daemon which is the user under which EOS runs. You probably need to update your sec.protocol gsi directive to point to a location from where user daemon can read the files.
The instance name needs to start with “eos”, for historical reasons. For example “eosadrian”.
To confirm the mgm works you need to write a file to your instance and read it back. You can find more information about configuring this here: http://eos-docs.web.cern.ch/eos-docs/quickstart/admin/configure.html?highlight=groups
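A minimal sketch of such a write/read round trip, assuming the `xrdcp` client is installed on the machine running the check. The function names, the probe file names and the remote path are hypothetical, and whether the copy is allowed still depends on the authentication setup of the instance.

```python
import filecmp
import subprocess
import tempfile
from pathlib import Path


def roundtrip_commands(mgm_host, remote_path, local_src, local_dst):
    """Build the two xrdcp invocations: upload first, then download."""
    # remote_path is expected to start with "/", which yields root://host//path
    url = f"root://{mgm_host}/{remote_path}"
    return [
        ["xrdcp", "--force", str(local_src), url],
        ["xrdcp", "--force", url, str(local_dst)],
    ]


def check_roundtrip(mgm_host, remote_path, run=subprocess.run):
    """Write a small probe file to the instance, read it back, compare bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "probe.in"
        dst = Path(tmp) / "probe.out"
        src.write_bytes(b"eos write/read probe\n")
        for cmd in roundtrip_commands(mgm_host, remote_path, src, dst):
            run(cmd, check=True)  # raises CalledProcessError if xrdcp fails
        return filecmp.cmp(src, dst, shallow=False)
```

Calling `check_roundtrip("mgm.example.org", "/eos/some/dir/probe")` (both arguments are placeholders) returns True only if the file comes back byte for byte; a failing copy raises instead.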
Hi! Thanks for looking into this!
The mgm conf files and logs can be seen here CERNBox
I would assume /etc/grid-security/xrootd/privkey.pem is owned by xrootd user rather than daemon which is the user under which EOS runs.
Well, yes, but the files are readable by everyone… in the plain xrootd setup we use the same files and the xrootd is ok with reading them … why would the eos xrootd behave any differently?
Besides, i just tried a chown to daemon and i still get the same messages in the systemd journal
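For anyone who wants to rule out permissions before looking elsewhere, here is a small sketch that reports the owner and mode of a key file and whether the directories above it can be traversed by "other" users. It only looks at the classic permission bits (no ACLs, no SELinux), and the helper names are made up for this example.

```python
import os
import stat


def describe_access(path):
    """Return owner uid, mode string, and whether 'other' may read the file."""
    st = os.stat(path)
    return {
        "uid": st.st_uid,
        "mode": stat.filemode(st.st_mode),
        "world_readable": bool(st.st_mode & stat.S_IROTH),
    }


def parents_traversable(path):
    """Check that every parent directory carries the 'other' execute bit."""
    directory = os.path.dirname(os.path.abspath(path))
    while True:
        if not os.stat(directory).st_mode & stat.S_IXOTH:
            return False
        parent = os.path.dirname(directory)
        if parent == directory:  # reached the filesystem root
            return True
        directory = parent
```

Running both helpers on /etc/grid-security/xrootd/privkey.pem and pubkey.pem shows whether a process that is neither owner nor group member could open them. If both come back fine, the "Cannot load public key" messages probably have another cause.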
You probably need to update your sec.protocol gsi directive to point to a location from where user daemon can read the files.
We do not use gsi in ALICE, i just kept unix and sss, the authorization is done only with AliceTokenAcc
You can find more information about configuring this here: http://eos-docs.web.cern.ch/eos-docs/quickstart/admin/configure.html?highlight=groups
well, things look bad :
[root@mgm mgm]# eos node ls
┌──────────┬────────────────────────────────┬────────────────┬──────────┬────────────┬──────┬──────────┬────────┬────────┬────────────────┬─────┐
│type │ hostport│ geotag│ status│ activated│ txgw│ gw-queued│ gw-ntx│ gw-rate│ heartbeatdelta│ nofs│
└──────────┴────────────────────────────────┴────────────────┴──────────┴────────────┴──────┴──────────┴────────┴────────┴────────────────┴─────┘
nodesview fst01.spacescience.ro:1095 ??? unknown ??? off 0 10 120 ~ 4
[root@mgm mgm]# eos health
┌────────────────────────────────┬────────┐
│hostport │ status│
└────────────────────────────────┴────────┘
fst01.spacescience.ro:1095 unknown
┌────────────┬────────────┬────────────┬────────┐
│group │offline used│ online free│ status│
└────────────┴────────────┴────────────┴────────┘
default.0 0 B 0 B full
default.1 0 B 0 B full
default.2 0 B 0 B full
default.3 0 B 0 B full
┌────────────┬────────┬────────┬──────────┬────────────────────────────────┐
│group │ free fs│ full fs│contention│ status│
└────────────┴────────┴────────┴──────────┴────────────────────────────────┘
default.0 0 1 100 % warning: Less than 4 fs in group
default.1 0 1 100 % warning: Less than 4 fs in group
default.2 0 1 100 % warning: Less than 4 fs in group
default.3 0 1 100 % warning: Less than 4 fs in group
┌──────┬──────┬──────┬──────────────┬───────────────┐
│ min│ avg│ max│ min placement│ critical group│
└──────┴──────┴──────┴──────────────┴───────────────┘
100 % 100 % 100 % 0 default.3
[root@mgm mgm]# nc -vz fst01.spacescience.ro 1095
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 2001:b30:4210:1::37:1095.
Ncat: 0 bytes sent, 0 bytes received in 0.04 seconds.
Did you manage to get this going?
We’re basically in the same spot, we have a working setup with gsi authentication, as soon as we enable the alice token stanzas, authentication seems to break completely.
@esindril: maybe to clarify, there are multiple .pem files at play: the alicetoken package brings the privkey/pubkey, and these are referenced in the TkAuthz.Authorization file.
[root@mgm-1 etc]# rpm -ql xrootd-alicetokenacc
/etc/grid-security/xrootd/TkAuthz.Authorization
/etc/grid-security/xrootd/privkey.pem
/etc/grid-security/xrootd/pubkey.pem
[…]
For gsi authentication we have our host certificates (that work correctly for gsi auth without alice tokens enabled).
What we noticed is that authentication breaks as soon as we enable the "mgmofs.authorize 1" stanza.
We’d be grateful for any advice. From what I’ve found in the forum this setup should be possible (A single EOS instance for all LHC VOs?)
Hi, yes, everything is working now … i do not remember what happened, but eos usually needs restarts on all nodes if something happens (for example if i pull out a network cable)
for ALICE (exclusive) usage i have
Indeed, at CERN we have one EOS instance per VO, therefore we never got into this mixed authentication mode. I will let @apeters detail what needs to be done for the AliceToken library to be configured properly alongside other authentication protocols.
Hi,
Great to hear this worked for you. Basically this looks similar to our config, except that we must also support gsi security for the CMS experiment.
One more question though: would you also give us a peek into
/etc/grid-security/xrootd/TkAuthz.Authorization ?
As far as I understand, it maps token and cert authentication to the various paths.
Best
Erich
For your basic understanding:
if (protocol == "sss") {
  return XrdAccPriv_All;
}
if (protocol == "krb5") {
  return XrdAccPriv_All;
}
if (protocol == "gsi") {
  return XrdAccPriv_All;
}
That means that if you come with sss, krb5 or gsi authentication, the ALICE functionality is bypassed. In other words, the token check applies only to access that does not use strong authentication, i.e. ‘unix’ or ‘host’.
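To make the consequence easier to see, here is the same decision restated as a small Python model. It is not the plugin code: the constants are stand-ins for the XrdAccPriv values, and the branch for non-strong protocols (grant everything if the token check passes, nothing otherwise) is a simplifying assumption.

```python
from typing import Callable

# Protocols that skip the ALICE token check, as listed in the snippet above.
STRONG_PROTOCOLS = {"sss", "krb5", "gsi"}

ALL_PRIVILEGES = "all"   # stand-in for XrdAccPriv_All
NO_PRIVILEGES = "none"   # hypothetical value for a rejected request


def access(protocol: str, token_check: Callable[[], bool]) -> str:
    """Model of the bypass: strong authentication never reaches the token check."""
    if protocol in STRONG_PROTOCOLS:
        return ALL_PRIVILEGES
    # Assumption: only 'unix' or 'host' style access gets here.
    return ALL_PRIVILEGES if token_check() else NO_PRIVILEGES
```

With this model, `access("gsi", ...)` never calls the token check at all, while `access("unix", ...)` depends entirely on it.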
If you need some debug information in the xrdlog you can do:
Hi Adrian,
I am working with Erich on the EOS installation.
I am a bit unsure how to interpret the contents of the TkAuthz.Authorization file, specifically the RULE section.
The comment section says:
######################################################################
# RULES Section
######################################################################
#
# Syntax: RULE PATH:<path> AUTHZ:<tag1|tag2|...|> NOAUTHZ:<tag1|tag2|...|> VO:<vo1|vo2|....|> CERT:<IGNORE|*|cert>
#
# ------------------------------------------------------------------
# - PATH defines the namespace path
# - AUTHZ defines the actions which have to be authorized
# - NOAUTHZ defines the actions which don't have to be authorized
# - VO is a list of VO's, where this rule applies
# - CERT can be IGNORE,* or a specific certificate subject
# IGNORE means, that the envelope certificate must not match the
# USER certificate subject. * means, that the rule applies for any
# certificate and the certificate subjects have to match.
I read the 2 lines as follows:
For the root namespace /, all operations require authorization; the rule applies to any VO and the cert is ignored.
For the /eos/alice/grid namespace, we don't require any authorization for delete, read, write, write-once, for any VO.
Shouldn’t this be the other way round? I want to skip token authorization for the root namespace and enable it only for the alice namespace.
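One way to double-check a reading like this is to split a RULE line into its fields, following the syntax from the comment block quoted above. The sketch below assumes fields are separated by whitespace and that the certificate subject contains no spaces. The rule line in the test is invented for illustration, and how the plugin picks between several matching rules is not modelled here.

```python
def parse_rule(line):
    """Split 'RULE PATH:... AUTHZ:a|b| NOAUTHZ:c| VO:*| CERT:*' into a dict."""
    fields = line.split()
    if not fields or fields[0] != "RULE":
        raise ValueError("not a RULE line")
    rule = {}
    for field in fields[1:]:
        key, _, value = field.partition(":")
        if key in ("AUTHZ", "NOAUTHZ", "VO"):
            # '|'-separated list with a trailing '|': drop the empty entries
            rule[key] = [tag for tag in value.split("|") if tag]
        else:
            rule[key] = value
    return rule


def needs_token(rule, action):
    """True if the rule lists the action under AUTHZ (must be authorized)."""
    return action in rule.get("AUTHZ", [])
```

Feeding the two real RULE lines from TkAuthz.Authorization through `parse_rule` and asking `needs_token(rule, "read")` for each shows directly which path demands a token for which action.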
Thanks for the pointer to sss+keytab.
We’ve managed to enable sss with keytab for the fuse mountpoint, this seems to be working correctly, users get mapped to their respective unix ids, when doing ‘cat /eos/vbc/proc/whoami’
What we also noticed confusingly when testing the gsi authentication: it’s working fine with xrdfs/xrdcp tools. However, with the eos cli client, we get this:
[ebirngru@lxplus735 ~]$ eos root://eos.grid.vbc.ac.at
# ---------------------------------------------------------------------------
# EOS Copyright (C) 2011-2019 CERN/Switzerland
# This program comes with ABSOLUTELY NO WARRANTY; for details type `license'.
# This is free software, and you are welcome to redistribute it
# under certain conditions; type `license' for details.
# ---------------------------------------------------------------------------
error: errc=3010 msg="[ERROR] Error response: Permission denied" (errc=3010) (Unknown error 3010)
error: errc=3010 msg="[ERROR] Error response: Permission denied" (errc=3010) (Unknown error 3010)
EOS_CLIENT_VERSION=4.7.14 EOS_CLIENT_RELEASE=1
error: errc=3010 msg="[ERROR] Error response: Permission denied" (errc=3010) (Unknown error 3010)
EOS Console [root://eos.grid.vbc.ac.at] |/> ls
error: errc=3010 msg="[ERROR] Error response: Permission denied" (errc=3010) (Unknown error 3010)
EOS Console [root://eos.grid.vbc.ac.at] |/>
On the MGM we see in the logs, that authentication/user mapping seems to be going through ok, but then executing “/proc/user” fails?
200630 11:23:36 time=1593509016.681858 func=Emsg level=ERROR logid=605c9c24-bab3-11ea-afdd-3868dd28d0c0 unit=mgm@mgm-1.eos.grid.vbc.ac.at:1094 tid=00007fcddf5f8700 source=XrdMgmOfsFile:3094 tident=ebirngru.29536:407@lxplus735.cern.ch sec=gsi uid=10661 gid=1999 name=erich.birngruber geo="vbc" Unable to execute proc command - you don't have the requested permissions for that operation (1) /proc/user/; Operation not permitted
Some more lines for context show that the gsi auth + user mapping are ok.
This only happens with Alice tokens enabled, specifically “mgmofs.authorize 1”; as I understand it, the default auth stack is used otherwise. On the MGM node itself, the eos client works fine for root, I assume due to “alicetokenacc.noauthzhost localhost”.
Is this EOS client behavior expected with Alice token auth config?
I have no idea … it seemed strange to me too but given the upstream example
i would say that should be ok, but i have no idea why
and it is true that this configuration would disable all authorisation operations on the eos path… @apeters (or any other experts that see this) could you please explain a little bit why this is ok for eos?
Thanks a lot!!
L.E. maybe the paths are parsed sequentially, so any external access first has to get authorization for /, and then internal data movement that uses /eos/alice/something is free from further authorization ops??
Hi Erich, I missed that,
That is because in an instance with ALICE tokens you cannot run eos CLI commands unless you have either a kerberos credential for that instance, a grid certificate or an sss key. You sent your command from lxplus, so you most likely have only a CERN kerberos token …