Docker MGM setup error on CentOS 8

Context: I’m a professor at Reykjavik University in Iceland. We are considering EOS as a way to prepare for long-term research data storage, which is likely to become a regulatory requirement in Europe in the near future. I’m in charge of a pilot project to quickly test options and write a document proposing which one to move forward with. I’m currently the administrator of the cs.ru.is OpenAFS cell and am familiar with Kerberos, LDAP, and AFS administration.

(An unrelated question: is there a chat forum for EOS, such as FreeNode, Discord, or Slack?)

I was trying to follow the instructions for setting up a Docker installation of EOS, as described on the "EOS Docker Installation" page of the EOS CITRINE documentation.
I ran into an error during the MGM server setup, and I’m not sure how to move forward.
Do you have advice on where to look? I’ve never used Docker before.

This server (archive.ru.is) is part of a FreeIPA domain with a KDC on CS.RU.IS, but is otherwise freshly installed. I first installed EOS from the RPMs, then uninstalled it after realizing that Docker seemed simpler for a quick test.

uname -a:
Linux archive.ru.is 4.18.0-240.15.1.el8_3.x86_64 #1 SMP Mon Mar 1 17:16:16 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Shell output:
scripts/start_services.sh -i gitlab-registry.cern.ch/dss/eos:4.2.16 -n 16

*** Creation of the network for EOS cluster
314ff61eb6add90491196de481c183c9de988f554b38c2223b2895b51c3ddd59

*** Kerberos server creation and setup
Unable to find image 'gitlab-registry.cern.ch/dss/eos:4.2.16' locally
4.2.16: Pulling from dss/eos
371eb7684dcb: Pull complete
dcbc42c3b35d: Pull complete
a53628a72a34: Pull complete
cb27276aa064: Pull complete
17181c534c87: Pull complete
08248eb73db4: Pull complete
b140cb39bfee: Pull complete
1992def3b866: Pull complete
337b686f866a: Pull complete
57ab3c7c226c: Pull complete
b148e686dedf: Pull complete
ca900d10241d: Pull complete
10c9a006ed54: Pull complete
Digest: sha256:b35b999f5eb42f161f9c7d7cc716188481152cbae79fe1db10352206fc47c545
Status: Downloaded newer image for gitlab-registry.cern.ch/dss/eos:4.2.16
dfa719152bfdd73fc6a6c46725dffe9c81591dee7b3a78f016e1b7e360d1a718
Starting kdc... Done.
Initing kdc... Done.
Populating kdc... added admin1@TEST.EOS with password "REDACTED"
added host/eos-mgm-test.eoscluster.cern.ch@TEST.EOS with password "REDACTED"
Done.

*** MQ server creation and setup
6617e245498ac382fd7f42dd1a254b9130125b516e0454e4dcd2b6b1390e143c

*** MGM server creation
3aec51004adb363a5ad1025a209eb34a9b3f4ee01ab6bf41c0a8b9de2df7f999

*** Applying Kerberos keytab on EOS cluster

*** MGM server setup

error: errc=0 msg=""

I have just reinstalled the VM and skipped joining it to our FreeIPA domain, in case that was the issue. It still gets stuck at “MGM server setup”.
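
When a step such as "MGM server setup" fails with an empty error message, the containers' own state and logs are a reasonable first place to look. The following is a minimal sketch, not part of the eos-docker scripts. It assumes the MGM container's name contains "eos-mgm" (inferred from the host principal in the output above), and the container names in the comments are hypothetical.

```shell
#!/bin/sh
# Sketch: find the MGM container created by start_services.sh and show its logs.
# Assumption: the MGM container's name contains "eos-mgm".

# Print the first container name read from stdin that looks like an MGM container.
pick_mgm() {
  grep -m 1 'eos-mgm'
}

# List all containers with their status, then print the tail of the MGM logs.
inspect_mgm() {
  docker ps -a --format '{{.Names}}\t{{.Status}}'
  mgm_name="$(docker ps -a --format '{{.Names}}' | pick_mgm)" || return 1
  docker logs --tail 50 "$mgm_name"
}

# Usage (needs access to the Docker daemon, possibly via sudo):
#   inspect_mgm
```

A container listed as "Exited" in the first listing would suggest that the service inside it stopped before the setup step could talk to it.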

Hi Joseph,

So you did a git clone of the eos-docker project from here: https://gitlab.cern.ch/eos/eos-docker/
And then you ran the following command:

cd eos-docker
./scripts/start_services.sh -i gitlab-registry.cern.ch/dss/eos:4.8.39

Below is the output that I get and everything seems to work as expected:

[esindril@esdss000 eos-docker]$ ./scripts/start_services.sh -i gitlab-registry.cern.ch/dss/eos:4.8.39


*** Creation of the network for EOS cluster
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.26/networks/create: dial unix /var/run/docker.sock: connect: permission denied


*** Kerberos server creation and setup
/usr/bin/docker-current: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.26/containers/create?name=eos-kdc: dial unix /var/run/docker.sock: connect: permission denied.
See '/usr/bin/docker-current run --help'.
[esindril@esdss000 eos-docker]$ sudo ./scripts/start_services.sh -i gitlab-registry.cern.ch/dss/eos:4.8.39


*** Creation of the network for EOS cluster
Error response from daemon: network with name eoscluster.cern.ch already exists


*** Kerberos server creation and setup
Unable to find image 'gitlab-registry.cern.ch/dss/eos:4.8.39' locally
Trying to pull repository gitlab-registry.cern.ch/dss/eos ... 
4.8.39: Pulling from gitlab-registry.cern.ch/dss/eos
2d473b07cdd5: Already exists 
014d8f9fe684: Pull complete 
76f37ec97893: Pull complete 
6064560f9c21: Pull complete 
bb91a89d487d: Pull complete 
4333dd8ae179: Pull complete 
428f35b72769: Pull complete 
6bf434ee805b: Pull complete 
f891f0db807b: Pull complete 
add7119a7356: Pull complete 
c18436ed132b: Pull complete 
33a24adf66ef: Pull complete 
53a2c9fc43d1: Pull complete 
932643ba6f63: Pull complete 
0886800fd202: Pull complete 
531be5c1c70d: Pull complete 
8e74c5c80172: Pull complete 
f31baa5a2e57: Pull complete 
1e417f855522: Pull complete 
Digest: sha256:97e00aab7f6d554e37b420529b7f4b5e2bebb06118671893e784b34d953abf2a
Status: Downloaded newer image for gitlab-registry.cern.ch/dss/eos:4.8.39
6fc1194102d63b41e35a0b556b1cf427a82dfdf9aaf9d05fba8763778c7f764b
Starting kdc... Done.
Initing kdc... Done.
Populating kdc... added admin1@TEST.EOS with password "x=VehKwjkg"
added host/eos-mgm1.eoscluster.cern.ch@TEST.EOS with password "Jasxryp/Kf"
added eos-user@TEST.EOS with password "fzaqiIThi,"
Done.

*** MQ server creation and setup
e56e2db64b59f665542b0d5283fb321efd74b68944b504593f1e91b6ef02ef08


*** MGM server creation
8f453e7b9ad93d3f8c9998ae8d5a3d208e361cb57cbfa80b114a492e21c6d661


*** Applying Kerberos keytab on EOS cluster


*** MGM server setup
success: set vid [  eos.rgid=0 eos.ruid=0 mgm.cmd=vid mgm.subcmd=set mgm.vid.auth=sss mgm.vid.cmd=map mgm.vid.gid=0 mgm.vid.key=<key> mgm.vid.pattern=<pwd> mgm.vid.uid=0 ]
success: set vid [  eos.rgid=0 eos.ruid=0 mgm.cmd=vid mgm.subcmd=set mgm.vid.auth=krb5 mgm.vid.cmd=map mgm.vid.gid=0 mgm.vid.key=<key> mgm.vid.pattern=<pwd> mgm.vid.uid=0 ]
success: set vid [  eos.rgid=0 eos.ruid=0 mgm.cmd=vid mgm.subcmd=set mgm.vid.cmd=membership mgm.vid.key=2:root mgm.vid.source.uid=2 mgm.vid.target.sudo=true ]
success: mode of file/directory /eos/dockertest/ is now '2777'
success: set vid [  eos.rgid=0 eos.ruid=0 mgm.cmd=vid mgm.subcmd=set mgm.vid.cmd=membership mgm.vid.key=admin1:root mgm.vid.source.uid=admin1 mgm.vid.target.sudo=true ]
info: creating space 'default'


*** FST servers parallel creation
5b34c580892a1d749964e9cce0f7ab20c0d5a5d755550764c9fb22dc029ae020
09ba151a78cc036721e45ac09b38ce3f7f920fcd6259910eba1c6a9ef9d1340f
a336b0298038182ceb0cae25b83a1956045dd5bf80faba395bceebbb4b885919
a997dfc0bb87bd71206cd1c595281c690dc98bff39c8fd5e3a21bf14cebae44a
95e71ec9b696c65b1755711bbaffd075ad6479560a84b3400a17946f911d4b6e
31a79c1d07c99555a5c7be5937a2e1352e428e47abe1eb26cb45f9e6902a795c
b63ca096cab64779a3ab6ff80180306d1f4ff5c133f6b96cf3ae534ac20f7e10
a5f48b70f81388453230569994978660021309603d3e964202a3125e1ac6c27c


*** FST servers parallel setup
Starting fst1 ...
Starting fst2 ...
Starting fst3 ...
Starting fst4 ...
Configuration start for fst1 ...
Starting fst5 ...
Configuration start for fst2 ...
Starting fst6 ...
Configuration start for fst3 ...
Starting fst7 ...
Configuration start for fst5 ...
Configuration start for fst4 ...
Starting fst8 ...
Configuration start for fst6 ...
Configuration start for fst7 ...
Configuration start for fst8 ...
success: mapped 'fst1' <=> fsid=1
success: mapped 'fst2' <=> fsid=2
info: applying space config drainperiod=86400
info: applying space config graceperiod=86400
info: applying space config scaninterval=604800
info: applying space config scanrate=100
success: filesystem 1 moved to group default.0
info: applying space config drainperiod=86400
info: applying space config graceperiod=86400
info: applying space config scaninterval=604800
info: applying space config scanrate=100
success: filesystem 2 moved to group default.0
Configuration done for fst1
success: mapped 'fst3' <=> fsid=3
Configuration done for fst2
info: applying space config drainperiod=86400
info: applying space config graceperiod=86400
info: applying space config scaninterval=604800
info: applying space config scanrate=100
success: filesystem 3 moved to group default.0
success: mapped 'fst5' <=> fsid=5
success: mapped 'fst4' <=> fsid=4
Configuration done for fst3
info: applying space config drainperiod=86400
info: applying space config graceperiod=86400
info: applying space config scaninterval=604800
info: applying space config scanrate=100
success: filesystem 5 moved to group default.0
success: mapped 'fst6' <=> fsid=6
info: applying space config drainperiod=86400
info: applying space config graceperiod=86400
info: applying space config scaninterval=604800
info: applying space config scanrate=100
success: filesystem 4 moved to group default.0
Configuration done for fst5
info: applying space config drainperiod=86400
info: applying space config graceperiod=86400
info: applying space config scaninterval=604800
info: applying space config scanrate=100
success: filesystem 6 moved to group default.0
Configuration done for fst4
success: mapped 'fst7' <=> fsid=7
success: mapped 'fst8' <=> fsid=8
Configuration done for fst6
info: applying space config drainperiod=86400
info: applying space config graceperiod=86400
info: applying space config scaninterval=604800
info: applying space config scanrate=100
success: filesystem 7 moved to group default.0
info: applying space config drainperiod=86400
info: applying space config graceperiod=86400
info: applying space config scaninterval=604800
info: applying space config scanrate=100
success: filesystem 8 moved to group default.0
Configuration done for fst7
Configuration done for fst8


*** Enabling default space with quota disabled and booting filesystems
Wait for FSTs to become online ...
All FSTs are online
success: boot message sent to eos-fst1.eoscluster.cern.ch:/home/data/eos1 eos-fst2.eoscluster.cern.ch:/home/data/eos2 eos-fst3.eoscluster.cern.ch:/home/data/eos3 eos-fst4.eoscluster.cern.ch:/home/data/eos4 eos-fst5.eoscluster.cern.ch:/home/data/eos5 eos-fst6.eoscluster.cern.ch:/home/data/eos6 eos-fst7.eoscluster.cern.ch:/home/data/eos7 eos-fst8.eoscluster.cern.ch:/home/data/eos8
success: configuration successfully saved!
Wait for FSTs to boot ...
All FSTs are booted


*** Client servers creation and setup
3fb0630ad1eece40bf07cbfec3f4fd40100a52e65d6d6e7c01f90c1f1e665f3e
host/eos-mgm1.eoscluster.cern.ch@TEST.EOS: kvno = 1
host/eos-mgm1.eoscluster.cern.ch@TEST.EOS: kvno = 1

Please try the above commands again and let me know the outcome.

Thanks,
Elvin
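
The first attempt in the output above fails with "permission denied" on the Docker socket and succeeds only when rerun with sudo. A small sketch of how one might check this before running the script, assuming the daemon listens on the default socket path shown in that output (`needs_sudo` is a hypothetical helper, not part of eos-docker):

```shell
#!/bin/sh
# Return success (0) if the current user cannot write to the Docker socket,
# which means docker commands will probably need sudo.
# Assumption: the default socket path /var/run/docker.sock is in use.
needs_sudo() {
  sock="${1:-/var/run/docker.sock}"
  [ ! -w "$sock" ]
}

if needs_sudo; then
  echo "No write access to the Docker socket: run the script with sudo"
else
  echo "Docker socket is writable: sudo should not be needed"
fi
```

On many Linux setups, adding the user to the `docker` group is an alternative to sudo, though whether that is appropriate depends on local policy.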

Thanks! This ran successfully, and now I can test things.
Of note, the official quickstart documentation (the "EOS Docker Installation" page of the EOS CITRINE documentation) still says to use 4.2.16 and -n 16. Is there some way to get this updated so other people don’t stumble into the same version problem I did?
Also, how would I figure out what the latest Docker image version is?

It looks like this is the right page: "Releases" in the EOS CITRINE documentation.
I was also trying to find where the latest version would be indicated in the container registry (Container Registry · dss / eos · GitLab), in case the Releases page gets out of date, but I’m not turning up anything.
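
Once a list of image tags is available (for example, copied from the Releases page or the registry), picking the newest one is a version sort rather than a plain text sort. A minimal sketch, assuming the tags are plain dotted version numbers and that `sort -V` (GNU coreutils) is available; `latest_tag` is a hypothetical helper, and 4.8.4 is a made-up tag included only to show the ordering:

```shell
#!/bin/sh
# Print the highest version from a newline-separated list of tags on stdin.
# Assumption: release tags are plain dotted numbers such as 4.8.39;
# anything else (e.g. "latest") is ignored.
latest_tag() {
  grep -E '^[0-9]+(\.[0-9]+)*$' | sort -V | tail -n 1
}

# Example with the two versions from this thread plus a made-up one (4.8.4):
printf '4.2.16\n4.8.39\n4.8.4\n' | latest_tag
# prints 4.8.39
```

A plain `sort` would place 4.8.4 after 4.8.39, which is why the version-aware `-V` option is used here.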