Upgrade to EOS 5.1.28

Hi,

I am trying to upgrade a dev, single-node instance from EOS 4.8.105 to 5.1.28.

The yum errors reported in another thread (Eos-xrootd vs xrootd) have been resolved and all required EOS5 packages have (I think) been installed. All previously installed xrootd 4.12.8 packages were removed to block the upgrade to xrootd 5.2 and to allow the use of the xrootd libs provided only by eos-xrootd-5.5.10, as suggested by the EOS developers.

eos@quarkdb.service has started without a problem, but eos@mgm.service, eos@mq.service and eos@fst.service have failed to start because the EOS libs cannot be located (see below). I can clearly see that all these libs have been installed via the package eos-server-5.1.28-1.el7.cern.x86_64, so I don't quite understand why.

I have LD_LIBRARY_PATH=/opt/eos/xrootd/lib64 in /etc/sysconfig/eos_env

Any help would be much appreciated. Many thanks

George

Plugin fslib libXrdEosMgm-5.so not found; falling back to using libXrdEosMgm.so
Plugin No such file or directory loading fslib libXrdEosMgm.so
231121 15:57:21 27891 XrootdConfig: Unable to load file system via libXrdEosMgm.so
231121 15:57:21 27891 XrootdConfig: Unable to load base file system using libXrdEosMgm.so

Plugin fslib libXrdMqOfs-5.so not found; falling back to using libXrdMqOfs.so
Plugin No such file or directory loading fslib libXrdMqOfs.so
231121 15:57:40 28481 XrootdConfig: Unable to load file system via libXrdMqOfs.so
231121 15:57:40 28481 XrootdConfig: Unable to load base file system using libXrdMqOfs.so

Plugin fslib libXrdEosFst-5.so not found; falling back to using libXrdEosFst.so
Plugin No such file or directory loading fslib libXrdEosFst.so
231121 15:58:03 29226 XrootdConfig: Unable to load file system via libXrdEosFst.so
231121 15:58:03 29226 XrootdConfig: Unable to load base file system using libXrdEosFst.so

Try ldd on these libraries and check whether any dependency library is missing!
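
A minimal sketch of that check, assuming a glibc-style ldd that marks unresolved entries with "=> not found". The function name missing_libs is made up for illustration; it reads ldd output on stdin and prints each unresolved soname once, which also collapses the repeated lines ldd can print when the same soname is requested more than once.

```shell
#!/bin/sh
# Hypothetical helper: print each unresolved soname from ldd output once.
# It reads stdin, so it can be tried on saved output as well.
missing_libs() {
    # ldd marks unresolved dependencies as "<soname> => not found".
    awk '/=> not found/ { print $1 }' | sort -u
}

# Example (library path taken from this thread):
#   ldd /usr/lib64/libXrdEosMgm-5.so | missing_libs
```

An empty result means every dependency was resolved with the environment ldd was run in; the service may see a different LD_LIBRARY_PATH, so it is worth running the check with the same environment file loaded.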

I had a similar issue when we upgraded our test environment. Check your /etc/xrd.cf.mgm file and make sure none of the lines that specify libraries contain the old xrootd library paths.
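
As a rough sketch of that config check, the function below lists every non-comment line that loads a shared library through an absolute path, so stale locations stand out. The function name absolute_lib_lines and the regular expression are assumptions for illustration, not part of EOS or XRootD.

```shell
#!/bin/sh
# Hypothetical helper: show config lines that reference a shared library
# by absolute path. Usage: absolute_lib_lines /etc/xrd.cf.mgm
absolute_lib_lines() {
    # First grep: lines with a token that starts with "/" and ends in .so
    # (optionally versioned, e.g. .so.3). Output is "lineno:line".
    # Second grep: drop lines whose content is a comment.
    grep -n -E '(^|[[:space:]])/[^[:space:]]*\.so(\.[0-9]+)*([[:space:]]|$)' "$1" |
        grep -v -E '^[0-9]+:[[:space:]]*#'
}
```

Lines that name a library without a directory, such as "xrootd.fslib -2 libXrdEosMgm.so", are not reported, since those are resolved through the normal library search path.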

Hi,

Thanks for the replies. Yes, the following dependency library is missing (I am not sure why it is listed more than once):

[root@antares-eos14 ~]# ldd /usr/lib64/libXrdEosMgm-5.so | grep "not found"
libprotobuf.so.3 => not found
libprotobuf.so.3 => not found
libprotobuf.so.3 => not found
libprotobuf.so.3 => not found
libprotobuf.so.3 => not found

I noted that the following has been linked to /usr/lib64/libXrdEosMgm-5.so:

libprotobuf.so.3.17.3 => /opt/eos/grpc/lib64/libprotobuf.so.3.17.3 (0x00007f31c2ce4000)

For my information, how does the missing dependency prevent EOS from locating /usr/lib64/libXrdEosMgm-5.so? I would rather expect the library to be found and then fail to run.

In the config files, I use relative paths, as per the EOS team's suggestion:

xrootd.fslib libXrdMqOfs.so
xrootd.fslib -2 libXrdEosMgm.so
xrootd.fslib -2 libXrdEosFst.so

George

Hi,

Doing "ln -s /opt/eos/grpc/lib64/libprotobuf.so.3.17.3 /opt/eos/grpc/lib64/libprotobuf.so.3"

appears to remove the missing library dependency - all EOS services appear to be running now. I can run eos client commands, list the filesystems, files etc; I will run end-to-end tests later.

Hopefully, the upgrade to EOS5 is done…

[root@antares-eos14 george]# eos

EOS_INSTANCE=eosantaresdev
EOS_SERVER_VERSION=5.1.28 EOS_SERVER_RELEASE=1
EOS_CLIENT_VERSION=5.1.28 EOS_CLIENT_RELEASE=1

Although eos@mgm is running now, I noticed the following SSL errors
http://www-public.gridpp.rl.ac.uk/filelists/EOS5_MGM_ssl_errors.txt
which may or may not be important. Can you please confirm?

Thanks

Hi George,

For EOS 5.1.28 you need both the following packages installed: eos-protobuf3 and eos-grpc-1.41-1. This will be simplified in EOS 5.2.x, where you only need the eos-grpc-1.56 package.

If you have both these packages installed you don’t need to do the trick with the symbolic link. Normally, in the packages that we built for 5.1.x, you should have the following dependency:

ldd /usr/lib64/libXrdEosMgm-5.so | grep libprotobuf
libprotobuf.so.3 => /opt/eos/lib64/protobuf3/libprotobuf.so.3 (0x00007f2e0628d000)

As you can see, this is satisfied by the eos-protobuf3 package. Once you move to 5.2 then the dependency will look like this:

$ rpm -qa | grep "eos-grpc\|eos-server"
eos-server-5.2.2-1.el7.cern.x86_64
eos-grpc-devel-1.56.1-2.el7.x86_64
eos-grpc-1.56.1-2.el7.x86_64
eos-grpc-gateway-0.1-1.el7.x86_64

$ ldd /usr/lib64/libXrdEosMgm-5.so | grep libprotobuf
libprotobuf.so.23 => /opt/eos/grpc/lib64/libprotobuf.so.23 (0x00007fac90ffb000)
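
To tie an ldd line like the ones above back to the package that ships the file, a small sketch such as the following can help. The function name resolved_path is hypothetical; it only parses ldd output, and the rpm -qf step is shown as a commented usage example for an RPM-based host.

```shell
#!/bin/sh
# Hypothetical helper: print the path that a given soname resolves to,
# reading ldd output from stdin. Prints nothing if it is "not found".
resolved_path() {
    awk -v lib="$1" '$1 == lib && $2 == "=>" && $3 != "not" { print $3 }' | sort -u
}

# Example (paths taken from this thread):
#   path=$(ldd /usr/lib64/libXrdEosMgm-5.so | resolved_path libprotobuf.so.3)
#   [ -n "$path" ] && rpm -qf "$path"
```

If the resolved path is a symbolic link created by hand rather than by a package, rpm -qf should report that the file is not owned by any package, which is a quick way to tell a packaged link from a manual workaround.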

The SSL errors might point to some issue with the CRLs and the certificate directories. We normally get everything properly set up from the WLCG point of view by installing the following package that brings in all the CA dependencies: ca-policy-egi-core-1.124-1.noarch

Also we use the following configuration at the MGM for the gsi auth plugin and HTTP:

...
sec.protocol  gsi -crl:1 -moninfo:1 -cert:/etc/grid-security/daemon/hostcert.pem -key:/etc/grid-security/daemon/hostkey.pem -gridmap:/etc/grid-security/grid-mapfile -d:1 -gmapopt:2
....
xrd.tls  /etc/grid-security/daemon/hostcert.pem /etc/grid-security/daemon/hostkey.pem
xrd.tlsca  certdir /etc/grid-security/certificates/
http.gridmap  /etc/grid-security/grid-mapfile

Cheers,
Elvin

Hi Elvin,

Thanks again for the detailed reply. Both eos-grpc-1.41.0-2.el7.x86_64 and
eos-protobuf3-3.17.3-1.el7.cern.eos.x86_64 are installed on my host: the first one is explicitly included in the profile and the second is installed as a dependency. So, I am somewhat puzzled why ldd shows no missing libs in your case.

Removing the link breaks eos@mgm again.

In my case, eos-protobuf3-3.17.3-1.el7.cern.eos.x86_64 provides
/opt/eos/lib64/protobuf3/libprotobuf.so.28 and not the one you show in your ldd output

/opt/eos/lib64/protobuf3/libprotobuf.so.3

Maybe this one is coming from the earlier version of eos-protobuf3 package that I see in the diopside dep repo,

eos-protobuf3-3.17.3-1.el7.cern.eos.x86_64

So, I either have to version lock this particular eos-protobuf3
or create a link between (the real) /opt/eos/lib64/protobuf3/libprotobuf.so.28 and /opt/eos/lib64/protobuf3/libprotobuf.so.3

The last option doesn't look very promising. It removes the missing lib, but eos@mgm keeps crashing.

Best,

George

Sorry, I meant to say eos-protobuf3-3.5.1-5.el7.cern.eos.x86_64 as an earlier version that may need to be installed

By the way, if I remove all EOS packages and try to install only
eos-protobuf3-3.5.1-5.el7.cern.eos.x86_64

yum says that

Package eos-protobuf3 is obsoleted by eos-grpc, trying to install eos-grpc-1.56.1-3.el7.x86_64 instead
Resolving Dependencies
--> Running transaction check
---> Package eos-grpc.x86_64 0:1.56.1-3.el7 will be installed

When I try to install both eos-protobuf3-3.5.1-5.el7.cern.eos.x86_64 and eos-grpc-1.41.0-2.el7.x86_64, yum fails with "Error: Multilib version problems found".

I attempted the upgrade again from scratch, and version locking the earlier
eos-protobuf3-3.5.1-5.el7.cern.eos.x86_64 (already installed for EOS4) did work, but the dependency on libprotobuf.so.3 is still not satisfied.

Which package provides the following on your systems?

/opt/eos/lib64/protobuf3/libprotobuf.so.3

This comes from the following:

rpm -qf /opt/eos/lib64/protobuf3/libprotobuf.so.3
eos-protobuf3-3.17.3-1.el7.cern.eos.x86_64

Cheers,
Elvin

Thanks Elvin, and sorry for taxing your patience, but we must be looking at a different package… According to what I see, this eos-protobuf3 package does not provide any libprotobuf.so.3, only the following:

[root@antares-eos14 ~]# rpm -qil eos-protobuf3-3.17.3-1.el7.cern.eos.x86_64 | grep lib64
/opt/eos/lib64/protobuf3/libprotobuf.so.28
/opt/eos/lib64/protobuf3/libprotobuf.so.28.0.3

The only way to get the missing lib (and a running eos@mgm service) is via a link to the library provided by the eos-grpc package

[root@antares-eos14 ~]# ldd /usr/lib64/libXrdEosMgm-5.so | grep libproto
libprotobuf.so.3 => /opt/eos/lib64/protobuf3/libprotobuf.so.3 (0x00007fcd5051a000)
[root@antares-eos14 ~]#
[root@antares-eos14 ~]# ls -lrt /opt/eos/lib64/protobuf3/libprotobuf.so.3
lrwxrwxrwx 1 root root 41 Nov 24 13:29 /opt/eos/lib64/protobuf3/libprotobuf.so.3 -> /opt/eos/grpc/lib64/libprotobuf.so.3.17.3
[root@antares-eos14 ~]#

Hi George,

What is the date on the file provided by the rpm? Are you using the package from a snapshot or from our dependency repos? Are you using the following:
https://storage-ci.web.cern.ch/storage-ci/eos/diopside-depend/el-7/x86_64/eos-protobuf3-3.17.3-1.el7.cern.eos.x86_64.rpm

What I see in one of our instances:

$ rpm -qf /opt/eos/lib64/protobuf3/libprotobuf.so.3
eos-protobuf3-3.17.3-1.el7.cern.eos.x86_64

$ ls -lrt /opt/eos/lib64/protobuf3/libprotobuf.so.3
lrwxrwxrwx. 1 root root 21 Jun 21  2022 /opt/eos/lib64/protobuf3/libprotobuf.so.3 -> libprotobuf.so.3.17.3

$ ls -lrt /opt/eos/lib64/protobuf3/libprotobuf.so.3*
-rwxr-xr-x. 2 root root 2840112 Mar 30  2022 /opt/eos/lib64/protobuf3/libprotobuf.so.3.17.3
lrwxrwxrwx. 1 root root      21 Jun 21  2022 /opt/eos/lib64/protobuf3/libprotobuf.so.3 -> libprotobuf.so.3.17.3

$ rpm -ql eos-protobuf3 | grep libproto
/opt/eos/lib64/protobuf3/libprotobuf.so.3
/opt/eos/lib64/protobuf3/libprotobuf.so.3.17.3

Hope it helps!
Elvin

Hi,

I am, indeed, using https://storage-ci.web.cern.ch/storage-ci/eos/diopside-depend/el-7/x86_64/eos-protobuf3-3.17.3-1.el7.cern.eos.x86_64.rpm, but taken from our local mirror of the repo storage-ci.web.cern.ch/storage-ci/eos/diopside-depend/el-7/x86_64

The date of the rpm is the same: 2023-01-12 13:53 (in our mirror) and
2023-01-12 14:53 in yours with exactly the same size, 838K

The date of the provided library is different from yours, though:

[root@antares-eos14 ~]# rpm -qil eos-protobuf3-3.17.3-1.el7.cern.eos.x86_64 | grep lib64
/opt/eos/lib64/protobuf3/libprotobuf.so.28
/opt/eos/lib64/protobuf3/libprotobuf.so.28.0.3
[root@antares-eos14 ~]# ls -lrt /opt/eos/lib64/protobuf3/libprotobuf.so.28.0.3
-rwxr-xr-x 1 root root 2840112 Oct 25 2021 /opt/eos/lib64/protobuf3/libprotobuf.so.28.0.3
[root@antares-eos14 ~]#

The library provided by eos-grpc-1.41 is more recent

[root@antares-eos14 ~]# ls -lrt /opt/eos/grpc/lib64/libprotobuf.so.3.17.3
-rwxr-xr-x 1 root root 3674608 Jun 12 09:19 /opt/eos/grpc/lib64/libprotobuf.so.3.17.3
[root@antares-eos14 ~]#

which explains why, if I try to install the above eos-protobuf3 when there are no EOS packages at all, yum tries to install eos-grpc-1.56 instead, since eos-protobuf3 is obsoleted by it:

[root@antares-eos14 ~]# repoquery --obsoletes eos-grpc-1.56.1-3.el7.x86_64
eos-protobuf3 <= 3.17.3
eos-protobuf3-compiler <= 3.17.3
eos-protobuf3-debuginfo <= 3.17.3
eos-protobuf3-devel <= 3.17.3
eos-protobuf3-lite <= 3.17.3
eos-protobuf3-lite-devel <= 3.17.3
eos-protobuf3-lite-static <= 3.17.3
eos-protobuf3-static <= 3.17.3
[root@antares-eos14 ~]#

George
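
When two RPMs share a name, date, and size but list different files, comparing digests is one way to confirm whether the mirrored copy really matches the upstream one. This is a minimal sketch assuming sha256sum from coreutils is available; the function name same_digest and the file names in the usage comment are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: succeed only when both files have the same
# SHA-256 digest. Reading via stdin keeps file names out of the output.
same_digest() {
    [ "$(sha256sum < "$1")" = "$(sha256sum < "$2")" ]
}

# Example (both file names are placeholders for the two downloaded RPMs):
#   same_digest mirror.rpm upstream.rpm || echo "packages differ"
#   rpm -qpl mirror.rpm | grep libprotobuf   # file list of a package file
```

If the digests match but the installed file list still differs, the installed package probably came from somewhere other than that RPM file, for example from another repository that carries a build with the same version string.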

Just a sanity check. While upgrading the QuarkDB nodes one by one, does it matter that there are different QuarkDB versions in the cluster? E.g.

7) MEMBERSHIP-EPOCH 0
18) NODES antares-eos97:9999,antares-eos98:9999,antares-eos99:9999
19) OBSERVERS
20) QUORUM-SIZE 2
21) ----------
22) REPLICA antares-eos98:9999 | ONLINE | UP-TO-DATE | NEXT-INDEX 8745851 | VERSION 0.4.2
23) REPLICA antares-eos99:9999 | ONLINE | UP-TO-DATE | NEXT-INDEX 8745851 | VERSION 5.1.2.5.1.28

No, but you should get to the same version in all of them eventually.

Cheers,
Elvin
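
To keep track of that during a rolling upgrade, the distinct versions can be pulled out of output like the REPLICA lines quoted above. This is only a sketch: the function name raft_versions is made up, it simply looks for the token VERSION and prints the value after it, and the redis-cli invocation in the usage comment is an assumption about how the status was obtained.

```shell
#!/bin/sh
# Hypothetical helper: print the distinct VERSION values found in
# raft status output read from stdin.
raft_versions() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "VERSION") print $(i + 1) }' | sort -u
}

# Example (port taken from this thread; the command itself is an assumption):
#   redis-cli -p 9999 raft-info | raft_versions
```

More than one line of output means the nodes are still on mixed versions; a single line means the replicas reported in that output agree.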

Many thanks for the confirmation.

Best,
George