Currently, we are running two versions side by side while we add new FSTs to our EOS cluster.
The major daemons, such as the MGM, are on version 5.1, and the new FST is on version 5.2.
We have noticed that when we drain a specific FST, the drained data does move to the new FST. However, outside of draining we are not seeing any data move to the new FST at all, even though balancing is enabled on all of our groups. In other words, we are draining but not balancing.
I’m wondering if I need to upgrade all of my FSTs to resolve this or if there is more configuration that needs to be done.
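For reference, we have been checking the balancer state with something like the commands below (this is only a sketch; it assumes the space is simply called default, and the exact variables shown may differ between versions):

eos space status default   # prints the space variables, including the balancer settings
eos group ls               # shows the per-group fill state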
The errors vary from time to time, but the ones below seem to be the most relevant to report.
After adding the setting below to the existing FSTs on that system, restarting them all so they switch to the attr file metadata handler, and then upgrading the MGM servers to version 5.2, data is now balancing onto the new disks just fine.
fstofs.filemd_handler attr
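For completeness, this is where the setting went and how the FSTs were restarted (assuming the stock config file /etc/xrd.cf.fst and the standard systemd unit; adjust for your packaging):

# appended to /etc/xrd.cf.fst on each existing FST
fstofs.filemd_handler attr

# then restart the FST service
systemctl restart eos@fst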
However, I am running into one issue and am reaching out to you for help.
I am constantly getting messages, one after another on each FST, saying that the MGM master has changed.
250109 09:19:29 time=1736414369.050678 func=ProcessCapOpaque level=WARN logid=d3dd0a06-ce6a-11ef-a347-b8599fa512f0 unit=fst@jbod-mgmt-08.sdfarm.kr:1095 tid=00007fa1345bb640 source=XrdFstOfsFile:2840 tident=3.14:72@jbod-mgmt-11 sec=(null) uid=99 gid=99 name=(null) geo="" msg="MGM master seems to have changed - adjusting global config" old-manager="jbod-mgmt-04.sdfarm.kr:1094" new-manager="jbod-mgmt-01.sdfarm.kr:1094"
250109 09:19:29 30661 XrootdXeq: 6.22:273@jbod-mgmt-06 disc 0:09:49
250109 09:19:29 30767 XrootdXeq: 7.20:301@jbod-mgmt-01 pub IP46 login as daemon
250109 09:19:29 time=1736414369.726071 func=ProcessCapOpaque level=WARN logid=d4496282-ce6a-11ef-a751-b8599f9c4330 unit=fst@jbod-mgmt-08.sdfarm.kr:1095 tid=00007fa2295fb640 source=XrdFstOfsFile:2840 tident=8.14:178@jbod-mgmt-11 sec=(null) uid=99 gid=99 name=(null) geo="" msg="MGM master seems to have changed - adjusting global config" old-manager="jbod-mgmt-01.sdfarm.kr:1094" new-manager="jbod-mgmt-04.sdfarm.kr:1094"
250109 09:19:29 time=1736414369.905068 func=ProcessCapOpaque level=WARN logid=d4529ffa-ce6a-11ef-b2d0-b8599fa512f0 unit=fst@jbod-mgmt-08.sdfarm.kr:1095 tid=00007fa12993c640 source=XrdFstOfsFile:2840 tident=6.14:69@jbod-mgmt-11 sec=(null) uid=99 gid=99 name=(null) geo="" msg="MGM master seems to have changed - adjusting global config" old-manager="jbod-mgmt-04.sdfarm.kr:1094" new-manager="jbod-mgmt-01.sdfarm.kr:1094"
On the MGM server, when I use the ns command, I see that the master is not changing.
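Concretely, I am checking it like this on the MGM (the grep is only to narrow the output; the exact field names may differ slightly between versions):

eos ns | grep -i master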
I was wondering whether you have encountered this issue before, and what I should do to track down the cause.
Normally, these kinds of messages should go away; they only appear around an MGM master-slave transition. Which MGM is the current master in your instance (which hostname)? Do you have any messages like "FST node configuration change" in the logs?
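You can look for them with something like the following on one of the FSTs (assuming the FST log is at /var/log/eos/fst/xrdlog.fst; adjust the path to wherever your FST writes its xrdlog):

grep "FST node configuration change" /var/log/eos/fst/xrdlog.fst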