Hi, we are upgrading from EOS 4 to EOS 5 and do not know how to set up MGM HA. We found a possible solution based on LVS, but it does not seem graceful and we could not get it to work. Is there any documentation explaining how to set this up? We also tested an MGM active/passive setup on EOS 4, but as far as we know EOS 5 is different.
BTW, do clients need to know all MGM addresses when accessing the EOS cluster, or just the frontend address for the MGMs?
My reply to the email from Haibo also covers this topic. For the moment, MGM HA is not supported. It will be supported once we drop the MQ daemon and refactor some other parts of the code.
May I ask what the migration from EOS 5 to the future version supporting HA will look like?
We just upgraded our EOS instance to version 5 and are wondering how much effort it will take to migrate to the version supporting HA.
Having HA requires that we drop the current MQ daemon. This is planned for the 5.2.0 release, which will probably arrive in a couple of months. It will not solve all the issues, since HA also requires quick DNS updates, but it will provide a reasonable level of HA. Nevertheless, the current setup with just one MGM that is able to restart in less than 2 minutes should not be problematic from an operational point of view. This is how we are currently running it at CERN.
How do you recover when the EOS host goes down? Do you have a backup server?
In this case, we would start an MGM on one of the other QDB nodes in the cluster, since our setup has 3 identical nodes: each has QDB running and the MGM rpms installed (but no MGM daemon running on the slaves). This also involves updating the DNS to point to the new hostname. This is a manual operation, but we have actually never had to do it on our instances so far.
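After such a manual failover, it can help to confirm that the DNS change has propagated and that the new MGM is accepting connections. The following is only a rough sketch of such a check, not something taken from the EOS tooling: the function names are made up for illustration, and port 1094 is assumed to be the MGM's XRootD port (adjust if your instance differs).

```python
import socket


def resolve_addresses(name):
    """Return the set of IP addresses that `name` currently resolves to."""
    infos = socket.getaddrinfo(name, None)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return {info[4][0] for info in infos}


def alias_points_to(alias, expected_host, resolver=resolve_addresses):
    """True if the alias and the expected MGM host share at least one address.

    `resolver` is injectable so the logic can be exercised without real DNS.
    """
    return bool(resolver(alias) & resolver(expected_host))


def port_open(host, port=1094, timeout=3.0):
    """True if a TCP connection to host:port succeeds within `timeout` seconds.

    1094 is an assumption (the usual XRootD port), not a value from this thread.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, with a hypothetical alias `eos-mgm.example.org` and a new MGM node `qdb2.example.org`, one would call `alias_points_to("eos-mgm.example.org", "qdb2.example.org")` and then `port_open("eos-mgm.example.org")`. Keep in mind that clients may still see the old address until cached DNS records expire.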