Hi,
I am trying to set up an EOS failover cluster with a QuarkDB backend.
I have 3 QuarkDB machines (b00, b01, b02) and 2 EOS MGM nodes (m01, m02).
The master node has the following lines in /etc/xrd.cf.mgm:
EOS_MGM_HOST=m01.test.ru
EOS_MGM_HOST_TARGET=m02.test.ru
EOS_INSTANCE_NAME=eostest
EOS_MGM_MASTER1=m01.test.ru
EOS_MGM_MASTER2=m02.test.ru
#EOS_MGM_ALIAS=eos.test.ru
#EOS_PSS_MGM=$EOS_MGM_ALIAS:1094
EOS_BROKER_URL=root://eos.test.ru:1097//eos/
The slave node has the following lines in /etc/xrd.cf.mgm:
EOS_MGM_HOST=m02.test.ru
EOS_MGM_HOST_TARGET=m01.test.ru
EOS_INSTANCE_NAME=eostest
EOS_MGM_MASTER1=m01.test.ru
EOS_MGM_MASTER2=m02.test.ru
#EOS_MGM_ALIAS=eos.test.ru
#EOS_PSS_MGM=$EOS_MGM_ALIAS:1094
EOS_BROKER_URL=root://eos.test.ru:1097//eos/
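To confirm that the two files differ only in the host-specific settings, here is a minimal Python sketch that compares the two sets of lines above. The helper names parse_env and differing_keys are illustrative and not part of EOS; the values are copied from the two listings.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")  # split on the first '=' only
        if sep:
            env[key] = value
    return env


def differing_keys(a, b):
    """Return the sorted keys whose values differ (or exist on one side only)."""
    return sorted(k for k in set(a) | set(b) if a.get(k) != b.get(k))


# Settings as listed for m01 (master) and m02 (slave).
M01 = """
EOS_MGM_HOST=m01.test.ru
EOS_MGM_HOST_TARGET=m02.test.ru
EOS_INSTANCE_NAME=eostest
EOS_MGM_MASTER1=m01.test.ru
EOS_MGM_MASTER2=m02.test.ru
#EOS_MGM_ALIAS=eos.test.ru
EOS_BROKER_URL=root://eos.test.ru:1097//eos/
"""
M02 = """
EOS_MGM_HOST=m02.test.ru
EOS_MGM_HOST_TARGET=m01.test.ru
EOS_INSTANCE_NAME=eostest
EOS_MGM_MASTER1=m01.test.ru
EOS_MGM_MASTER2=m02.test.ru
#EOS_MGM_ALIAS=eos.test.ru
EOS_BROKER_URL=root://eos.test.ru:1097//eos/
"""

print(differing_keys(parse_env(M01), parse_env(M02)))
# prints ['EOS_MGM_HOST', 'EOS_MGM_HOST_TARGET']
```

So only EOS_MGM_HOST and EOS_MGM_HOST_TARGET differ between the nodes; the broker URL is identical on both and points at the alias.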
I start EOS in the following order:
m01:# systemctl start eos@master
m01:# systemctl start eos@sync
m01:# systemctl start eos@mq
m01:# systemctl start eos@mgm
m02:# systemctl start eos@master
m02:# systemctl start eos@sync
m02:# systemctl start eos@mq
m02:# systemctl start eos@mgm
After that I receive these errors:
---- high rate error messages suppressed ----
181024 16:32:28 time=1540387948.217943 func=Supervisor
level=CRIT logid=27b61c8a-d791-11e8-b374-000af7e02290
unit=mgm@eos.test.ru:1094 tid=00007fc507dfc700
source=Master:412 tident= sec= uid=0
gid=0 name= geo="" msg="dual RW master setup detected"
---- high rate error messages suppressed ----
181024 16:32:34 time=1540387954.233800 func=Supervisor
level=CRIT logid=27b61c8a-d791-11e8-b374-000af7e02290
unit=mgm@eos.test.ru:1094 tid=00007fc507dfc700
source=Master:412 tident= sec= uid=0
gid=0 name= geo="" msg="dual RW master setup detected"
I fixed this problem by changing the broker URL line to:
EOS_BROKER_URL=root://m01.test.ru:1097//eos/
EOS_BROKER_URL=root://m02.test.ru:1097//eos/
But what is the broker_url option? I can't find any information about it in the documentation. If I have the alias eos.test.ru, what should the broker_url line contain?
Then I try to switch the master:
m01:~ # eos -b ns master m02.test.ru
configdir=/var/eos/config/m02.test.ru/ activating master=m02.test.ru
success: <m02.test.ru> is now the master
m02:~ # eos -b ns master m02.test.ru
In the MGM log on m02:
[QCLIENT - INFO - processRedirection:377] redirecting to b02.test.ru:7777
[QCLIENT - INFO - processRedirection:377] redirecting to b02.test.ru:7777
[QCLIENT - INFO - processRedirection:377] redirecting to b02.test.ru:7777
[QCLIENT - INFO - processRedirection:377] redirecting to b02.test.ru:7777
181029 10:57:47 time=1540799867.150357 func=Slave2Master level=CRIT logid=e4eff2ce-db4e-11e8-b22c-000af7e0a0ea unit=mgm@m02.test.ru:1094 tid=00007fd3d07ff700 source=Master:1335 tident= sec= $
terminate called after throwing an instance of 'std::logic_error'
what(): basic_string::_S_construct null not valid
error: received signal 6:
/lib64/libXrdEosMgm.so(_Z20xrdmgmofs_stacktracei+0x44)[0x7fd3ce29d874]
/lib64/libc.so.6(+0x36280)[0x7fd3d3a3c280]
/lib64/libc.so.6(gsignal+0x37)[0x7fd3d3a3c207]
/lib64/libc.so.6(abort+0x148)[0x7fd3d3a3d8f8]
/lib64/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x165)[0x7fd3d434b7d5]
/lib64/libstdc++.so.6(+0x5e746)[0x7fd3d4349746]
/lib64/libstdc++.so.6(+0x5e773)[0x7fd3d4349773]
/lib64/libstdc++.so.6(+0x5e993)[0x7fd3d4349993]
/lib64/libstdc++.so.6(_ZSt19__throw_logic_errorPKc+0x77)[0x7fd3d439e597]
/lib64/libstdc++.so.6(_ZNSs12_S_constructIPKcEEPcT_S3_RKSaIcESt20forward_iterator_tag+0xa1)[0x7fd3d43aa3c1]
#########################################################################
stack trace exec=xrootd pid=14234 what='thread apply all bt'
#########################################################################
It is stuck. The EOS MGM is running, but this action can't be performed.
Status per MGM node:
m01
ALL Replication mode=slave-ro state=slave-ro master=m02.test.ru configdir=/var/eos/config/m02.test.ru/ config=default active=true mgm:m02.test.ru=ok mgm:mode=slave-ro mq:m02.test.ru:1097=ok
m02
ALL Replication mode=slave-ro state=slave-ro master=m01.test.ru configdir=/var/eos/config/m01.test.ru/ config=default active=true mgm:m01.test.ru=ok mgm:mode=slave-ro mq:m01.test.ru:1097=o
Before this action, m01 was the master.
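To make the inconsistency in the two status lines explicit, here is a minimal Python sketch that parses them. The function names are illustrative and not part of EOS, and only the leading fields of each line are copied in (the trailing fields are left out).

```python
def parse_status(line):
    """Split a status line into a dict of its key=value fields."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # skip bare tokens such as "ALL" and "Replication"
            fields[key] = value
    return fields


def self_declared_masters(status):
    """Return the nodes whose own status line names themselves as master."""
    return [node for node, f in status.items() if f.get("master") == node]


# Leading fields of the status lines reported on each node.
status = {
    "m01.test.ru": parse_status(
        "ALL Replication mode=slave-ro state=slave-ro master=m02.test.ru"
    ),
    "m02.test.ru": parse_status(
        "ALL Replication mode=slave-ro state=slave-ro master=m01.test.ru"
    ),
}

for node, f in status.items():
    print(node, "mode=" + f["mode"], "master=" + f["master"])
print("nodes that consider themselves master:", self_declared_masters(status))
```

Both nodes report mode=slave-ro and each names the other as master, so neither node currently considers itself the master.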
How can I switch the master role over to the slave? What is wrong?