Dear Experts,
I have encountered the following errors while trying to compact files.$HOSTNAME.mdlog in /var/eos/md/.
[root@alice-t1-eos-mgm01 ~]# tail -f /var/log/eos/mgm/xrdlog.mgm | grep -i compact
200320 11:30:13 time=1584671413.208608 func=Compacting level=ALERT logid=fec9a8de-6a4f-11ea-83a3-001a4a615c30 unit=mgm@alice-t1-eos-mgm01.sdfarm.kr:1094 tid=00007f347dbff700 source=Master:663 tident=<service> sec= uid=0 gid=0 name= geo="" msg="online-compacting running"
200320 11:30:13 time=1584671413.208865 func=Compacting level=NOTE logid=fec9a8de-6a4f-11ea-83a3-001a4a615c30 unit=mgm@alice-t1-eos-mgm01.sdfarm.kr:1094 tid=00007f347dbff700 source=Master:665 tident=<service> sec= uid=0 gid=0 name= geo="" msg="starting online compaction"
200320 11:31:06 time=1584671466.280476 func=Compacting level=ALERT logid=fec9a8de-6a4f-11ea-83a3-001a4a615c30 unit=mgm@alice-t1-eos-mgm01.sdfarm.kr:1094 tid=00007f347dbff700 source=Master:838 tident=<service> sec= uid=0 gid=0 name= geo="" msg="compact done"
200320 13:17:06 time=1584677826.296367 func=Compacting level=ALERT logid=fec9a8de-6a4f-11ea-83a3-001a4a615c30 unit=mgm@alice-t1-eos-mgm01.sdfarm.kr:1094 tid=00007f347dbff700 source=Master:663 tident=<service> sec= uid=0 gid=0 name= geo="" msg="online-compacting running"
200320 13:17:06 time=1584677826.296560 func=Compacting level=NOTE logid=fec9a8de-6a4f-11ea-83a3-001a4a615c30 unit=mgm@alice-t1-eos-mgm01.sdfarm.kr:1094 tid=00007f347dbff700 source=Master:665 tident=<service> sec= uid=0 gid=0 name= geo="" msg="starting online compaction"
200320 13:36:09 time=1584678969.998252 func=Compacting level=CRIT logid=fec9a8de-6a4f-11ea-83a3-001a4a615c30 unit=mgm@alice-t1-eos-mgm01.sdfarm.kr:1094 tid=00007f347dbff700 source=Master:832 tident=<service> sec= uid=0 gid=0 name= geo="" online-compacting returned ec=5 error: Changelog file has corruption - autorepair is disabled
200320 13:36:11 time=1584678971.008025 func=Compacting level=CRIT logid=fec9a8de-6a4f-11ea-83a3-001a4a615c30 unit=mgm@alice-t1-eos-mgm01.sdfarm.kr:1094 tid=00007f347dbff700 source=Master:872 tident=<service> sec= uid=0 gid=0 name= geo="" failed online compactification
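To make the excerpt above easier to read, here is a minimal Python sketch that pulls the level, timestamp and message out of each line. It is not part of EOS: the function names `parse_mgm_line` and `summarize` are hypothetical, and the parsing assumes only the `key=value` layout visible in the lines above, including that the CRIT lines carry free text after `geo=""` instead of a `msg="..."` field.

```python
import re

# key=value pairs; the value is either a quoted string or a bare token
# (a bare token may be empty, as in "sec= ").
PAIR = re.compile(r'(\w+)=("([^"]*)"|\S*)')


def parse_mgm_line(line):
    """Return the key=value fields of one MGM log line as a dict."""
    fields = {}
    for key, raw, quoted in PAIR.findall(line):
        fields[key] = quoted if raw.startswith('"') else raw
    return fields


def summarize(line):
    """Return (level, epoch_time, message) for one MGM log line."""
    fields = parse_mgm_line(line)
    msg = fields.get("msg")
    if msg is None:
        # Assumption: lines without msg="..." put the text after geo="".
        msg = line.split('geo=""', 1)[-1].strip()
    return fields.get("level"), float(fields["time"]), msg


if __name__ == "__main__":
    sample = (
        '200320 13:36:09 time=1584678969.998252 func=Compacting level=CRIT '
        'logid=fec9a8de-6a4f-11ea-83a3-001a4a615c30 '
        'unit=mgm@alice-t1-eos-mgm01.sdfarm.kr:1094 tid=00007f347dbff700 '
        'source=Master:832 tident=<service> sec= uid=0 gid=0 name= geo="" '
        'online-compacting returned ec=5 error: Changelog file has corruption '
        '- autorepair is disabled'
    )
    print(summarize(sample))
```

Feeding every `func=Compacting` line through `summarize` gives a compact timeline of when each compaction run started and how it ended.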
Before attempting the compaction, I stopped all EOS MGM services (mgm, mq and sync) and then ran a repair on the mdlogs of both files and directories. The repair reported no problems, as you can see below.
[root@alice-t1-eos-mgm01 md]# eos-log-repair files.alice-t1-eos-mgm01.sdfarm.kr.mdlog.tmp files.alice-t1-eos-mgm01.sdfarm.kr.mdlog
Header status: OK (version: 0x1, content: 0x1)
Elapsed time: 83 m. 58 s. Progress: 42.489 GB / 42.489 GB
Scanned: 336537874
Healthy: 336537874
Bytes total: 45623333224
Bytes accepted: 45623333224
Bytes discarded: 0
Not fixed: 0
Fixed (wrong magic): 0
Fixed (wrong checksum): 0
Fixed (wrong size): 0
Elapsed time: 83 m. 58 s.
[root@alice-t1-eos-mgm01 md]# eos-log-repair directories.alice-t1-eos-mgm01.sdfarm.kr.mdlog.tmp directories.alice-t1-eos-mgm01.sdfarm.kr.mdlog
Header status: OK (version: 0x1, content: 0x2)
Elapsed time: 73 m. 47 s. Progress: 42.505 GB / 42.505 GB
Scanned: 271753438
Healthy: 271753438
Bytes total: 45641029140
Bytes accepted: 45641029140
Bytes discarded: 0
Not fixed: 0
Fixed (wrong magic): 0
Fixed (wrong checksum): 0
Fixed (wrong size): 0
Elapsed time: 73 m. 47 s.
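As a cross-check of the two summaries above, the counters can be compared programmatically. The following Python sketch is not an EOS tool: the names `parse_repair_summary` and `is_clean` are hypothetical, and it assumes only the `Name: integer` layout of the output shown above.

```python
import re


def parse_repair_summary(text):
    """Collect the 'Name: integer' counters from an eos-log-repair summary."""
    stats = {}
    for line in text.splitlines():
        # Lines such as "Header status: OK ..." or "Elapsed time: ..." do not
        # end in a plain integer and are skipped.
        m = re.fullmatch(r"\s*([^:]+):\s*(\d+)\s*", line)
        if m:
            stats[m.group(1).strip()] = int(m.group(2))
    return stats


def is_clean(stats):
    """True if every record was healthy and nothing was discarded or fixed."""
    return (
        stats["Scanned"] == stats["Healthy"]
        and stats["Bytes total"] == stats["Bytes accepted"]
        and stats["Bytes discarded"] == 0
        and stats["Not fixed"] == 0
        and all(v == 0 for k, v in stats.items() if k.startswith("Fixed"))
    )


if __name__ == "__main__":
    files_summary = """\
Header status: OK (version: 0x1, content: 0x1)
Elapsed time: 83 m. 58 s. Progress: 42.489 GB / 42.489 GB
Scanned: 336537874
Healthy: 336537874
Bytes total: 45623333224
Bytes accepted: 45623333224
Bytes discarded: 0
Not fixed: 0
Fixed (wrong magic): 0
Fixed (wrong checksum): 0
Fixed (wrong size): 0
Elapsed time: 83 m. 58 s.
"""
    print(is_clean(parse_repair_summary(files_summary)))
```

By these criteria both the files and the directories summary count as clean, which is why the corruption error reported during compaction is surprising.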
As the log above shows, the compaction of the directories mdlog completed, but the compaction of the files mdlog failed, even though the repair found nothing suspicious.
The MGM boots without problems and the instance is working fine. Do you have any idea what could be causing this?
I plan to convert this EOS instance, which is (still!) using the in-memory namespace, to one backed by QuarkDB. I would just like to make sure that everything is OK before the conversion.
Thank you in advance.
Best regards,
Sang-Un