Understanding consistency checks : autorepair, fsck, syncmgm

barbet · July 31, 2019, 7:38am

Hi Elvin,

Thank you very much for the precise answers.

As for migrating our production EOS to using QuarkDB, the issues are:

a) According to "Config in quarkDB for master/slave(s)", I would need EOS version 4.4.47 for manager failover to work correctly with QuarkDB. Is that right?

b) Having updated to 4.4.47 and migrated to QDB, I would have to disable fsck (which I believe is not currently enabled in our production storage). With a compacted namespace, we have no problems booting our managers, so it may be better to wait. This also depends on your release plans (4.4.47 in the stable repo and a fixed fsck available).

About damaged filesystems:

If I am not mistaken, the way it works at CERN is that if smartctl detects a disk failure, the fs is automatically drained. Is this right?

I have a fs that I drained manually and that is currently “empty,drained”. I moved it to a “spare” group. Is there an automatic procedure that would use this available fs in a group that has a failed one, or is it a manual procedure? If it is manual, is it enough to move it into the group that needs it and let the balancer do its job?

Thanks

JM
