Files missing from the FST - wondering if the FST isn't deleting them and a restart restores the record

An interesting one: we have one server that for some reason contains files that aren’t replicated elsewhere on the system, even though the policy says they should be.

The suspicion is that the files were deleted, and for some reason this particular FST never recorded the delete locally. The files are missing from the FSes themselves.

Under Beryl-Aquamarine, we could delete the FMD sql files on the FST and let the FST scan the disks and perform a slow boot. On Citrine, can we do a similar thing: delete the LevelDB directory and let the FST perform a slow boot to repair whatever state is present on the FST?

Anyone?

If the files were deleted, they should be reported as orphaned files in FSCK.
You can then use ‘fsck repair’ to drop all these files and get rid of them.

If these are files which are in the namespace, you can also adjust the replication using ‘fsck repair’.

Regarding the local FST DB, you can also wipe out the LevelDB and then send a boot with the sync flag to these filesystems to repopulate the LevelDB.

Cheers, Andreas.

Thanks Andreas, we’ll be running fsck repair to remove orphans, unlinked & unregistered files.

Rebooting the FSes with --sync-mgm seems to repair e(mgmsize) and e(mgm-cx).
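
As a rough note of where that leaves us, here is a minimal Python sketch. The names (CLEARED_BY_SYNC_BOOT, STILL_OPEN, triage) are made up for illustration and nothing here calls EOS; it only records which error tags the sync boot appeared to clear on our side and which ones we still see.

```python
# Hypothetical triage table built only from what is reported in this thread.
# It does not talk to EOS; it just records our observations per error tag.
CLEARED_BY_SYNC_BOOT = {"e(mgmsize)", "e(mgm-cx)"}
STILL_OPEN = {"e(disksize)", "e(disk-cx)"}


def triage(tag: str) -> str:
    """Return the status of an fsck error tag as observed in this thread."""
    if tag in CLEARED_BY_SYNC_BOOT:
        return "cleared after boot with sync flag"
    if tag in STILL_OPEN:
        return "still reported, repair option to be confirmed"
    return "not discussed"


if __name__ == "__main__":
    for tag in sorted(CLEARED_BY_SYNC_BOOT | STILL_OPEN):
        print(f"{tag}: {triage(tag)}")
```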

We also have quite a few e(disksize) and e(disk-cx) errors in the fs ls --fsck output. Can I just double-check which fsck repair option to use to fix these? Would it be fsck repair --checksum-commit?

Thanks!