We are trying to completely disable FST disk scanning to prevent changes to files' ctime and thus allow the tape-aware GC to detect files for eviction from the retrieve space.
We have made the following space-wide changes:
[root@cta-adm ~]# eos space config retrieve space.scaninterval=0
success: setting scaninterval=0
[root@cta-adm ~]# eos space config retrieve space.scanrate=0
success: setting scanrate=0
but it looked like ScanDisk was still running despite the above changes.
Today, we set the filesystem attribute scaninterval to 0, i.e.
eos fs config XX scaninterval=0
on selected file systems.
Is this enough to stop the ScanDisk thread from running, or do we need to change more FS settings?
After restarting eos@fst, I see log lines like the following
The following command will modify the scaninterval for any new file systems added to that space:
eos space config retrieve space.scaninterval=0
You need to use the following command to modify the value for the existing file systems:
eos space config retrieve fs.scaninterval=0
The command that you used does the same thing, but it is probably a bit more tedious to run it for each file system. The one above does everything in one go.
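To make the equivalence concrete, here is a rough sketch of the per-filesystem approach written as a loop. The filesystem IDs 1, 2 and 3 and the DRY_RUN switch are made up for illustration; by default the function only prints the commands it would run.

```shell
# Sketch: per-filesystem equivalent of the space-wide fs.scaninterval=0 setting.
# The filesystem IDs passed in are hypothetical placeholders.
# DRY_RUN defaults to 1, so the commands are printed rather than executed.
disable_scan_per_fs() {
  for fsid in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "eos fs config $fsid scaninterval=0"
    else
      eos fs config "$fsid" scaninterval=0
    fi
  done
}

disable_scan_per_fs 1 2 3
```

Running it with DRY_RUN=0 on a node with the eos client would issue one command per file system, which is exactly the tedium that the single space-level command avoids.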
This is the only parameter that you need to change to disable the scanning. The output in the logs is expected, since the thread still runs but does not pick up any files for scanning. You can verify this in the summary for the current run: search the FST logs for a line containing [ScanDir] Directory: and it should report 0 files scanned.
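A minimal sketch of that log check follows. The log path /var/log/eos/fst/xrdlog.fst is an assumption about a default installation, so adjust FST_LOG to wherever your FST actually writes its log; the helper name scandir_summaries is just for illustration.

```shell
# Sketch: list the ScanDir summary lines from an FST log.
# FST_LOG is an assumed default location; override it for your setup.
FST_LOG="${FST_LOG:-/var/log/eos/fst/xrdlog.fst}"

scandir_summaries() {
  # -F treats the pattern as a fixed string, so the brackets are literal
  grep -F '[ScanDir] Directory:' "$1"
}

# Show the most recent summaries if the log is readable on this host
if [ -r "$FST_LOG" ]; then
  scandir_summaries "$FST_LOG" | tail -n 5
fi
```

Each matching line is the summary of one scan run for one directory, so the number of scanned files can be read off directly.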
Just to follow this up: although setting fs.scaninterval to 0 results in all files being skipped according to the log message, we were still seeing updated ctimes for all files in an FS, and these correlate with the last scan time.
Setting {space,fs}.scanrate to 0 also didn't help. I also restarted the FSTs just in case they needed a kick to pick up the settings, but no joy.
In the end, I set a week-long fs.scan_disk_interval, which leaves enough time between scans for the CTA FST garbage collection to clean up old files.
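For reference, a sketch of what that workaround could look like. It assumes the interval is given in seconds and that it is applied through the same space-level fs.* mechanism as fs.scaninterval above; neither detail is confirmed in this thread, so the command is only printed here.

```shell
# Sketch: one week expressed in seconds (the unit is an assumption).
week_seconds=$((7 * 24 * 3600))

# Print the command rather than run it; the space name "retrieve" is from this thread.
echo "eos space config retrieve fs.scan_disk_interval=${week_seconds}"
```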
I feel like I'm probably missing something here. There are quite a number of filesystem 'scan' related settings, and I could believe we've done something silly. I also couldn't see which attribute was actually being updated to trigger the ctime update.
A more appropriate fix might be for the CTA FST garbage collection to look at the mtime instead of the ctime, but that’s a discussion for elsewhere.
Sorry for the late reply on this, I was on holidays.
The fact that the ctime is updated is just an artifact of the last scan time extended attribute being modified when the scanner decides whether or not to scan the file. In reality nothing is scanned; only some metadata attached to the file is updated, which in turn modifies the ctime.
The scanrate has no influence on the set of files to be scanned, so there is no need to bother modifying this parameter.
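The general effect is easy to reproduce on a scratch file: any metadata-only change bumps the ctime but leaves the mtime alone. The sketch below assumes GNU coreutils stat on Linux and uses chmod as a stand-in for the extended attribute write, since chmod works on every filesystem; the attribute name in the commented setfattr line is hypothetical and is not the one EOS uses.

```shell
# Sketch: a metadata-only change updates ctime but not mtime.
# Assumes GNU coreutils stat (Linux). chmod stands in for the xattr write.
f=$(mktemp)
mtime_before=$(stat -c %y "$f")
ctime_before=$(stat -c %z "$f")

sleep 1
chmod 640 "$f"   # changes inode metadata only, file content is untouched
# Closer analogue on a filesystem with user xattr support (hypothetical name):
#   setfattr -n user.demo.lastscan -v "$(date +%s)" "$f"

mtime_after=$(stat -c %y "$f")
ctime_after=$(stat -c %z "$f")

echo "mtime: $mtime_before -> $mtime_after"
echo "ctime: $ctime_before -> $ctime_after"
rm -f "$f"
```

The mtime line shows the same timestamp twice while the ctime line shows two different ones, which is why a garbage collector keyed on ctime sees every touched file as recently changed.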
Exactly, a better way for the CTA FST GC would be to look at the mtime rather than the ctime. I will discuss this with the CTA team.