QuarkDB namespace and attr on large directories

So the guy here who normally manages attributes for group directories said there is a noticeable difference in the time it takes to run the attr command on large dirs.

The one today was about 76 TB with 867,000 files. He said he was always surprised how fast such a command ran on the in-memory namespace. He said it now takes minutes.

Example (I’ve edited the actual path):
/usr/bin/eos -b -j 'attr -r set sys.acl=g:5063:rx!d!u,u:46759:rwx+d+u,u:43841:rwx+d+u,u:43309:rwx+d+u,u:48952:rwx+d+u,u:4183:rwx+d+u,u:53908:rwx+d+u,u:49398:rwx+d+u,u:45849:rwx+d+u,u:1291:rwx+d+u,u:45540:rwx+d+u,u:45674:rwx+d+u,u:55629:rwx+d+u,u:16822:rwx+d+u,u:9203:rwx+d+u,u:47005:rwx+d+u,u:13175:rwx+d+u,u:56710:rwx+d+u,u:51144:rwx+d+u,u:52536:rwx+d+u /some/big/directory'

Is this expected? We’re on SSD.

Hi Dan,

Is this still an issue? How fast was the command with the in-memory namespace, and how many files / directories are there in the target directory?

Cheers,
Georgios

OK, I don’t have anything you would call ‘solid’ as far as numbers go. Here is a recently run command that I was told initially took about 3 minutes:

/usr/bin/eos -b -j 'attr -r set sys.acl=g:5063:rx!d!u,u:54016:rwx+d+u,u:55630:rwx+d+u,u:48720:rwx+d+u,u:51868:rwx+d+u,u:43841:rwx+d+u,u:52330:rwx+d+u,u:10385:rwx+d+u,u:43729:rwx+d+u,u:56430:rwx+d+u,u:43309:rwx+d+u,u:2249:rwx+d+u,u:55405:rwx+d+u,u:51136:rwx+d+u,u:51615:rwx+d+u,u:56283:rwx+d+u,u:52081:rwx+d+u,u:51408:rwx+d+u,u:44258:rwx+d+u,u:52937:rwx+d+u,u:3278:rwx+d+u,u:51961:rwx+d+u,u:47113:rwx+d+u,u:48257:rwx+d+u,u:7586:rwx+d+u,u:50002:rwx+d+u,u:50062:rwx+d+u,u:51893:rwx+d+u,u:5524:rwx+d+u,u:11912:rwx+d+u,u:44750:rwx+d+u,u:15867:rwx+d+u,u:53986:rwx+d+u,u:48968:rwx+d+u,u:55957:rwx+d+u,u:45481:rwx+d+u,u:44456:rwx+d+u,u:48504:rwx+d+u,u:13281:rwx+d+u,u:44650:rwx+d+u,u:1291:rwx+d+u,u:47155:rwx+d+u,u:10222:rwx+d+u,u:6398:rwx+d+u,u:48622:rwx+d+u,u:1320:rwx+d+u,u:51338:rwx+d+u,u:6345:rwx+d+u,u:44851:rwx+d+u,u:50647:rwx+d+u,u:55854:rwx+d+u,u:2908:rwx+d+u,u:7027:rwx+d+u,u:45508:rwx+d+u,u:44788:rwx+d+u,u:42623:rwx+d+u,u:13175:rwx+d+u,u:47534:rwx+d+u,u:12665:rwx+d+u,u:51144:rwx+d+u,u:7333:rwx+d+u,u:42717:rwx+d+u,u:45130:rwx+d+u,u:53863:rwx+d+u,u:52536:rwx+d+u,u:43903:rwx+d+u,u:42228:rwx+d+u,u:51474:rwx+d+u,u:43648:rwx+d+u /eos/uscms/store/user/lpcsusyhad'

I was also told the command would return pretty much right away with the in-memory namespace.

I should also add that when the command was re-run with ‘time’ in front of it, the response was almost instantaneous, presumably because by then all the metadata was cached.

Also for your consideration, this is one of the largest hierarchies we have. It’s 1.57M files and 5744 dirs using about 243 TB of space. When I ran ‘find -f --count’ it took about 50 sec to return. When I ran ‘find --count’ again, it was quicker, just as with the attr command.

To be fair, no one is really complaining, just remarking about the difference in speed and whether that is as it should be.

Thanks,
Dan

Hi Dan,

I see, this makes sense. If metadata is not cached, the namespace has to fetch all 1.57M files and 5744 directories, but in the end it only modifies the directories, which are very few.

The command spends the vast majority of its time fetching file metadata into the cache, so it takes a long time when the metadata is not cached but finishes almost instantly when it is.

I’m modifying the code so that it fetches only the container metadata, and skips the files when it doesn’t need them. :) Thanks for the bug report! The improvement should be available in the next EOS release, which I think is 4.8.5.
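
To illustrate the difference, here is a toy Python model of a recursive attribute set against a cold metadata cache. It is only a sketch and not the EOS code: `Dir`, `ColdBackend`, `set_attr_recursive`, `build_tree` and the `fetch_files` switch are all made-up names, and the tree is tiny compared to the real hierarchy.

```python
from dataclasses import dataclass, field


@dataclass
class Dir:
    """Hypothetical container: a name, its file names, and child containers."""
    name: str
    files: list = field(default_factory=list)
    subdirs: list = field(default_factory=list)
    attrs: dict = field(default_factory=dict)


class ColdBackend:
    """Hypothetical metadata store that counts fetches missing the cache."""

    def __init__(self):
        self.cache = set()
        self.misses = 0

    def fetch(self, key):
        if key not in self.cache:
            self.misses += 1  # stands in for a slow round trip to the backend
            self.cache.add(key)


def set_attr_recursive(root, key, value, backend, fetch_files):
    """Set an attribute on every container under root.

    fetch_files=True models loading file metadata that is never modified;
    fetch_files=False models touching container metadata only.
    """
    stack = [(root, root.name)]
    while stack:
        d, path = stack.pop()
        backend.fetch(("dir", path))
        if fetch_files:
            for f in d.files:
                backend.fetch(("file", path + "/" + f))
        d.attrs[key] = value  # only containers are actually changed
        for s in d.subdirs:
            stack.append((s, path + "/" + s.name))


def build_tree(n_dirs, files_per_dir):
    """Build a flat toy tree: one root plus n_dirs - 1 children."""
    names = [f"f{j}" for j in range(files_per_dir)]
    root = Dir("root", files=list(names))
    for i in range(n_dirs - 1):
        root.subdirs.append(Dir(f"d{i}", files=list(names)))
    return root


if __name__ == "__main__":
    tree = build_tree(n_dirs=10, files_per_dir=100)

    with_files = ColdBackend()
    set_attr_recursive(tree, "sys.acl", "g:5063:rx!d!u", with_files, True)
    print("cold, files fetched:", with_files.misses)

    set_attr_recursive(tree, "sys.acl", "g:5063:rx!d!u", with_files, True)
    print("warm re-run, total misses unchanged:", with_files.misses)

    dirs_only = ColdBackend()
    set_attr_recursive(tree, "sys.acl", "g:5063:rx!d!u", dirs_only, False)
    print("cold, containers only:", dirs_only.misses)
```

In this toy model the cold run that also loads files pays one miss per file and per directory, the warm re-run pays none, and the containers-only run pays one miss per directory. With the numbers from this thread, the 5744 directories that actually get modified are well under 1% of the roughly 1.57M entries that a cold run has to fetch.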

Once you install that release, feel free to let us know whether the fix improves things.

Cheers,
Georgios