Dumpmd - too many entries

Hi all,

I'm running dumpmd to give ALICE a list of files to redistribute after a backing fsid failure (single disk), but I'm getting the error below.

How should I proceed? Is there a way to query QDB directly?

[root@ornl-eos-01]-diopside-~# eos fs dumpmd 12068 --path > fsid-12068-paths.txt
error: too many entries (>100k) on file system to dump them all

Thank you,
Pete

Hi Pete! I recently had the same issue, and the solution was to dump the entire namespace and filter it.
The dump was done with:

eos-ns-inspect scan --path / --no-dirs --members my_qdb_fqdn:qdb_port --password-file /etc/quarkdb.pass --json > full_info.json

and then I filtered it with:


cat filter_eosdump.py 
#!/usr/bin/env python3

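import ijson

fileName = 'full_info.json'

# Stream the dump record by record so the whole file never has to fit in memory.
# ijson prefers a binary file handle.
with open(fileName, mode='rb') as f:
    for record in ijson.items(f, "item"):
        # Keep only files under the ALICE grid tree
        if '/eos/alice/grid' in record['path']:
            print(f'{record["name"]},{record["size"]}')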

HTH,
Adrian

Hi @asevcenc, thank you for that solution; it should work.

Too bad eos fs dumpmd is failing, as the full ns dump is 20G and takes a while. I don’t see a way to scope eos-ns-inspect to a specific field, so it seems one has to parse the dump afterwards, as you suggested.

I installed the version matching the existing eos-server with dnf install eos-ns-inspect-$(rpm -q --queryformat %{VERSION} eos-server) (to avoid an unscheduled update of the other eos-* packages).

It appears that “locations” contains the fsid number, so I modified the script and am sharing it in case it is helpful to others.

#!/usr/bin/env python3

import ijson

fileName = 'eos-ns-inspect_paths_sample.json'

# Stream the dump record by record; ijson prefers a binary file handle.
with open(fileName, mode='rb') as f:
    for record in ijson.items(f, "item"):
        #if '120' in record['locations']: # the fsid number or a substring of it
        if '12048' in record['locations'] or '12068' in record['locations']: # multiple fsids
            print(f'{record["name"]},{record["size"]}')
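
A minimal sketch of a stricter match, in case the substring test picks up unrelated ids (for example '2048' would also match 12048). It makes no claim about how eos-ns-inspect actually encodes "locations": the helper accepts a list of numbers, a list of strings, or a single delimited string. The names location_ids and on_any_fsid are hypothetical.

```python
import re


def location_ids(locations):
    """Normalise a 'locations' value to a set of integer fsids.

    Assumption: the value is a list/tuple of ids, a single id, or a
    delimited string such as "12048,12068". The real dump format is
    not confirmed here.
    """
    if locations is None:
        return set()
    items = locations if isinstance(locations, (list, tuple, set)) else [locations]
    ids = set()
    for item in items:
        # str() copes with ints and strings alike; pull out every digit run
        ids.update(int(tok) for tok in re.findall(r"\d+", str(item)))
    return ids


def on_any_fsid(record, wanted):
    """True if the record has a replica on at least one fsid in `wanted`."""
    return bool(location_ids(record.get("locations")) & set(wanted))
```

In the loop above, the condition would then read `if on_any_fsid(record, {12048, 12068}):`, which compares whole fsid numbers instead of substrings.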

cheers,
Pete

@asevcenc @Costin_Grigoras @esindril

Curiously, after filtering the ns dump on record['locations'] as above for a specific fsid, there is quite a large discrepancy between what the QDB namespace dump produced (which essentially matches eos fs dumpmd --count) and what eos fs status reports for stat.usedfiles, as shown below.

A random sampling across fsids shows similarly high variation, though not as significant as the one below: eos fs dumpmd $fsid --count && eos fs status $fsid | grep usedfiles

Is this expected? Is stat.usedfiles not in fact the number of file entities on an fsid, or is it perhaps only initially populated when the fsid boots? I’m not finding info on the forum that clarifies stat.usedfiles or correlates it with eos fs dumpmd --count.

The 20G dump of the full namespace was produced with:

eos-ns-inspect scan --path / --no-dirs --members ornl-eos-01.ornl.gov:7001 --password-file /etc/eos.keytab --json > /data/eos-inspect-ns-dumps/eos-ns-inspect_paths.json

eos fs status vs ns entries for two fsids:

The ns dump holds only about 15% of what fs status reports:

wc -l fsid-12048-locations.txt
9596 fsid-12048-locations.txt

eos fs dumpmd 12048 --count
num_files=9590

Though... 

eos fs status 12048 | grep usedfiles
stat.usedfiles                   := 65720

Also, to a far lesser degree:

wc -l fsid-12068-locations.txt
286720 fsid-12068-locations.txt

eos fs dumpmd 12068 --count
num_files=286542

Though...

eos fs status 12068 | grep usedfiles
stat.usedfiles                   := 305069
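
For reference, the arithmetic behind the two comparisons above can be sketched as follows. The numbers are the dumpmd --count and stat.usedfiles values quoted in this post; the helper name ns_share is hypothetical.

```python
# (num_files from "eos fs dumpmd --count", stat.usedfiles from "eos fs status")
counts = {
    12048: (9590, 65720),
    12068: (286542, 305069),
}


def ns_share(ns_files, used_files):
    """Percentage of stat.usedfiles that is accounted for by namespace entries."""
    return 100.0 * ns_files / used_files


for fsid, (ns_files, used_files) in counts.items():
    print(f"fsid {fsid}: {ns_share(ns_files, used_files):.1f}% in namespace, "
          f"{used_files - ns_files} not accounted for")
# fsid 12048: 14.6% in namespace, 56130 not accounted for
# fsid 12068: 93.9% in namespace, 18527 not accounted for
```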

Hi Pete,

The fs status lists all the files on the mountpoint, irrespective of whether they correspond to a namespace entry or not. It is basically all the inodes on that filesystem and corresponds to the statfs->f_files item. This might include, for example, orphan files, block checksum files (which are usually attached to a main file, depending on the layout), scrub files used to identify broken disks, etc.

Therefore, in general, I would expect fs status to show more files than eos fs dumpmd. This should not be a concern.

Cheers,
Elvin
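
To look at the inode counters mentioned above directly on an FST, a minimal sketch using Python's os.statvfs is shown here. The function name inode_usage is hypothetical and the path "/" is only a placeholder for the real mountpoint of the fsid. Whether EOS reports the raw f_files value or f_files minus f_ffree as stat.usedfiles is an assumption left open; the sketch prints both.

```python
import os


def inode_usage(mountpoint):
    """Return total, free and used inode counts for a mounted filesystem."""
    st = os.statvfs(mountpoint)
    return {
        "total": st.f_files,              # statfs->f_files
        "free": st.f_ffree,               # statfs->f_ffree
        "used": st.f_files - st.f_ffree,  # inodes currently in use
    }


if __name__ == "__main__":
    # Placeholder path: replace with the data mountpoint backing the fsid
    print(inode_usage("/"))
```

Comparing the "used" figure with eos fs dumpmd --count for the same fsid gives a rough idea of how many inodes are orphans, block checksum files, scrub files and similar non-namespace entries.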