I would like to make sure I understand the process, so I will state here how I understood it and people can correct me:
1. Stop EOS and perform an offline compaction of the namespace.
2. Create a temporary QuarkDB database (in /var/lib/quarkdb/convert, for example) on the EOS manager, with one node.
3. Configure xrootd-quarkdb to use this database.
4. Start the quarkdb service (systemctl start xrootd@quarkdb).
5. Run eos-ns-convert with the compacted namespace files (files and directories).
6. Stop the quarkdb service.
7. Create the final production clustered QuarkDB database (in /var/lib/quarkdb/production, for example) on all participating nodes.
8. On each of these nodes, delete the new raft-journal directory (/var/lib/quarkdb/production/current/raft-journal ?) and copy over the one from /var/lib/quarkdb/convert on the manager.
9. On each node in the QuarkDB cluster, check that the configuration says redis.mode raft.
10. Start the quarkdb service on all nodes and check convergence of the cluster (how?).
11. Start the EOS manager and check that the new namespace is functional (to be described).
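For reference, a minimal xrootd configuration for the temporary single-node instance in step 3 might look like the following sketch. The file name, port, and library name here are assumptions; check the QuarkDB documentation for your version:

```
# /etc/xrd.cf.quarkdb (illustrative; adjust paths and port to your setup)
xrd.port 7777
xrd.protocol redis:7777 libXrdQuarkDB.so
redis.mode bulkload
redis.database /var/lib/quarkdb/convert
```

The mode would then be switched to raft (with the appropriate redis.database path) on the production cluster nodes.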
The above looks correct, apart from step 8: you need to copy the state-machine from the bulkload instance into every node of the new clustered QuarkDB instance (replacing the existing one), not the raft journal. (In fact, the raft-journal directory should not exist in bulkload instances.)
Regarding step 10: To check convergence, view the QuarkDB logs to make sure an election has occurred.
Note: The above procedure is only necessary if your namespace is very large (hundreds of millions of files), as bulkload significantly speeds things up. If not, you could simply start a clustered QuarkDB in raft mode and run the conversion tool against it, without any need for bulkload.
I’m not sure how much detail you’re writing into your process, but if you haven’t already, it might be worth noting down what changes are needed on the MGM side as well!
In the xrd.cf.mgm file, make sure to change mgmofs.nslib to use quarkdb, and if running a cluster, make sure to specify nodes:
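For example, a sketch of the relevant xrd.cf.mgm lines; the library path, hostnames, and port are illustrative assumptions:

```
# xrd.cf.mgm (illustrative)
mgmofs.nslib /usr/lib64/libEosNsQuarkdb.so
mgmofs.qdbcluster qdb1.example.com:7777 qdb2.example.com:7777 qdb3.example.com:7777
```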
In case this is of any interest, here are our namespace stats:
# Namespace Statistics
# ------------------------------------------------------------------------------------
ALL Files 200701974 [booted] (572s)
ALL Directories 40256837
ALL Total boot time 784 s
In bulkload mode, it took approximately one hour to complete the namespace conversion process.
Yes I was aware of the need to change the MGM config in order to use the new namespace.
Now the question will be if I really need to use the bulkload mode. Our namespace is much smaller than yours:
ALL Files 16683933 [booted] (534s)
ALL Directories 54544
16M instead of 200M.
I suppose the quarkdb-bench benchmark could give a clue and could be run before deciding which mode to use, though I am not sure how to read the results. On my test cluster (database on a standard disk), it gives:
quarkdb-bench --gtest_filter="Benchmark/hset.hset/threads2_events3000000_consensus"
Running main() from bench/main.cc
Note: Google Test filter = Benchmark/hset.hset/threads2_events3000000_consensus
[==========] Running 1 test from 1 test case.
[...]
[ OK ] Benchmark/hset.hset/threads2_events3000000_consensus (87673 ms)
[----------] 1 test from Benchmark/hset (87673 ms total)
[----------] Global test environment tear-down
[1551426742924] INFO: Global environment: clearing connection cache.
[==========] 1 test from 1 test case ran. (87694 ms total)
[ PASSED ] 1 test.
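To put that result in perspective, the test filter name indicates the run performed 3,000,000 hset events, and it finished in 87,673 ms, so a rough throughput figure can be derived (a back-of-the-envelope calculation, not an official metric):

```shell
# Derive approximate ops/s from the benchmark output above (integer arithmetic)
events=3000000
ms=87673
ops=$(( events * 1000 / ms ))
echo "~${ops} hset ops/s"   # roughly 34,000 ops/s
```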
The best way to tell is to perform a test migration and measure how long the migration tool takes to complete. With 16M files it should not take very long, less than 20 minutes, so you don't need to run in bulkload mode first; you could simply set up your raft cluster and point the migration tool at it.
The time to switch from bulkload to the full cluster would probably add more time than bulkload saves, anyway.
Regarding quarkdb-bench: the results you posted are quite good, but it would still be best to test the "real" thing, i.e. how long the migration tool takes to complete. You could test against both bulkload and raft, and decide depending on whether it makes a large difference for you in terms of the downtime that will be necessary for your EOS instance.
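As a rough cross-check of the "less than 20 minutes" estimate, assuming conversion time scales roughly linearly with namespace size and using the figures posted earlier in this thread (~240M entries converted in about an hour of bulkload):

```shell
# Rough scaling estimate only, not a measurement.
# ~240M entries (files + directories) converted in ~3600s on the large instance:
rate=$(( (200701974 + 40256837) / 3600 ))   # entries per second
# The smaller namespace: ~16.7M files plus ~55k directories:
eta=$(( (16683933 + 54544) / rate ))        # estimated seconds
echo "~${rate} entries/s, estimated conversion time ~${eta}s"
```

That lands at a few minutes, comfortably inside the 20-minute estimate.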
I like that change! We're definitely using bulkload mode because of the size of our namespace, and while I think I'm reasonably familiar with the migration process now, it always helps to simplify it.
Side question while we're talking about QuarkDB: what environment variables (if any) still need to be set when using the new QuarkDB master/slave setup? I'm referring to things like EOS_MGM_MASTER1/MASTER2, EOS_MGM_ALIAS, etc. I'm not sure whether those can be removed or are still necessary, as I imagine they probably won't be used?
@esindril is the expert in this, but I believe the environment variable to use for a QDB-based master-slave MGM setup is EOS_USE_QDB_MASTER=1.
Note that we still don't use it in production: all of our production instances on QDB only have a single MGM at the moment.
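If that is the right variable, it would presumably be set alongside the other EOS environment variables in the MGM's sysconfig file; the file path here is an assumption that may differ per deployment:

```
# /etc/sysconfig/eos_env (path may differ per deployment)
EOS_USE_QDB_MASTER=1
```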
PS: The extra option has been added to quarkdb-create, and will be available from 0.3.6. Now I need to write some detailed documentation for bulkload, too…