QuarkDB setup

I have a few questions about a new EOS 4.4.10 setup that I have (I think) converted over to QuarkDB.

This test setup includes four hosts. Initially, two were set up as MGM (master/slave) and two were FST nodes.

First I upgraded to 4.4.10 from 4.2.28. This apparently went fine and the in-memory NS plugin still seems to work.

Then I set about getting QuarkDB set up, using the two MGM nodes and one of the FST nodes. This went OK too, for the most part, and I then followed (as best I could) some of the documentation in the GitHub repository.

The conversion seems to have worked, but I have some questions.

First, how can I tell for sure the setup is now completely using QuarkDB? The raft-info shows the DB cluster is working correctly, but is it propagating data? How can I know all nodes have the latest EOS data?

Second, there is some kind of issue with the config data.

Here is the ‘node ls’ and ‘fs ls’ from the two MGMs:

MGM1 (master):

EOS_SERVER_VERSION=4.4.10 EOS_SERVER_RELEASE=1
EOS_CLIENT_VERSION=4.4.10 EOS_CLIENT_RELEASE=1
EOS Console [root://localhost] |/eos/uscms/store/user/dszkola/> node ls
┌──────────┬────────────────────────────────┬────────────────┬──────────┬────────────┬──────┬──────────┬────────┬────────┬────────────────┬─────┐
│type      │                        hostport│          geotag│    status│      status│  txgw│ gw-queued│  gw-ntx│ gw-rate│  heartbeatdelta│ nofs│
└──────────┴────────────────────────────────┴────────────────┴──────────┴────────────┴──────┴──────────┴────────┴────────┴────────────────┴─────┘
 nodesview     cmseos-itbfst04.fnal.gov:1095    geotagdefault     online           on    off          0       10      120                2     3 
 nodesview     cmseos-itbfst05.fnal.gov:1095    geotagdefault     online           on    off          0       10      120                2     3 

EOS Console [root://localhost] |/eos/uscms/store/user/dszkola/> fs ls
┌────────────────────────┬────┬──────┬────────────────────────────────┬────────────────┬────────────────┬────────────┬──────────────┬────────────┬────────┬────────────────┐
│host                    │port│    id│                            path│      schedgroup│          geotag│        boot│  configstatus│ drainstatus│  active│          health│
└────────────────────────┴────┴──────┴────────────────────────────────┴────────────────┴────────────────┴────────────┴──────────────┴────────────┴────────┴────────────────┘
 cmseos-itbfst04.fnal.gov 1095   2001                   /storage/data1        default.1    geotagdefault       booted             rw      nodrain   online              N/A 
 cmseos-itbfst04.fnal.gov 1095   2002                   /storage/data2        default.2    geotagdefault       booted             rw      nodrain   online              N/A 
 cmseos-itbfst04.fnal.gov 1095   2003                   /storage/data3        default.3    geotagdefault       booted             rw      nodrain   online              N/A 
 cmseos-itbfst05.fnal.gov 1095   2011                   /storage/data1        default.1    geotagdefault       booted             rw      nodrain   online              N/A 
 cmseos-itbfst05.fnal.gov 1095   2012                   /storage/data2        default.2    geotagdefault       booted             rw      nodrain   online              N/A 
 cmseos-itbfst05.fnal.gov 1095   2013                   /storage/data3        default.3    geotagdefault       booted             rw      nodrain   online              N/A 

MGM2 (slave):

EOS_SERVER_VERSION=4.4.10 EOS_SERVER_RELEASE=1
EOS_CLIENT_VERSION=4.4.10 EOS_CLIENT_RELEASE=1
EOS Console [root://localhost] |/eos/uscms/store/user/dszkola/> node ls
┌──────────┬────────────────────────────────┬────────────────┬──────────┬────────────┬──────┬──────────┬────────┬────────┬────────────────┬─────┐
│type      │                        hostport│          geotag│    status│      status│  txgw│ gw-queued│  gw-ntx│ gw-rate│  heartbeatdelta│ nofs│
└──────────┴────────────────────────────────┴────────────────┴──────────┴────────────┴──────┴──────────┴────────┴────────┴────────────────┴─────┘
 nodesview     cmseos-itbfst04.fnal.gov:1095    geotagdefault     online                 off          0       10      120                2     0 
 nodesview     cmseos-itbfst05.fnal.gov:1095    geotagdefault     online                 off          0       10      120                0     0 

EOS Console [root://localhost] |/eos/uscms/store/user/dszkola/> fs ls
EOS Console [root://localhost] |/eos/uscms/store/user/dszkola/> 

and the ‘ns’ command from each:

MGM1:

# ------------------------------------------------------------------------------------
# Namespace Statistics
# ------------------------------------------------------------------------------------
ALL      Files                            17 [booted] (0s)
ALL      Directories                      30
ALL      Total boot time                  1 s
# ------------------------------------------------------------------------------------
ALL      Compactification                 status=off waitstart=0 interval=0 ratio-file=0.0:1 ratio-dir=0.0:1
# ------------------------------------------------------------------------------------
ALL      Replication                      mode=master-rw state=master-rw master=cmseos-itbmgm01.fnal.gov configdir=/var/eos/config/cmseos-itbmgm01.fnal.gov/ config=default mgm:cmseos-itbmgm02.fnal.gov=down mq:cmseos-itbmgm02.fnal.gov:1097=ok
# ------------------------------------------------------------------------------------
ALL      files created since boot         1
ALL      container created since boot     0
# ------------------------------------------------------------------------------------
ALL      current file id                  112
ALL      current container id             39
# ------------------------------------------------------------------------------------
ALL      eosxd caps                       0
ALL      eosxd clients                    0
# ------------------------------------------------------------------------------------
ALL      File cache max num               30000000
ALL      File cache occupancy             11
ALL      Container cache max num          3000000
ALL      Container cache occupancy        23
# ------------------------------------------------------------------------------------
ALL      memory virtual                   2.24 GB
ALL      memory resident                  134.73 MB
ALL      memory share                     24.04 MB
ALL      memory growths                   269.64 MB
ALL      threads                          241
ALL      fds                              281
ALL      uptime                           61224
# ------------------------------------------------------------------------------------

MGM2:

# ------------------------------------------------------------------------------------
# Namespace Statistics
# ------------------------------------------------------------------------------------
ALL      Files                            16 [failed] (1545343450s)
ALL      Directories                      30
ALL      Total boot time                  1545343449 s
# ------------------------------------------------------------------------------------
ALL      Compactification                 status=off waitstart=0 interval=0 ratio-file=0.0:1 ratio-dir=0.0:1
# ------------------------------------------------------------------------------------
ALL      Replication                      mode=slave-ro state=slave-ro master=cmseos-itbmgm01.fnal.gov configdir=/var/eos/config/cmseos-itbmgm01.fnal.gov/ config=default mgm:cmseos-itbmgm01.fnal.gov=ok mgm:mode=master-rw mq:cmseos-itbmgm01.fnal.gov:1097=ok
# ------------------------------------------------------------------------------------
ALL      files created since boot         1
ALL      container created since boot     0
# ------------------------------------------------------------------------------------
ALL      current file id                  112
ALL      current container id             39
# ------------------------------------------------------------------------------------
ALL      eosxd caps                       0
ALL      eosxd clients                    0
# ------------------------------------------------------------------------------------
ALL      File cache max num               30000000
ALL      File cache occupancy             0
ALL      Container cache max num          3000000
ALL      Container cache occupancy        6
# ------------------------------------------------------------------------------------
ALL      memory virtual                   2.22 GB
ALL      memory resident                  300.29 MB
ALL      memory share                     22.02 MB
ALL      memory growths                   2.22 GB
ALL      threads                          241
ALL      fds                              261
ALL      uptime                           60424
# ------------------------------------------------------------------------------------

So something is not right there.

Third, where is the metadata that used to live in the files.md and directory.md now stored? Yes, in QuarkDB, but in which files on disk? I’d like to keep track of its size.

Fourth, how do we do a proper EOS backup now? Before, I backed up the two *.md files, the config, and the daily report file. I need to know how to do the same thing in the new environment.

Fifth (and last for right now), is compacting the namespace still necessary with the QuarkDB setup?

Thanks,

Dan Szkola
FNAL

Hi Dan,

  1. To tell for sure that an EOS instance is using the QuarkDB namespace, run “eos ns” and look for a section containing “File cache max num”, “File cache occupancy”, “Container cache max num”, and “Container cache occupancy”. If those stats appear in the output, the instance is certainly using the QDB namespace (a quick check is sketched at the end of this message).

  2. Is the instance supposed to be brand new, or the result of a namespace conversion? Also note that currently, the master / slave setup on EOS + QDB namespace is experimental; Elvin can tell you more. You’ll probably have more luck right now using a single MGM. Since QDB is highly available by itself, and the MGM takes a couple of seconds to restart nowadays, implementing master / slave on the EOS side has not been a high priority, but it is in the pipeline.

    If this is the result of a namespace conversion, the low number of files tells me something went wrong… can you describe the steps you took to convert the namespace? That will help a lot in adding safeguards against things that may go wrong during the process.

  3. Run the redis command quarkdb-info and check BASE-DIRECTORY (see the example at the end of this message).

  4. Back up QDB daily, along with the config and the report file - see this page for QDB backup instructions: http://quarkdb.web.cern.ch/quarkdb/docs/master/BACKUP.html (a rough sketch follows at the end of this message).

  5. No, compaction is now automatic and more or less continuous. (Yes, it is safe even under heavy load or random restarts of the QDB daemon.)
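
Regarding 1, a quick check could look like this (the grep is just an illustration, not an official command):

    # on the MGM: if these cache counters appear, the QDB namespace is in use
    eos ns | grep -i "cache"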
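
Regarding 3, an example, assuming QDB listens on port 7777 and that the reported base directory is /var/lib/quarkdb (both are assumptions here, adjust to your setup):

    # ask QDB where its data lives on disk
    redis-cli -p 7777 quarkdb-info | grep BASE-DIRECTORY
    # then track its size, using the directory reported above (hypothetical path shown)
    du -sh /var/lib/quarkdb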
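
Regarding 4, a rough daily-backup sketch, assuming the checkpoint mechanism described on the BACKUP page linked above; the command name, port, and paths here are assumptions from memory, so please verify them against that page. The config and the daily report file can then be archived alongside, as you did before:

    # take a consistent point-in-time QDB snapshot into a fresh directory (hypothetical path)
    redis-cli -p 7777 quarkdb-checkpoint /backup/qdb-$(date +%F)
    # archive the snapshot for off-machine storage
    tar czf /backup/eos-qdb-$(date +%F).tar.gz /backup/qdb-$(date +%F)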

Let us know if you have more questions!

Best,
Georgios

In answer to your questions from #2:

It was upgraded from 4.2.28, so it is not a new instance. However, there are only a handful of files in it, hence the low number of files shown. The real problem is that the config data does not seem to be propagating to the other MGM: the second MGM shows no file systems on the nodes.

Also, I’m still not clear on how I can tell that each QuarkDB node is getting all the data, i.e. whether the EOS data is up to date on each QuarkDB node.

Thanks.

  1. To have both MGMs agree on the configuration, it needs to be stored in QDB as well, and not in a local file. Try running “eos config export” from the node which has the correct configuration to export it to QDB, and then set “mgmofs.cfgtype quarkdb” in xrd.cf.mgm (see the snippet at the end of this message).

    Still, master / slave setup on EOS with QDB namespace is not fully supported yet - you’ll probably have more luck with your testing if you simply use a single MGM for now. In our production instances using the QDB namespace, we still have only a single MGM per instance.

  2. If QDB cannot replicate writes to at least a quorum of the cluster, it goes down and becomes unavailable: It won’t silently enter a state where only a single node has all up-to-date data.

    (By the way: quorum = majority of the cluster. With 3 nodes, quorum = 2 nodes)

    To inspect the status of QDB replication, run raft-info on the leader node. Check REPLICA entries:

    19) REPLICA localhost:7777 ONLINE | UP-TO-DATE | NEXT-INDEX 1027759
    20) REPLICA localhost:7778 ONLINE | UP-TO-DATE | NEXT-INDEX 1027759
    

    Each write into QDB is assigned an index: The leader is always up-to-date, as it cannot win an election without being up-to-date.

    Compare the value of LOG-SIZE to each replica’s NEXT-INDEX to see how far behind the leader that replica is (a one-liner for this is at the end of this message).

    BLOCKED-WRITES is how many writes are currently pending in the leader’s “replication queue” - that is, how many writes have been sent by a client (in this case, the client is EOS), but not yet replicated to a quorum of nodes. The client does not receive a write acknowledgement until the data has been replicated to a quorum of nodes.

    In general, you can trust that if QDB is up and accepting reads / writes, then it’s working properly and replicating all incoming writes: we understand that having the replicas silently diverge would be quite a terrible thing to happen, so QDB is being tested to exhaustion, with many tests simulating all kinds of scenarios and error conditions.
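
For point 1, a minimal sketch of the two steps; the exact arguments to “eos config export” are an assumption on my side (check its help output), and “default” is taken from the config name shown in your ns output:

    # on the MGM that currently holds the correct configuration: push it into QDB
    eos config export default

    # in xrd.cf.mgm (on both MGMs), make the MGM load its configuration from QDB
    mgmofs.cfgtype quarkdb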
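
For point 2, a quick way to eyeball replication status from the leader (port 7777, as in the output above):

    # LOG-SIZE vs. each replica's NEXT-INDEX shows how far behind that replica is;
    # BLOCKED-WRITES should normally stay at or near 0
    redis-cli -p 7777 raft-info | grep -E "LOG-SIZE|REPLICA|BLOCKED-WRITES"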

I had already moved the config to QuarkDB; that’s why I was confused about how it could be out of sync. I’ll keep poking at it, but I also take your point about the master/slave setup not being fully supported. All good info, thanks for the help.