QDB metadata on ZFS?

Hello all,

A general query: has anyone had experience running QDB on ZFS?

I am hoping to use the pooling functionality to help with running two log KV stores on the same NVMe devices. There is easily enough IO capacity on the devices. De-duplication, snapshots, and the log nature of the underlying filesystem grant me some advantages for management operations.

Hi David,

The primary factor would be the performance of RocksDB on ZFS. There are some benchmarks here which show a large performance hit. We haven’t tried it ourselves, however, and we’d be interested in your experience if you go for ZFS. :slight_smile:

I’d be cautious about using ZFS snapshots for backing up the state of QDB, and I’m not sure how de-duplication would help here.

Are you planning on running several QDB nodes on the same machine? That’s fine if they belong to entirely different clusters, but if they’re part of the same cluster it defeats the purpose of high availability: one machine goes down and the entire cluster goes down. (You probably know this already, just pointing it out for other people reading this thread.)

Cheers,
Georgios

Thank you @gbitzes,

Apologies for taking so long to get back. I’ve been on holidays (…going nowhere), and thought I’d take the mental time out anyway.

It was more of a generic query than a question about an exact design. In this case, I wanted to run a Ceph Mon and a QDB instance side by side on the same storage. The de-dupe obviously will be of no benefit for this, but it would help with backup. We’ve been using restic for backing up QDB snapshots, and the de-dupe is amazingly efficient.

As this is only a pilot project at this point, I am willing to accept a known single point of failure.

This is initially for a CTA deployment, so extreme levels of performance won’t be required. I’m trying to think ahead about management concerns while I have unknown sizing requirements. Once I have accurate numbers, I am sure I can choose suitable sizes for partitions and move back towards ext4. ZFS allows me to be lazy with allocations.

The real-world performance differences are something that’d be of interest to many, I imagine, so I will have to try putting together some clear metrics on this.
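For anyone wanting a quick first number before setting up a proper benchmark, here is a rough sketch of one way to time small random reads on a given mount point. It is not a QDB or RocksDB benchmark: the function name `random_read_latencies`, the 4 MiB scratch file, and the 4096-byte read size are all assumptions chosen for illustration, and the reads will mostly be served from the page cache (or the ZFS ARC) unless the file is much larger than memory or caches are dropped first.

```python
import os
import random
import statistics
import tempfile
import time


def random_read_latencies(directory, file_size=4 * 1024 * 1024,
                          block_size=4096, reads=200, seed=0):
    """Time random block-aligned reads from a scratch file in `directory`.

    Returns a list of per-read latencies in seconds. All parameters are
    illustrative defaults, not values taken from QDB.
    """
    rng = random.Random(seed)
    blocks = file_size // block_size

    # Create the scratch file on the filesystem under test.
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(os.urandom(file_size))
            f.flush()
            os.fsync(f.fileno())

        latencies = []
        fd = os.open(path, os.O_RDONLY)
        try:
            for _ in range(reads):
                offset = rng.randrange(blocks) * block_size
                start = time.perf_counter()
                data = os.pread(fd, block_size, offset)
                latencies.append(time.perf_counter() - start)
                if len(data) != block_size:
                    raise IOError("short read at offset %d" % offset)
        finally:
            os.close(fd)
        return latencies
    finally:
        os.unlink(path)


if __name__ == "__main__":
    # Point this at a directory on the ZFS (or ext4/XFS) mount to compare.
    lat = random_read_latencies(tempfile.gettempdir())
    print("median read latency: %.1f us" % (statistics.median(lat) * 1e6))
```

Running the same script against a directory on each filesystem gives a crude like-for-like comparison. A real evaluation would want a RocksDB-level workload with caching controlled.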

@gbitzes, a further follow-up: I was wondering about the optimal block/record size for QDB.

@crystal mentioned some parameters inside QDB she’d seen for this, but they didn’t quite answer what I was after.

What is the optimal block size for a QDB metadata file?

Hi David,

As far as I know, we haven’t done any benchmarking with regard to filesystem block size, and just use the XFS defaults.

RocksDB uses a block size of 4096 bytes by default, which is what QDB uses as well. This relates to the unit by which RocksDB reads / writes data, i.e. 4096 bytes at a time. If you are interested in running benchmarks with this value, I could add an option to control it from within the QDB configuration. Let me know.
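To put the 4096-byte figure in a ZFS context, here is a small back-of-the-envelope sketch of worst-case read amplification for different ZFS `recordsize` values. It rests on simplifying assumptions that do not come from QDB or RocksDB themselves: each read is uncached, aligned to a record boundary, and ZFS has to fetch whole records from disk. The helper name `read_amplification` is made up for this example. 128 KiB is the ZFS default `recordsize`.

```python
def read_amplification(app_block_bytes, fs_record_bytes):
    """Bytes fetched from disk per byte requested, for one uncached,
    record-aligned read, assuming the filesystem reads whole records."""
    # Ceiling division: how many filesystem records the read touches.
    records_touched = -(-app_block_bytes // fs_record_bytes)
    return records_touched * fs_record_bytes / app_block_bytes


ROCKSDB_BLOCK = 4096  # RocksDB default block size, as mentioned above

for recordsize in (4096, 16384, 131072):
    factor = read_amplification(ROCKSDB_BLOCK, recordsize)
    print("recordsize=%6d  amplification=%.1fx" % (recordsize, factor))
```

Under these assumptions, a 4096-byte read against a 128 KiB record pulls in 32 times the requested data, which is one plausible reason to benchmark a smaller `recordsize` for this kind of workload. Real behaviour will differ because of compression, caching, and RocksDB blocks not being aligned to record boundaries.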

Cheers,
Georgios