QuarkDB 0.4.3 has been released

Dear all,

QuarkDB 0.4.3 has been released - notable changes:

  1. Bug fixes

    • The mechanism meant to provide an early warning for potential MANIFEST corruption was flaky, and would sometimes report a problem where none existed.
  2. New features

    • Implementation of an optional part of raft, pre-vote. This should prevent partitioned or otherwise flaky rejoining servers from triggering unnecessary and disruptive elections. A node will first issue an experimental voting round before advancing its term, and start campaigning in earnest only if it has a good chance of winning.
    • Ability to demote a full node to observer through command raft-demote-to-observer.
    • Print warnings in the logs whenever write-stalls are triggered.
  3. Improvements

    • Show resilvering progress in raft-info.
    • Checkpoint creation through quarkdb-checkpoint will now fail if a different physical filesystem is specified.
    • RPMs now available for CentOS 8.
    • Print explicit warnings in the log in case of write stalling.
    • Reduce default trimming batch size to 200k.
    • Add in-memory cache for leases to significantly speed up all lease-related operations.
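
To make the pre-vote item above more concrete, here is a minimal sketch of the general Raft pre-vote idea. It is not QuarkDB's implementation: the `Peer` and `Node` classes, their fields, and the grant rule are hypothetical and simplified for illustration (for instance, real Raft also compares the term of the last log entry).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Peer:
    """Hypothetical view of a remote node, for illustration only."""
    term: int
    last_log_index: int
    has_live_leader: bool  # True if this peer recently heard from a leader

    def grant_pre_vote(self, proposed_term: int, candidate_last_index: int) -> bool:
        # A peer that still hears from a healthy leader refuses, so a flaky
        # rejoining node cannot disturb a working cluster.
        if self.has_live_leader:
            return False
        # The proposed term must be newer, and the candidate's log must be
        # at least as long as this peer's log.
        return (proposed_term > self.term
                and candidate_last_index >= self.last_log_index)


@dataclass
class Node:
    term: int
    last_log_index: int

    def try_campaign(self, peers: List[Peer]) -> bool:
        """Run a pre-vote round; advance the term only if it succeeds."""
        proposed = self.term + 1  # proposed only, not yet adopted
        grants = 1 + sum(
            p.grant_pre_vote(proposed, self.last_log_index) for p in peers
        )
        cluster_size = len(peers) + 1
        if grants * 2 <= cluster_size:
            return False  # no majority: keep the current term
        self.term = proposed  # majority: a real election may start
        return True
```

The point of the design is that a failed pre-vote leaves the node's term untouched, so the rest of the cluster never sees a higher term and has no reason to abandon its current leader.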

Many thanks to Franck Eyraud (JRC) for the bug report concerning the erroneous MANIFEST-related warning.

Full release notes can be found here, packages here, and documentation on the optimal way to upgrade here.

Risk of upgrade: Low.

Cheers,
Georgios

What are the dependencies for quarkdb-0.4.3? i.e. What are the minimum versions of EOS and XROOTD needed? I tried to install it on a server with xrootd 4.11.3 and get:

--> Finished Dependency Resolution
Error: Package: quarkdb-0.4.3-1.el7.cern.x86_64 (dms-eostest)
Requires: libXrdUtils.so.3()(64bit)
Error: Package: quarkdb-0.4.3-1.el7.cern.x86_64 (dms-eostest)
Requires: libXrdCl.so.3()(64bit)


Dan Szkola
FNAL

Hi Daniel,

Unfortunately, since xrootd 5 is now pushed in EPEL, quarkdb was built against that version. We’ll take down these RPMs and rebuild them with xrootd 4. We plan to release quarkdb-0.5*, which will come with XRootD 5.

Cheers,
Elvin

Hi @gbitzes and @esindril

We are running quarkdb-0.4.2 and xrootd 4.12.8 and have recently begun receiving messages:

[1624226401828] ERROR: Potential MANIFEST corruption for DB at /quarkdb/checkpoints/checkpoint_2021-06-21/current/state-machine(1783322963 sec)

Do the 0.4.3 release notes regarding the potential manifest corruption apply to the above, or is there a way to determine if there is a valid issue?

Did I understand correctly that the quarkdb 0.4.3 packages were built against xrootd 5 and removed from the storage-ci.web.cern.ch install repo, but not replaced with ones built against xrootd 4? I do not see quarkdb 0.4.3 packages there, and the linuxsoft.cern.ch repo in the release notes returns 403.

Cheers,
Pete

After writing the above, I saw that this issue was reported in the 0.4.2 thread and that the behavior we are seeing is similar, with timestamp 1783322963 sec:

[root@warp-ornl-cern-05 ~]# quarkdb-validate-checkpoint --path /quarkdb/checkpoints/checkpoint_2021-06-21/
[1624296403520] INFO: Attempting to open ShardDirectory...
[1624296403521] INFO: --- OK!
[1624296403521] INFO: Attempting to open StateMachine...
[1624296403523] INFO: Openning state machine '/quarkdb/checkpoints/checkpoint_2021-06-21/current/state-machine'.
[1624296403599] INFO: --- OK! LAST-APPLIED: 582451415
[1624296403600] INFO: Attempting to open RaftJournal...
[1624296403600] INFO: Opening raft journal '/quarkdb/checkpoints/checkpoint_2021-06-21/current/raft-journal'
[1624296403600] ERROR: Potential MANIFEST corruption for DB at /quarkdb/checkpoints/checkpoint_2021-06-21/current/state-machine(1783322963 sec)
[1624296404064] INFO: --- OK! LOG-SIZE: 582451416, COMMIT-INDEX: 582451415, LOG-START: 532000000
[1624296404065] INFO: Closing state machine '/quarkdb/checkpoints/checkpoint_2021-06-21/current/state-machine'
[1624296404081] INFO: Closing raft journal '/quarkdb/checkpoints/checkpoint_2021-06-21/current/raft-journal'

We will ignore these errors, which appear to come from an uninitialized timestamp variable, until we move to 0.4.3 and xrootd 5.
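
As a quick sanity check on the numbers in the log above, the following sketch parses one of the ERROR lines. It rests on two assumptions: that the bracketed prefix is a Unix timestamp in milliseconds (which matches the 2021-06-21 date in the checkpoint path), and that the value in parentheses is a duration in seconds, as its unit suggests. The helper names are made up for this example.

```python
import re
from datetime import datetime, timezone

# Assumed layout of a log line: "[<unix millis>] <LEVEL>: <message>"
LOG_LINE = re.compile(r"^\[(\d+)\] (\w+): (.*)$")
# Assumed layout of the trailing value: "(<seconds> sec)"
REPORTED = re.compile(r"\((\d+) sec\)")


def parse_line(line):
    """Split a log line into (UTC datetime, level, message), or None."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    millis, level, message = match.groups()
    when = datetime.fromtimestamp(int(millis) / 1000, tz=timezone.utc)
    return when, level, message


def reported_seconds(message):
    """Return the '(N sec)' value from a message, or None if absent."""
    match = REPORTED.search(message)
    return int(match.group(1)) if match else None


line = ("[1624296403600] ERROR: Potential MANIFEST corruption for DB at "
        "/quarkdb/checkpoints/checkpoint_2021-06-21/current/state-machine"
        "(1783322963 sec)")

when, level, message = parse_line(line)
secs = reported_seconds(message)
years = secs / (365.25 * 24 * 3600)
print(when.date(), level, f"{years:.1f} years")  # 2021-06-21 ERROR 56.5 years
```

Under those assumptions the reported value corresponds to roughly 56.5 years, which is hard to read as a real interval for a checkpoint taken that same day.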

Hi Pete,

Yes, indeed we removed the 0.4.3 release since it was built with XRootD5 and we didn’t rebuild it with XRootD4. We don’t plan to do any new releases, since QuarkDB will be released as part of EOS starting with version 5 of EOS.

Cheers,
Elvin