Sudden 100GB+ jump in MGM memory usage

EOS_SERVER_VERSION=4.6.8 EOS_SERVER_RELEASE=1

Any idea what can cause this:

ALL memory virtual 253.80 GB
ALL memory resident 92.63 GB
ALL memory share 17.30 MB
ALL memory growths 251.48 GB
ALL threads 840
ALL fds 4204
ALL uptime 2392349

Forgot to add, we’re using the QDB namespace.
quarkdb-0.4.2-1.el7.cern.x86_64

Hi Dan,
sorry that you had to wait so long.

Elvin, can you point Dan to the minimum version to use when running with multiple MGMs? And Georgios, which version is the current production recommendation for QDB?

Do you have the configuration stored in QuarkDB, or are you still using the configuration file on the MGM?

Yes, I have this set on both MGMs:
EOS_USE_QDB_MASTER=1

The config is stored in QDB:
# eos config ls
Existing Configurations on QuarkDB
================================
created: Fri May 8 14:24:21 2020 name: default *

This is set on both MGMs as well:
mgmofs.cfgtype quarkdb

We also encountered such problems, and we noticed that the eos@sync service occupied about 75% of memory, or 90 GB. Is this service essential?

Hello,

We also once had a concern about MGM memory: Abnormal memory usage of the MGM (QDB namespace)

The conclusion for us was that someone had launched the eos find command on the whole instance’s namespace, and the process ran for a very long time, filling some buffer until the whole search finished, after which the memory was freed. But this was not a big step like yours; it was a sustained increase over many days, followed by a sudden release.

In addition, we were using values larger than the defaults for the eos ns cache because we had memory available, and this put us at risk of running out of memory. The larger values do not seem useful (no performance gain, and higher memory usage), so it is better to keep the defaults.

In general, we still observe that even after the cache maximum is reached, the memory usage of the MGM (v4.5.15) keeps increasing slowly but steadily (about +50 GB in 3 months).
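
To tell a slow leak from a sudden jump, it can help to record the memory lines periodically and compute a growth rate. Below is a minimal Python sketch, not an EOS tool: the function names are made up for illustration, the line format is assumed to be exactly what was pasted at the top of this thread, and the unit conversion assumes decimal multiples.

```python
import re

# Matches lines such as "ALL memory resident 92.63 GB", as quoted in this thread.
# Assumption: the field name is one word and the unit is one of kB/MB/GB/TB.
_MEM_LINE = re.compile(r"^ALL\s+memory\s+(\w+)\s+([\d.]+)\s+(kB|MB|GB|TB)\s*$")

# Assumption: decimal multiples; adjust if the tool reports binary units.
_UNIT_TO_GB = {"kB": 1e-6, "MB": 1e-3, "GB": 1.0, "TB": 1e3}


def parse_memory_gb(text):
    """Return {field: value in GB} for every 'ALL memory ...' line in text."""
    values = {}
    for line in text.splitlines():
        match = _MEM_LINE.match(line.strip())
        if match:
            field, number, unit = match.groups()
            values[field] = float(number) * _UNIT_TO_GB[unit]
    return values


def growth_gb_per_day(start_gb, end_gb, elapsed_days):
    """Average growth between two samples, in GB per day."""
    if elapsed_days <= 0:
        raise ValueError("elapsed_days must be positive")
    return (end_gb - start_gb) / elapsed_days


# Sample copied from the first post of this thread.
sample = """ALL memory virtual 253.80 GB
ALL memory resident 92.63 GB
ALL memory share 17.30 MB
ALL memory growths 251.48 GB
ALL uptime 2392349"""

if __name__ == "__main__":
    print(parse_memory_gb(sample))
    # Slow growth reported above: about +50 GB over roughly 90 days.
    print(growth_gb_per_day(0.0, 50.0, 90))
```

Taking 3 months as about 90 days, +50 GB works out to roughly 0.56 GB per day. Comparing that baseline with the rate between two recent samples gives a quick indication of whether an increase is the usual slow drift or something new.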