MGM : secondary groups with sssd and LDAP backend, enumeration and user information caching

Good afternoon,

Yesterday our MGMs, running v5.2.24, were successfully migrated to Alma and EOS v5.2.26.

However, we encountered an issue with secondary groups: they were no longer loaded when users authenticated (using Kerberos), so users were considered members of their primary group only and were denied access to many folders.

We have been running for years with this configuration on CentOS, where users and groups are defined in an LDAP directory and made available to the OS by the nslcd service, which connects to LDAP. On Alma 9, this task is now handled by its more modern equivalent, SSSD. In the EOS configuration, we of course checked that the necessary EOS_SECONDARY_GROUPS environment variable was still set.

We finally found that the cause was a parameter in the sssd domain definition, enumerate = false, which disables listing all users or groups (using getent passwd or getent group, for instance) and appears to be the default, recommended setting for sssd.
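For reference, the change is just the enumerate flag in the domain section of /etc/sssd/sssd.conf; the domain name below is a placeholder for our real one:

```ini
# /etc/sssd/sssd.conf (excerpt; "example.org" stands in for our LDAP domain)
[domain/example.org]
id_provider = ldap
# enumerate defaults to false; getent passwd / getent group then return
# nothing from sssd, which is what broke the secondary-group resolution:
enumerate = true
```

After the change we restarted sssd and invalidated its cache (e.g. with sss_cache -E).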

So the first question is: is there a reason why EOS needs this enumerate option enabled? When running an id user command on the host, all the secondary groups are shown correctly, so we assumed it would be sufficient to make the same query for a user when they authenticate. Also, if EOS enumerates all users/groups, is there a limit on the number of groups it can retrieve from LDAP? What if we have very many groups?

After changing this setting, restarting SSSD and invalidating its cache, the groups were immediately loaded for users authenticating for the first time. For previously authenticated users, however, this was not the case until the MGM was restarted or enough time had passed: after around 1 hour, all the user information was updated at once and the issue disappeared, as if all the group information were indeed reloaded at the same time.

The second question is: is there a way to force invalidation of the cached user and group information in the MGM? It would have been helpful in this exceptional case, but we have also observed several times, even under CentOS 7, that adding a user to a group takes some time to take effect, and sometimes requires the user to log out completely before it is updated.

Maybe there are some recommended settings for working optimally in this situation that we could apply to our configuration? Thank you to anyone who can share their experience.

Additional note: in the past we had an issue with secondary groups missing in gateway mode, but we noticed it was improved in a previous update (maybe in version 5+). Thanks!

@esindril sorry to tag you, but maybe you can give us a hint on this issue with secondary groups.

As written above, we fixed the issue by changing the sssd config.

However, we are not sure this is really the right way to go. In fact, for short periods we still saw situations where some users were not correctly identified with their secondary groups and were denied access to their folders. These cases were temporary and limited to single users, but they caused their work to fail.

Could it be due to some limit on the number of LDAP results that are enumerated? Or, if an LDAP query occasionally fails, do we lose information? It happens with users of our Condor cluster, who have many active sessions in parallel; maybe this adds a factor that causes the discrepancy?

How can we investigate user-information retrieval on the MGM? Of course this happens in production, so we cannot enable debug-level logging, which would be far too verbose. But maybe there are some messages that could show when the MGM fails to get the secondary group list of a user?

And, again, is there a way to force the MGM to fetch information on a user (or on all users) again? It seems to be done regularly, but not very often (about every hour).

Hi Franck,

To answer your first question: there is nothing special that EOS does in this case; it's simply the way getgrent and sssd work together. As you correctly determined, the solution is to enable enumeration in the sssd configuration. There is also an issue about this on the Red Hat forum, where they recommend exactly the same solution you found.

Concerning the second question about caching of uid/gid values: that is also true. In EOS, the eos::common::Mapping functionality uses a cache with a 1-hour expiration for already resolved uid/gid/username pairs, to alleviate the pressure on sssd; we have seen in the past that sssd can slow down considerably or even crash when it has to handle many requests. You can reset this cache information by using:
eos space reset default --mapping.
The mapping is not actually connected to any space, but it’s global!

Let me know if this answers your questions.

Cheers,
Elvin

Thank you Elvin for your answer.

The command that resets the mappings will definitely be very helpful.

Thank you also for the explanation about the internal workings of the secondary-group mapping and the usage of the getgrent function; it will be very helpful for understanding and debugging if we observe the same issue again when a user does not get their secondary groups.

Regarding the use of sssd on Alma 9: do you consider it a good option, or would using nslcd, as we did on CentOS 7, be a viable alternative? Which one do you use at CERN?

Another thing: is it possible that in some cases the mapping of a user is not refreshed as long as they keep a session open? This is what we observed in the past under CentOS 7 when adding users to new groups. Presumably the mapping reset command would help in that case.

Hi Franck,

We use sssd everywhere at CERN. We have no experience with other similar services, so I can't recommend something else. So far, since migrating to Alma 9, we have seen no issues coming from this component.

It could be that this used to be the case in the past, but this part of the code, especially the mapping, has been reworked, and it should work as expected in 5.3.0+. Users keeping files open and the refresh mechanism do not interfere with each other.

Cheers,
Elvin

Dear Elvin,

After several days of analysis, it seems we have managed to identify a pattern for this issue. The test performed was simply to regularly query the MGM for the identity of selected active users, using eos whoami or the proc/whoami fusex file; both always give the same result for the same Kerberos authentication.

For the users with very high usage (those who trigger a lot of IdMap operations), the identity is wrong from time to time. When it goes wrong (i.e. the secondary groups are missing), it stays wrong for exactly 2 hours. Interestingly, the time when the identity changes (from wrong to correct and vice versa) is always the same within the hour, and corresponds to a multiple of 2 hours after the last MGM start. When we use the eos space reset default --mapping command, all the wrong identities are corrected, but it does not change the time of the next identity change, which is still indexed on the MGM start time. So something seems to invalidate all the caches internally at the same time.

Indeed, when temporarily enabling the debug level around that time, we see a lot of "not found in uid cache" messages for all users, so they all get queried again at the same time, and even several times for the very active users: the cache checks are done in parallel, and while the entry is missing the groups are queried repeatedly, until one of the threads stores the result in the cache. My thought is that under such a burst of group lookups the getgrent function might return a truncated or empty answer, so some users do not get their groups, and this result is then stored in the cache.

While looking into the code to understand the intended behaviour, I came across this commit: COMMON: Mapping: clear uid/gid caches every couple of hours (550481f1) · Commits · dss / eos · GitLab, which seems to correspond to what we observe. The commit was introduced in version 5.2.25, and we upgraded from 5.2.24 to 5.2.26, which could explain this new behaviour.

Very interesting, because this commit aims to fix an issue we have had in the past, where very active users did not get their secondary groups updated from LDAP while they were constantly active. So it is very useful, but maybe invalidating all the caches at the same time causes a race condition between concurrent calls to the getgrent function. This seems to never happen when we use the mapping reset command on the MGM; maybe it invalidates the cache differently, but we do not have enough statistical data to conclude that.

Unfortunately, we could not reproduce this behaviour on a test instance, since we could not reproduce the high throughput of the production one there.

Our idea for now is to downgrade the MGM back to the previous version, 5.2.24 (we have not yet tested 5.3.x, but it seems this commit is still there anyway), and to rely on the manual mapping reset when some users do not get their secondary groups updated (we did not know about it before). Unless you have another idea (something in our setup that might cause this issue with groups not being retrieved under a high number of requests)?

Hi Franck,

I'm also taking a look at the mapping code. It does look like secondary-group mapping, especially after resetting the cache, is heavy; I'll try to see if it can be fixed easily.

Best,
Abhishek

Thank you Abhishek !

If you need some more information, do not hesitate to contact me.

Hi Franck,

Currently the secondary-group resolution indeed iterates through getgrent, which lists all the groups and is not thread-safe, so I can see why this can keep hitting your LDAP heavily: the cache gets cleared and then we populate the secondary groups all over again.

I see getgrouplist as an alternative option: it seems to be thread-safe and much more efficient, returning only the groups that a user is part of. Does id -g for your users return all the groups you want?
I believe what we do with getgrent is the equivalent of getent group | grep, which is the current behaviour.

If the id approach works, I'd change addSecondaryGroups to use getgrouplist, which should be much faster. I'll also work on making this 2-hour interval configurable, so that we probably don't need such aggressive cleanups at all sites. What do you think of this approach?

Dear @abhishekl,

Thank you for looking into it, and sorry I could not answer earlier.

Yes, indeed, in our setup the id command gives all the necessary groups. id -g returns only the primary group, but I think that is its intended behaviour; id -G returns all the groups:

$ id | sed -e 's/(\([^)]*\))/(xxx)/g' #anonymising to post on the forum
uid=61928(xxx) gid=40507(xxx) groups=40507(xxx),504(xxx),507(xxx),22605(xxx),29431(xxx),36232(xxx),40600(xxx),41068(xxx),43500(xxx),43590(xxx),50003(xxx),54068(xxx),61895(xxx),63804(xxx),65422(xxx)
$ id -g
40507
$ id -G
40507 504 507 22605 29431 36232 40600 41068 43500 43590 50003 54068 61895 63804 65422

And indeed, this was also the case before we enabled the enumerate option of SSSD (our first issue in this thread), which might mean that with getgrouplist we would not need enumeration enabled at all.

One thing in what you wrote might not be exactly what happens: the getgrent calls do not appear to hit our LDAP heavily. Everything is buffered by the local SSSD cache; indeed, we do not observe any LDAP requests at the time the cache is refreshed, so everything happens locally. The fact that getgrent is not thread-safe might be enough on its own to explain not getting the correct content every time.

And yes, being able to configure the cache cleanup interval would be perfect. In fact, in our case, just running the manual command when we make changes that are not refreshed immediately is already very helpful; the problem is that we only discovered it recently. To be complete, if it could also clean just a selected list of user identities that would be awesome, but it does not seem necessary.

I have another thing to note: every time we ran eos space reset default --mapping after observing a bad identity, all the identities became correct again. So, a few days ago, we set up a regular call of that command every 2 hours, a few seconds after the time the internal cache is invalidated, and since then we have never observed a bad identity. It seems that something is done differently between the internal periodic cache cleaning and the manual one from the command line. Maybe this can help.
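For reference, the workaround is just a cron entry along these lines; the exact minutes here are illustrative and need to be aligned with your own MGM start time:

```
# /etc/cron.d/eos-mapping-reset (illustrative; schedule it to fire just
# after the MGM's internal 2-hour cache invalidation)
5 */2 * * * root eos space reset default --mapping
```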

Edit: if you plan to apply these changes in the 5.2.x branch, we could use them by upgrading soon. If they will only be in 5.3.x, we will not upgrade immediately, as we still need to test that version.

Hi Franck,

Thanks for the pointers. The main difference between the periodic clearing and the manual reset is in the username and groupname caches, and I see how this might become an issue: normally everything in the mapping refers to ids only, but the getgrent() family relies on names for matching. We don't clear the username caches every couple of hours like the other caches, but maybe we can add that as well. I'll write a fix for both. And yes, worst case we can make a special 5.2.x release with these changes.

Meanwhile, good to hear that you're not actively blocked, thanks to the cron job clearing the caches!

Thank you very much Abhishek for your efforts in fixing this.

Let us know if we can provide any other information, and in any case tell us when the fix is available so that we can test it.