FST stuck in booting "waiting to know manager"

Hello,

When I start an FST, it spends quite a while doing this:

241018 22:55:03 time=1729292103.530083 func=WaitManager              level=INFO  logid=static.............................. unit=fst@eos-fst-2.eos-fst.eos.svc.kermes-dev.local:1095 tid=00007f9bdb7fe700 source=Config:64                      tident= sec=(null) uid=99 gid=99 name=- geo="" msg="wait for manager info ..."
241018 22:55:04 time=1729292104.530242 func=WaitManager              level=INFO  logid=static.............................. unit=fst@eos-fst-2.eos-fst.eos.svc.kermes-dev.local:1095 tid=00007f9bdb7fe700 source=Config:64                      tident= sec=(null) uid=99 gid=99 name=- geo="" msg="wait for manager info ..."
241018 22:55:05 time=1729292105.515482 func=Boot                     level=INFO  logid=static.............................. unit=fst@eos-fst-2.eos-fst.eos.svc.kermes-dev.local:1095 tid=00007f9bd8ff3700 source=Storage:493                    tident= sec=(null) uid=99 gid=99 name=- geo="" msg="waiting to know manager" fsid=3
241018 22:55:05 time=1729292105.530374 func=WaitManager              level=INFO  logid=static.............................. unit=fst@eos-fst-2.eos-fst.eos.svc.kermes-dev.local:1095 tid=00007f9bdb7fe700 source=Config:64                      tident= sec=(null) uid=99 gid=99 name=- geo="" msg="wait for manager info ..."
241018 22:55:06 time=1729292106.530528 func=WaitManager              level=INFO  logid=static.............................. unit=fst@eos-fst-2.eos-fst.eos.svc.kermes-dev.local:1095 tid=00007f9bdb7fe700 source=Config:64                      tident= sec=(null) uid=99 gid=99 name=- geo="" msg="wait for manager info ..."
241018 22:55:07 time=1729292107.530746 func=WaitManager              level=INFO  logid=static.............................. unit=fst@eos-fst-2.eos-fst.eos.svc.kermes-dev.local:1095 tid=00007f9bdb7fe700 source=Config:64                      tident= sec=(null) uid=99 gid=99 name=- geo="" msg="wait for manager info ..."

Usually it eventually seems to time out, exits with code 139, restarts, and then somehow succeeds.
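A side note on exit code 139: by the usual shell and container-runtime convention, a status above 128 means the process was terminated by signal number (status - 128). Here that gives 139 - 128 = 11, which is SIGSEGV on Linux, so the exit may be a crash rather than a clean timeout. The helper below is only an illustrative sketch; `describe_exit_code` is a made-up name, not part of EOS.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Translate a shell-style exit status into a signal name when it encodes one."""
    # Shells and container runtimes report 128 + N when a process dies from signal N.
    if code > 128:
        try:
            return f"killed by {signal.Signals(code - 128).name}"
        except ValueError:
            # No signal with that number exists on this platform.
            return f"exit status {code} (no signal numbered {code - 128})"
    return f"exit status {code}"


print(describe_exit_code(139))
```

If that reading is right, the FST log or a core dump from just before the restart would be the place to look.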

Is it trying to register with the MGM? It has the correct config:

# echo $EOS_MGM_URL
root://eos-mgm-0.eos-mgm.eos.svc.kermes-dev.local

so I’m not sure what the issue is. Does the FST initiate a connection to port 1097 of $EOS_MGM_URL?

Thanks!

Hi Ryan,

The FST needs to register with the MGM when it starts. It tries to connect to the MGM on port 1094 and to the MQ server on port 1097. The MGM in turn needs to reach each FST on port 1095. Maybe you have firewall rules preventing them from communicating. I think those three ports should be sufficient for FST and MGM communication.
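The three connections described above can be checked with a plain TCP connect from inside each pod. This is a minimal sketch, not an EOS tool: the hostnames are copied from the logs in this thread, `port_open` and `run_checks` are hypothetical helper names, and it assumes the MQ server runs on the same host as the MGM, which may not hold in every deployment.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Hostnames taken from the logs in this thread.
MGM_HOST = "eos-mgm-0.eos-mgm.eos.svc.kermes-dev.local"
FST_HOST = "eos-fst-2.eos-fst.eos.svc.kermes-dev.local"
# Assumption: the MQ server is co-located with the MGM.
MQ_HOST = MGM_HOST

CHECKS = {
    "fst": [(MGM_HOST, 1094), (MQ_HOST, 1097)],  # run these from the FST pod
    "mgm": [(FST_HOST, 1095)],                   # run this from the MGM pod
}


def run_checks(where: str) -> None:
    """Print the reachability of every port that the given side must reach."""
    for host, port in CHECKS[where]:
        state = "open" if port_open(host, port) else "UNREACHABLE"
        print(f"{host}:{port} {state}")
```

Call `run_checks("fst")` on the FST and `run_checks("mgm")` on the MGM. A successful connect only shows that something is listening on the port; it does not prove that the registration itself works.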

Thanks Franck! Those ports are open and reachable so I’m not sure what the issue is…

“wait for manager info” sounds like the FST doesn’t know what to do, or isn’t receiving an expected connection.

When I tested again the manager was instantly known, whatever that means:

eos-fst 241022 05:17:54 time=1729574274.404094 func=MgmSyncer                level=INFO  logid=FstOfsStorage unit=fst@eos-fst-2.eos-fst.eos.svc.kermes-dev.local:1095 tid=00007fa5dbfff700 source=MgmSyncer:60                   tident=<service> sec=      uid=0 gid=0 name= geo="" msg="waiting to know manager"
eos-fst 241022 05:17:54 time=1729574274.404148 func=MgmSyncer                level=INFO  logid=FstOfsStorage unit=fst@eos-fst-2.eos-fst.eos.svc.kermes-dev.local:1095 tid=00007fa5dbfff700 source=MgmSyncer:52                   tident=<service> sec=      uid=0 gid=0 name= geo="" msg="manager known" manager="eos-mgm-0.eos-mgm.eos.svc.kermes-dev.local:1094"

But nothing should have changed, the configuration remains the same…
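To compare a slow start with a fast one, the `time=` fields in the log can be used to measure how long the FST waited between "waiting to know manager" and "manager known". The following is a rough sketch that assumes the key=value layout seen in the lines quoted in this thread; `parse_fields` and `manager_wait_seconds` are hypothetical names, and the sample lines are the two MgmSyncer lines from the second test with their padding collapsed.

```python
import re

# key=value pairs; a value is either a double-quoted string or a run of non-space characters.
FIELD = re.compile(r'(\w+)=("[^"]*"|\S*)')


def parse_fields(line: str) -> dict:
    """Split one FST log line into a dict of its key=value fields."""
    return {key: value.strip('"') for key, value in FIELD.findall(line)}


def manager_wait_seconds(lines):
    """Seconds from the first 'waiting to know manager' to the next 'manager known'.

    Returns None if either message is missing from the given lines.
    """
    start = None
    for line in lines:
        fields = parse_fields(line)
        if "time" not in fields:
            continue
        stamp = float(fields["time"])
        if start is None and fields.get("msg") == "waiting to know manager":
            start = stamp
        elif start is not None and fields.get("msg") == "manager known":
            return stamp - start
    return None


SAMPLE = [
    'eos-fst 241022 05:17:54 time=1729574274.404094 func=MgmSyncer level=INFO '
    'logid=FstOfsStorage unit=fst@eos-fst-2.eos-fst.eos.svc.kermes-dev.local:1095 '
    'tid=00007fa5dbfff700 source=MgmSyncer:60 tident=<service> sec= uid=0 gid=0 '
    'name= geo="" msg="waiting to know manager"',
    'eos-fst 241022 05:17:54 time=1729574274.404148 func=MgmSyncer level=INFO '
    'logid=FstOfsStorage unit=fst@eos-fst-2.eos-fst.eos.svc.kermes-dev.local:1095 '
    'tid=00007fa5dbfff700 source=MgmSyncer:52 tident=<service> sec= uid=0 gid=0 '
    'name= geo="" msg="manager known" manager="eos-mgm-0.eos-mgm.eos.svc.kermes-dev.local:1094"',
]

print(manager_wait_seconds(SAMPLE))
```

For the two sample lines the gap is about 54 microseconds, which matches the manager being known instantly. Running the same function over the log of a slow start would show how long the wait lasted before the restart.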