FST Service Silently Fails

EOS gurus,

The FST service on the ornl::temp SE continues to die at random, with no apparent correlation to increased (or constrained) system resources.

The graphs below show resource usage around the time of this morning’s FST crash (~09:30).

Running gdb on the FST process has only shown “Program terminated with signal SIGKILL, Killed”, with no further trace information. We’ll keep it running to see whether another failure provides more.

There are no OOM events, and no relevant events are logged to xrdlog.fst.
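
For anyone wanting to repeat the OOM check, here is a minimal sketch of one way to scan kernel log text for OOM kills. It assumes the log has been exported as plain text (for example from `dmesg` or `journalctl -k`) and that the kernel uses the usual "Out of memory: Killed process ..." wording. The function name `find_oom_kills` is hypothetical and not part of EOS, and the message format can differ between kernel versions.

```python
import re

# Matches system-wide and cgroup OOM kill lines from the Linux kernel log.
# Older kernels print "Kill process", newer ones "Killed process".
# ASSUMPTION: the wording may vary on other kernel versions.
OOM_RE = re.compile(
    r"(?:Out of memory|Memory cgroup out of memory): "
    r"Kill(?:ed)? process (\d+) \(([^)]+)\)"
)


def find_oom_kills(lines):
    """Return a list of (pid, process_name) for each OOM kill line found."""
    kills = []
    for line in lines:
        match = OOM_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills
```

Passing an open file of saved kernel log output to `find_oom_kills` returns an empty list when no OOM kill was recorded. Cgroup-limited kills are matched too, since they would not necessarily show up as system-wide memory pressure in the graphs.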

This FST is intended to replace another and go into production, serving ~1PB once all fsids are brought online.

Any suggestions to debug further are appreciated.

Cheers,
Pete