MGM xrdhttp crashes on POST with wrong parameters

Dear experts,

I found that if I send a POST request with wrong parameters (or wrong posted content), the MGM service does not reject the request but crashes and restarts. For example, if I use this command to send a POST request to my EOS MGM server:

$ curl -k -d "aaa" https://junoeos01.ihep.ac.cn:9000/eos/juno/dirac/juno/user/z/zhangxt/?aaa

the MGM service crashes and restarts, with errors like:

......
210728 17:57:48 time=1627466268.094635 func=Recycler                 level=INFO  logid=static.............................. unit=mgm@junoeos01.ihep.ac.cn:1094 tid=00007f9508bff700 source=Recycle:98                     tident= sec=(null) uid=99 gid=99 name=- geo="" snooze-time=30
210728 17:57:48 13481 ?:359@lxslc708 sysXrdHttp: received dlen: 16
210728 17:57:48 13481 ?:359@lxslc708 sysXrdHttp: received dump: 22 03 01 00 -59 01 00 00 -63 03 03 -59 -119 -32 -02 00
210728 17:57:48 13481 ?:359@lxslc708 sysXrdHttp: This does not look like http at pos 0
210728 17:57:48 13481 ?:359@lxslc708 sysXrdHttp: This may look like https
210728 17:57:48 13481 ?:359@lxslc708 sysXrdHttp: Protocol matched. https: 1
210728 17:57:48 13481 ?:359@lxslc708 sysXrdHttp:  Process. lp:0x7f94e10192d8 reqstate: 0
210728 17:57:48 13481 ?:359@lxslc708 sysXrdHttp:  Setting host: [::ffff:202.122.33.192]
210728 17:57:48 13481 ?:359@lxslc708 sysXrdHttp:  Entering SSL_accept...
210728 17:57:48 13481 ?:359@lxslc708 sysXrdHttp:  SSL_accept returned :1
210728 17:57:48 13481 ?:359@lxslc708 sysXrdHttp: No certificate found in peer chain.
210728 17:57:48 13481 sysXrdHttp: getDataOneShot BuffAvailable: 1048576 maxread: 1048576
210728 17:57:48 13481 sysXrdHttp: getDataOneShot sslavail: 1048576
210728 17:57:48 13481 sysXrdHttp: read 201 of 1048576 bytes
210728 17:57:48 13481 sysXrdHttp:  rc:57 got hdr line: POST /eos/juno/dirac/juno/user/z/zhangxt/?aaa/ HTTP/1.1^M

210728 17:57:48 13481 sysXrdHttp:  Parsing first line: POST /eos/juno/dirac/juno/user/z/zhangxt/?aaa/ HTTP/1.1^M

210728 17:57:48 13481 sysXrdHttp:  rc:25 got hdr line: User-Agent: curl/7.29.0^M

210728 17:57:48 13481 sysXrdHttp:  rc:33 got hdr line: Host: junoeos01.ihep.ac.cn:9000^M

210728 17:57:48 13481 sysXrdHttp:  rc:13 got hdr line: Accept: */*^M

210728 17:57:48 13481 sysXrdHttp:  rc:19 got hdr line: Content-Length: 3^M

210728 17:57:48 13481 sysXrdHttp:  rc:49 got hdr line: Content-Type: application/x-www-form-urlencoded^M

210728 17:57:48 13481 sysXrdHttp:  rc:2 got hdr line: ^M

210728 17:57:48 13481 sysXrdHttp:  rc:2 detected header end.
210728 17:57:48 time=1627466268.466214 func=MatchesPath              level=INFO  logid=static.............................. unit=mgm@junoeos01.ihep.ac.cn:1094 tid=00007f95f861a700 source=EosMgmHttpHandler:324          tident= sec=(null) uid=99 gid=99 name=- geo="" verb=POST path=/eos/juno/dirac/juno/user/z/zhangxt/
210728 17:57:48 time=1627466268.466254 func=MatchesPath              level=INFO  logid=static.............................. unit=mgm@junoeos01.ihep.ac.cn:1094 tid=00007f95f861a700 source=EosMgmHttpHandler:324          tident= sec=(null) uid=99 gid=99 name=- geo="" verb=POST path=/eos/juno/dirac/juno/user/z/zhangxt/
210728 17:57:48 time=1627466268.466278 func=ProcessReq               level=INFO  logid=cad60f8c-ef86-11eb-a357-0c42a15d0b00 unit=mgm@junoeos01.ihep.ac.cn:1094 tid=00007f95f861a700 source=EosMgmHttpHandler:355          tident=<service> sec=      uid=0 gid=0 name= geo="" msg="delegate request to XrdMacaroons library"
error: received signal 11:
/lib64/libXrdEosMgm-4.so(_Z20xrdmgmofs_stacktracei+0x47)[0x7f95f4a2b907]
/lib64/libc.so.6(+0x36400)[0x7f95fbd4e400]
/usr/lib64/libEosMgmHttp-4.so(_ZN17EosMgmHttpHandler10ProcessReqER13XrdHttpExtReq+0x10c)[0x7f95b35f5a4c]
/opt/eos/xrootd/lib64/libXrdHttpUtils.so.1(_ZN10XrdHttpReq14ProcessHTTPReqEv+0xfc)[0x7f95f78f801c]
/opt/eos/xrootd/lib64/libXrdHttpUtils.so.1(_ZN15XrdHttpProtocol7ProcessEP7XrdLink+0x91d)[0x7f95f78f055d]
/opt/eos/xrootd/lib64/libXrdUtils.so.2(_ZN7XrdLink4DoItEv+0x19)[0x7f95fcf97bb9]
/opt/eos/xrootd/lib64/libXrdUtils.so.2(_ZN12XrdScheduler3RunEv+0x17f)[0x7f95fcf9af4f]
/opt/eos/xrootd/lib64/libXrdUtils.so.2(_Z15XrdStartWorkingPv+0x9)[0x7f95fcf9b099]
/opt/eos/xrootd/lib64/libXrdUtils.so.2(XrdSysThread_Xeq+0x37)[0x7f95fcf60aa7]
/lib64/libpthread.so.0(+0x7ea5)[0x7f95fcb14ea5]
#########################################################################
# stack trace exec=/opt/eos/xrootd/bin/xrootd pid=13452 what='thread apply all bt'
#########################################################################
Reading symbols from /opt/eos/xrootd/bin/xrootd...Reading symbols from /opt/eos/xrootd/bin/xrootd...(no debugging symbols found)...done.
(no debugging symbols found)...done.
Attaching to program: /opt/eos/xrootd/bin/xrootd, process 13452
Reading symbols from /usr/lib64/libjemalloc.so.1...Reading symbols from /usr/lib64/libjemalloc.so.1...(no debugging symbols found)...done.
(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libjemalloc.so.1
Reading symbols from /opt/eos/xrootd/lib64/libXrdServer.so.2...Reading symbols from /opt/eos/xrootd/lib64/libXrdServer.so.2...(no debugging symbols found)...done.
(no debugging symbols found)...done.
Loaded symbols for /opt/eos/xrootd/lib64/libXrdServer.so.2
Reading symbols from /opt/eos/xrootd/lib64/libXrdUtils.so.2...Reading symbols from /opt/eos/xrootd/lib64/libXrdUtils.so.2...(no debugging symbols found)...done.
(no debugging symbols found)...done.
Loaded symbols for /opt/eos/xrootd/lib64/libXrdUtils.so.2
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
[New LWP 25918]
[New LWP 13924]
[New LWP 13923]
[New LWP 13922]
[New LWP 13921]
......
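
To check from the client side whether a given POST was rejected cleanly or the connection was simply dropped, the curl call above can be wrapped in a small probe. This is only a sketch: the function names `classify_probe` and `probe_post` are made up for illustration, and treating curl exit codes 52 (empty reply) and 56 (receive failure) as a sign of a crash is an assumption, so the MGM log should still be checked to confirm.

```shell
#!/usr/bin/env bash
# Sketch only: classify the outcome of a probe POST.
# $1 = curl exit status, $2 = HTTP status code reported by curl ("000" if none).
classify_probe() {
    local curl_rc="$1" http_code="$2"
    if [ "$curl_rc" -eq 0 ] && [ "$http_code" != "000" ]; then
        echo "answered with HTTP $http_code"
    elif [ "$curl_rc" -eq 52 ] || [ "$curl_rc" -eq 56 ]; then
        # 52 = empty reply from server, 56 = failure receiving network data.
        # Assumption: a server that dies mid-request shows up as one of these.
        echo "connection dropped (possible crash)"
    else
        echo "inconclusive (curl exit $curl_rc)"
    fi
}

# Hypothetical wrapper: send the same kind of POST as in the report to URL $1.
probe_post() {
    local http_code curl_rc
    http_code=$(curl -k -s -o /dev/null -w '%{http_code}' -d "aaa" "$1")
    curl_rc=$?
    classify_probe "$curl_rc" "$http_code"
}
```

Usage would be `probe_post "https://junoeos01.ihep.ac.cn:9000/eos/juno/dirac/juno/user/z/zhangxt/?aaa"`, run only against a test instance, since on an affected server it restarts the MGM.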

My EOS server version is 4.8.40, and here is my /etc/xrd.cf.mgm:

$ cat /etc/xrd.cf.mgm | grep -Ev '#|^[[:space:]]*$'
xrootd.fslib libXrdEosMgm.so
xrootd.seclib libXrdSec.so
xrootd.async off nosf
xrootd.chksum adler32
xrd.sched mint 64 maxt 4096 idle 300
xrd.timeout idle 86400
all.export / nolock
all.role manager
oss.fdlimit 16384 32768
sec.protocol unix
sec.protocol sss -c /etc/eos.keytab -s /etc/eos.keytab
sec.protocol gsi -crl:0 -cert:/etc/grid-security/daemon/hostcert.pem -key:/etc/grid-security/daemon/hostkey.pem -gridmap:/etc/grid-security/dn-grid-mapfile -moninfo:1 -d:3 -gmapopt:2 -vomsat:1 -moninfo:1 -exppxy:=creds
sec.protbind localhost.localdomain unix sss
sec.protbind localhost unix sss
sec.protbind * only gsi sss unix
mgmofs.fs /
mgmofs.targetport 1095
mgmofs.broker root://localhost:1097//eos/
mgmofs.instance eosjuno
mgmofs.configdir /var/eos/config
mgmofs.metalog /var/eos/md
mgmofs.txdir /var/eos/tx
mgmofs.authdir /var/eos/auth
mgmofs.archivedir /var/eos/archive
mgmofs.reportstorepath /var/eos/report
mgmofs.autoloadconfig default
mgmofs.cfgtype quarkdb
mgmofs.fstgw someproxy.cern.ch:3001
mgmofs.nslib /usr/lib64/libEosNsQuarkdb.so
mgmofs.qdbcluster qdb1.ihep.ac.cn:6666 qdb2.ihep.ac.cn:6666 qdb3.ihep.ac.cn:6666 junoeos02.ihep.ac.cn:7777
mgmofs.qdbpassword_file /etc/eos.keytab
mgmofs.centraldrain true
if exec xrootd
   xrd.protocol XrdHttp:9000 /opt/eos/xrootd/lib64/libXrdHttp.so
   http.cadir /etc/grid-security/certificates/
   http.cert /etc/grid-security/daemon/hostcert.pem
   http.key /etc/grid-security/daemon/hostkey.pem
   http.gridmap /etc/grid-security/grid-mapfile
   http.secxtractor /opt/eos/xrootd/lib64/libXrdVoms.so
   http.trace all
   http.exthandler xrdtpc /opt/eos/xrootd/lib64/libXrdHttpTPC.so
   http.exthandler EosMgmHttp /usr/lib64/libEosMgmHttp.so eos::mgm::http::redirect-to-https=0
fi

My EOS instance uses neither SciTokens nor Macaroons; only X.509 and VOMS are used.
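
A quick way to double-check that the configuration really references no token library is a plain text search. The sketch below is only a heuristic: it assumes that any token library would show up in the file under a name containing "macaroon" or "scitoken", it does not know the actual directive names, and `mentions_token_lib` is a name made up for this example.

```shell
#!/usr/bin/env bash
# Heuristic sketch: report whether a config file mentions a token library.
# $1 = path to the config file (e.g. /etc/xrd.cf.mgm).
mentions_token_lib() {
    # Skip comment lines, then search case-insensitively for either name.
    # Assumption: a configured token library contains one of these strings.
    if grep -v '^[[:space:]]*#' "$1" | grep -Eiq 'macaroon|scitoken'; then
        echo "token library referenced"
    else
        echo "no token library referenced"
    fi
}
```

Running `mentions_token_lib /etc/xrd.cf.mgm` on the file shown above should report that no token library is referenced, since neither string appears in it.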

Can this problem be fixed by upgrading my EOS server to the latest version? Or is there another way to make my EOS server simply reject such wrong POST requests instead of crashing and restarting?

Regards,
Xuantong Zhang

Hi Zhang,

Thanks for the report. Indeed, the MGM crashes when there is no token library specified and you do a simple POST request. This is now fixed by the following commit:
https://gitlab.cern.ch/dss/eos/-/commit/5049568dff66a07a9189d6bedbf2ada4c9af13d4

If you enable either macaroons or scitokens, this crash will not happen.
The fix will be included in the 4.8.60 release.
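
To tell whether a given installation should already carry the fix, its version can be compared against 4.8.60. The following is a minimal sketch, assuming the fix ships in 4.8.60 as stated and that GNU `sort -V` is available; the helper names are made up for illustration, and the version string has to be supplied by hand.

```shell
#!/usr/bin/env bash
# Returns success (0) if version $1 is greater than or equal to version $2.
# Relies on GNU sort's version ordering (-V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: says whether EOS version $1 should contain the fix.
# "affected" applies only when no token library is configured.
has_post_crash_fix() {
    if version_ge "$1" "4.8.60"; then
        echo "fixed"
    else
        echo "affected"
    fi
}

has_post_crash_fix "4.8.40"   # the version from the report; prints "affected"
```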

Thanks,
Elvin

Hi Elvin,

Thanks for your help. As we haven’t yet managed to get scitokens working on our EOS, we can only use GSI at present. I will try to enable macaroons or scitokens next.

Cheers,
Xuantong Zhang