HTTP-TPC Redirection failed with Master/Slave QuarkDB Configuration

Dear Experts,

I found that HTTP-TPC fails in pull mode if the destination host is an MGM slave.
We have three MGM nodes with Master/Slave QuarkDB Configuration.

master: f-dpmp28
slave: f-dpmp35, f-dpmp31

The EOS version is 4.8.40.
XrdHttp is set to port 9000 in /etc/xrd.cf.mgm.

This is the script I run:

export SRC=$2
export DST=$3

gfal-copy $1 $SRC -f > /dev/null

# Get macaroon for source
export TSRC=$(curl --silent --cert ~/cernproxy.pem --key ~/cernproxy.pem --cacert ~/cernproxy.pem --capath /etc/grid-security/certificates -X POST -H 'Content-Type: application/macaroon-request' -d '{"caveats": ["activity:DOWNLOAD"], "validity": "PT3000M"}' "$SRC" | jq -r '.macaroon')
# Get macaroon for destination
export TDST=$(curl --silent --cert ~/cernproxy.pem --key ~/cernproxy.pem --cacert ~/cernproxy.pem --capath /etc/grid-security/certificates -X POST -H 'Content-Type: application/macaroon-request' -d '{"caveats": ["activity:UPLOAD,DELETE,LIST"], "validity": "PT3000M"}' "$DST" | jq -r '.macaroon')

# Trigger HTTP TPC PULL (COPY is sent to the destination, which pulls from Source)
curl --capath /etc/grid-security/certificates -L -X COPY -H 'Secure-Redirection: 1' -H 'X-No-Delegate: 1' -H 'Credentials: none' -H "Authorization: Bearer $TDST" -H "TransferHeaderAuthorization: Bearer $TSRC" -H "TransferHeaderTest: Test" -H "Source: $SRC" "$DST"

gfal-rm $SRC > /dev/null
gfal-rm $DST > /dev/null

The following is the curl output:

[...loading CA file...]
CApath: /etc/grid-security/certificates
* NSS: client certificate not found (nickname not specified)
* SSL connection using TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
* Server certificate:
*       subject: CN=f-dpmp31.grid.sinica.edu.tw,OU=GRID,O=AS,C=TW
*       start date: May 10 02:12:50 2021 GMT
*       expire date: May 10 02:12:50 2022 GMT
*       common name: f-dpmp31.grid.sinica.edu.tw
*       issuer: CN=Academia Sinica Grid Computing Certification Authority Mercury,O=AS,C=TW
> COPY //eos/testarea/test.4101 HTTP/1.1
> User-Agent: curl/7.29.0
> Host: f-dpmp31.grid.sinica.edu.tw:9000
> Accept: */*
> Secure-Redirection: 1
> X-No-Delegate: 1
> Credentials: none
> Authorization: Bearer MDAxNWxvY2F0aW9uIGVvc3Rlc3QKMDAzNGlkZW50aWZpZXIgNjQ1ZDlkNGEtNjVlMy00YWNlLTkyNzMtZDM4YjRiNzdlYzRmCjAwMTNjaWQgbmFtZTphdGxhcwowMDUyY2lkIGFjdGl2aXR5OlJFQURfTUVUQURBVEEsVVBMT0FELERPV05MT0FELERFTEVURSxNQU5BR0UsVVBEQVRFX01FVEFEQVRBLExJU1QKMDAyNGNpZCBhY3Rpdml0eTpVUExPQUQsREVMRVRFLExJU1QKMDAyNWNpZCBwYXRoOi9lb3MvdGVzdGFyZWEvdGVzdC40MTAxCjAwMjRjaWQgYmVmb3JlOjIwMjEtMDctMTdUMDc6NDQ6MDVaCjAwMmZzaWduYXR1cmUg6_1XgGMYJZYVMM9z8t3wDFpJL5CgUC6DrG6ZMx_1RJUK
> TransferHeaderAuthorization: Bearer dpm-macaroonMDA0Y2xvY2F0aW9uIC9kcG0vZ3JpZC5zaW5pY2EuZWR1LnR3L2hvbWUvYXRsYXMvYXRsYXNzY3JhdGNoZGlzay90ZXN0LjQwMDQ2CjAwMTZpZGVudGlmaWVyIGNvbmZpZwowMDU4Y2lkIGRuOi9EQz1jaC9EQz1jZXJuL09VPU9yZ2FuaWMgVW5pdHMvT1U9VXNlcnMvQ049Y2hpZW5kZS9DTj04MzcxNDYvQ049Q2hpZW4tRGUgTGkKMDAxM2NpZCBmcWFuOmF0bGFzCjAwMTZjaWQgZnFhbjphdGxhcy90dwowMDRjY2lkIHBhdGg6L2RwbS9ncmlkLnNpbmljYS5lZHUudHcvaG9tZS9hdGxhcy9hdGxhc3NjcmF0Y2hkaXNrL3Rlc3QuNDAwNDYKMDAxYWNpZCBhY3Rpdml0eTpET1dOTE9BRAowMDI0Y2lkIGJlZm9yZToyMDIxLTA3LTE3VDA3OjQ0OjA1WgowMDJmc2lnbmF0dXJlICIy-2sO_EnVX1I1j-YMN1eIsY2y-lSIwPHtzOfW8MD8Cg
> TransferHeaderTest: Test
> Source: https://f-dpm000.grid.sinica.edu.tw//dpm/grid.sinica.edu.tw/home/atlas/atlasscratchdisk/test.40046
> 
< HTTP/1.1 307 Unknown
< Connection: Keep-Alive
< Content-Length: 0
< Location: http://f-dpmp28.grid.sinica.edu.tw:1094///eos/testarea/test.4101
< 
* Connection #0 to host f-dpmp31.grid.sinica.edu.tw left intact
* Issue another request to this URL: 'http://f-dpmp28.grid.sinica.edu.tw:1094///eos/testarea/test.4101'
* About to connect() to f-dpmp28.grid.sinica.edu.tw port 1094 (#1)
*   Trying 2400:4500:0:2::1122...
* Permission denied
*   Trying 202.140.171.34...
* Connected to f-dpmp28.grid.sinica.edu.tw (202.140.171.34) port 1094 (#1)
> COPY ///eos/testarea/test.4101 HTTP/1.1
> User-Agent: curl/7.29.0
> Host: f-dpmp28.grid.sinica.edu.tw:1094
> Accept: */*
> Secure-Redirection: 1
> X-No-Delegate: 1
> Credentials: none
> TransferHeaderAuthorization: Bearer dpm-macaroonMDA0Y2xvY2F0aW9uIC9kcG0vZ3JpZC5zaW5pY2EuZWR1LnR3L2hvbWUvYXRsYXMvYXRsYXNzY3JhdGNoZGlzay90ZXN0LjQwMDQ2CjAwMTZpZGVudGlmaWVyIGNvbmZpZwowMDU4Y2lkIGRuOi9EQz1jaC9EQz1jZXJuL09VPU9yZ2FuaWMgVW5pdHMvT1U9VXNlcnMvQ049Y2hpZW5kZS9DTj04MzcxNDYvQ049Q2hpZW4tRGUgTGkKMDAxM2NpZCBmcWFuOmF0bGFzCjAwMTZjaWQgZnFhbjphdGxhcy90dwowMDRjY2lkIHBhdGg6L2RwbS9ncmlkLnNpbmljYS5lZHUudHcvaG9tZS9hdGxhcy9hdGxhc3NjcmF0Y2hkaXNrL3Rlc3QuNDAwNDYKMDAxYWNpZCBhY3Rpdml0eTpET1dOTE9BRAowMDI0Y2lkIGJlZm9yZToyMDIxLTA3LTE3VDA3OjQ0OjA1WgowMDJmc2lnbmF0dXJlICIy-2sO_EnVX1I1j-YMN1eIsY2y-lSIwPHtzOfW8MD8Cg
> TransferHeaderTest: Test
> Source: https://f-dpm000.grid.sinica.edu.tw//dpm/grid.sinica.edu.tw/home/atlas/atlasscratchdisk/test.40046
> 
* Recv failure: Connection reset by peer
* Closing connection 1
curl: (56) NSS: client certificate not found (nickname not specified)

I expected the workflow to redirect to the MGM master on port 9000, as in the following diagram:

But the curl output shows that the actual result is like this:

Is this a bug or a misconfiguration?

Thanks,
Chien-De

Hello Chien-De Li,
Do you have any updates about this issue?
Thank you in advance.
best
e.v.

I don’t have any update about this.
Did you encounter the same problem?
What’s your configuration?

Hello Chien-De Li,
I have a similar problem.
With an X509 certificate I can upload via either the master or the slave MGM.
With tokens, it works only via the master.
If I switch the master with eos ns master other (say I have 2 MGMs),
it works again (for the slave that became master).
I have a typical miniguide configuration, just with XrdHttp on 9000 and 9001, and it is always secure (HTTPS).
FYI
best
EV

P.S.

# I have initiated a valid X509 proxy with dteam VOMS attributes
NODE1=myslave.mydomain.fr
export MACAROON=$(macaroon-init https://${NODE1}:9000//eos/grif/dteam/dte 600 READ_METADATA,UPLOAD,DOWNLOAD,DELETE,MANAGE,UPDATE_METADATA,LIST)
URL=https://${NODE1}:9000//eos/grif/dteam/dte/curl-token-upload.${RANDOM}
curl -I -L -v --capath /etc/grid-security/certificates -H "Authorization: Bearer $MACAROON" "$URL" --upload-file /etc/hosts

curl fails with the following (from the master, after redirection):

PUT /eos/grif/dteam/dte/curl-token-upload.23384?encURI= HTTP/1.1
User-Agent: curl/7.29.0
Host: mymaster.mydomain.fr:9000
Accept: */*
X-No-Delegate: 1
Credentials: none
Content-Length: 149
Expect: 100-continue

< HTTP/1.1 403 token authorization failed
HTTP/1.1 403 token authorization failed
< Connection: Keep-Alive
Connection: Keep-Alive
< Content-Length: 26
Content-Length: 26

HTTP error before end of send, stop sending
<

Closing connection 1

Could you check whether your /etc/eos.macaroon.secret is the same on each MGM node?
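
One way to run that check without copying the secret between machines is to compare digests. This is only a sketch: `digests_match` is a hypothetical helper (not part of EOS), and the commented usage line assumes SSH access to the three MGM hosts named earlier in the thread and that `sha256sum` is installed on them.

```shell
#!/bin/bash
# Hypothetical helper: read `sha256sum`-style lines ("<digest>  <file>") on stdin
# and succeed only if every line carries the same digest.
digests_match() {
    local distinct
    # Keep only the digest column, collapse duplicates, count what is left.
    distinct=$(awk '{print $1}' | sort -u | wc -l)
    [ "$distinct" -eq 1 ]
}

# Example usage (assumes SSH access to the MGM nodes; not executed here):
#   for h in f-dpmp28 f-dpmp35 f-dpmp31; do
#       ssh -n "$h" sha256sum /etc/eos.macaroon.secret
#   done | digests_match && echo "secrets identical" || echo "secrets differ"
```

Empty input counts as a mismatch, so an unreachable node does not silently pass.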

Hi Li,

I know what the problem is in this case: the redirection from the slave to the master MGM happens on port 1094 even for the HTTP protocol, which is not right. So we can classify this as a bug. I will fix it and report back here when it's done. Thanks for the nice debugging.
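
For anyone who wants to check whether their own slave shows the same behaviour, here is a rough sketch that inspects the port in a redirect `Location`. The helper names `url_port` and `check_redirect` are made up for illustration, and the parsing is deliberately simple (it does not handle bracketed IPv6 literals).

```shell
#!/bin/bash
# Hypothetical helper: print the port of an http(s) URL, falling back to the
# scheme default when the URL carries no explicit port.
url_port() {
    local url=$1 scheme rest hostport
    scheme=${url%%://*}        # text before "://"
    rest=${url#*://}           # text after "://"
    hostport=${rest%%/*}       # host[:port], everything before the first "/"
    case $hostport in
        *:*) echo "${hostport##*:}" ;;
        *)   if [ "$scheme" = https ]; then echo 443; else echo 80; fi ;;
    esac
}

# Hypothetical helper: compare the redirect port against the expected XrdHttp port.
check_redirect() {
    local location=$1 expected=$2 port
    port=$(url_port "$location")
    if [ "$port" = "$expected" ]; then
        echo "ok: redirect stays on port $port"
    else
        echo "unexpected: redirect goes to port $port, expected $expected"
        return 1
    fi
}
```

With the `Location` value from the trace above, `check_redirect 'http://f-dpmp28.grid.sinica.edu.tw:1094///eos/testarea/test.4101' 9000` reports the mismatch. The redirect target can be captured without following it by dropping `-L` from the curl command and adding `-s -o /dev/null -w '%{redirect_url}\n'`.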

Cheers,
Elvin

Thanks Elvin.
This will be very helpful. :wink:

Hello Elvin, is the issue related to
the curl upload redirection with a token?

In my example I got:

name=- geo="" info="redirect" path="/eos/grif/dteam/dte/file.hosts.1527" host=mymaster port=1094 errno=ENOENT:*
210823 18:55:56 time=1629737756.605119 func=HttpRedirect level=INFO logid=static… unit=mgm@myslave.fr:1094 tid=00007f8b6c2fe700 source=HttpServer:313 tident= sec=(null) uid=99 gid=99 name=- geo="" info=redirecting

Hi Emmanouil,

Yes, definitely. In general, the current master-slave setup is not meant for the slave to be used as a replica of the master, but rather as a stand-by that can quickly replace the current master in case of issues. One should strive to have the DNS alias point to the master rather than the slave. Nevertheless, this use-case will be fixed.

Cheers,
Elvin

Hello Elvin,
Do we have any update about this bug?
Thank you in advance.
best
e.v.

Hi Emmanouil,

I am caught up with other things for the moment. I will let you know once this is available.

Thanks,
Elvin