FUSEX: Cases of incompletely written files after successful copy

Dear @apeters,

Following our discussion at the recent EOS Workshop about files that are seen as correctly written by the user but turn out to be corrupted (usually meaning incomplete, i.e. either zero size or smaller than the expected length), I want to share some logs with you so we can understand how to avoid this.

The fusex client writes something like this:

240321 07:46:46 t=1711003606.852129 f=HandleResponse   l=CRIT  tid=00007f08cc7f4700 s=xrdclproxy:914           write error '[ERROR] Operation expired'
                 ---- high rate error messages suppressed ----
240321 07:46:50 t=1711003610.854069 f=HandleResponseWithHosts l=ERROR tid=00007f08cb7f2700 s=xrdclproxy:559           state=failed async open returned errmsg=[ERROR] Operation expired
240321 07:46:50 t=1711003610.854302 f=TryRecovery      l=CRIT  ino:80000000a7ea552c s=data:1103                triggering write recovery state = 5
240321 07:46:50 t=1711003610.857345 f=recover_write    l=CRIT  ino:80000000a7ea552c s=data:1769                failed to open local stagefile 
240321 07:46:50 t=1711003610.857399 f=flush_nolock     l=ERROR ino:80000000a7ea552c s=data:450                 remote open failed - returning 121
240321 07:46:50 t=1711003610.857464 f=flush            l=WARN  tid=00007f08977ff700 s=eosfuse:5373             invalidating ino=0x80000000a7ea552c after flush error
240321 07:46:50 t=1711003610.857527 f=TryRecovery      l=CRIT  ino:80000000a7ea5520 s=data:1103                triggering write recovery state = 3
240321 07:46:50 t=1711003610.859791 f=recover_write    l=CRIT  ino:80000000a7ea5520 s=data:1769                failed to open local stagefile 
240321 07:46:50 t=1711003610.860427 f=ioflush          l=WARN  tid=00007f08de7f3700 s=data:3323                re-issuing OpenAsync request after timeout - ino:80000000a7ea552c err-code:206
#      -------------------
#      - recovery record -
#      -------------------
#        path := '/MSK_DETFOO_B01.jp2'
#        fid  := 2817152300
#        -[ 000 ] 240321 07:46:50 t=1711003610.854256 f=flush_nolock     l= ino:80000000a7ea552c s=data:444                 status='XrdCl::Proxy::FAILED' hint='will TryRecovery'
#        -[ 001 ] 240321 07:46:50 t=1711003610.854329 f=recover_write    l= ino:80000000a7ea552c s=data:1671                hint='recover write'
#        -[ 002 ] 240321 07:46:50 t=1711003610.854343 f=recover_write    l= ino:80000000a7ea552c s=data:1708                hint='recover from file cache'
#        -[ 003 ] 240321 07:46:50 t=1711003610.857367 f=recover_write    l= ino:80000000a7ea552c s=data:1774                hint='failed to open local stagefile'
#        -[ 004 ] 240321 07:46:50 t=1711003610.857393 f=flush_nolock     l= ino:80000000a7ea552c s=data:449                 errno='121' hint='failed TryRecovery'

240321 07:47:02 t=1711003622.193859 f=ioflush          l=WARN  tid=00007f08de7f3700 s=data:3334                OpenAsync failed - trying recovery - ino:80000000a7ea5520 err-code:0
240321 07:47:02 t=1711003622.193908 f=ioflush          l=WARN  tid=00007f08de7f3700 s=data:3360                giving up OpenAsync request - ino:80000000a7ea5520 err-code:0
240321 07:47:02 t=1711003622.194129 f=ioflush          l=CRIT  tid=00007f08de7f3700 s=data:3389                ino:80000000a7ea5520 msg=file writing failed file-recovery= journal-recovery=/scratch_hdd/fusex_cache/journal/srs_downloader_Sentinel2_46/00A/80000000A7EA5520.jc.recover
#      -------------------
#      - recovery record -
#      -------------------
#        path := '/T36NWM_20240319T075631_B07.jp2'
#        fid  := 2817152288
#        -[ 000 ] 240321 07:46:50 t=1711003610.857548 f=recover_write    l= ino:80000000a7ea5520 s=data:1671                hint='recover write'
#        -[ 001 ] 240321 07:46:50 t=1711003610.857559 f=recover_write    l= ino:80000000a7ea5520 s=data:1708                hint='recover from file cache'
#        -[ 002 ] 240321 07:46:50 t=1711003610.859815 f=recover_write    l= ino:80000000a7ea5520 s=data:1774                hint='failed to open local stagefile'

240321 07:49:26 t=1711003766.917928 f=HandleResponse   l=CRIT  tid=00007f08cc7f4700 s=xrdclproxy:914           write error '[ERROR] Operation expired'
                 ---- high rate error messages suppressed ----
240321 07:49:27 t=1711003767.106613 f=TryRecovery      l=CRIT  ino:80000000a7ea5612 s=data:1103                triggering write recovery state = 3
240321 07:49:27 t=1711003767.108746 f=recover_write    l=CRIT  ino:80000000a7ea5612 s=data:1769                failed to open local stagefile 
240321 07:49:27 t=1711003767.249698 f=ioflush          l=WARN  tid=00007f08de7f3700 s=data:3334                OpenAsync failed - trying recovery - ino:80000000a7ea5612 err-code:0
240321 07:49:27 t=1711003767.249724 f=ioflush          l=WARN  tid=00007f08de7f3700 s=data:3360                giving up OpenAsync request - ino:80000000a7ea5612 err-code:0
240321 07:49:27 t=1711003767.249866 f=ioflush          l=CRIT  tid=00007f08de7f3700 s=data:3389                ino:80000000a7ea5612 msg=file writing failed file-recovery= journal-recovery=/scratch_hdd/fusex_cache/journal/srs_downloader_Sentinel2_46/00A/80000000A7EA5612.jc.recover
#      -------------------
#      - recovery record -
#      -------------------
#        path := ''
#        fid  := 2817152530
#        -[ 000 ] 240321 07:49:27 t=1711003767.106633 f=recover_write    l= ino:80000000a7ea5612 s=data:1671                hint='recover write'
#        -[ 001 ] 240321 07:49:27 t=1711003767.106651 f=recover_write    l= ino:80000000a7ea5612 s=data:1708                hint='recover from file cache'
#        -[ 002 ] 240321 07:49:27 t=1711003767.108798 f=recover_write    l= ino:80000000a7ea5612 s=data:1774                hint='failed to open local stagefile'

From the above log, one file ends up incorrect: a7ea552c has size 0, whereas the correct size after re-download is 15783.

# eos file info fxid:a7ea552c
  File: '/eos/jeodpp/proc/recycle/uid:47000/2024/03/21/0/....#:#MSK_DETFOO_B01.jp2.00000000a7ea552c'  Flags: 0644  Clock: 17bebae36ade04d5
  Size: 0
Status: fuse::missingcommmits
Modify: Thu Mar 21 07:45:50 2024 Timestamp: 1711003550.107999647
Change: Thu Mar 21 09:39:54 2024 Timestamp: 1711010394.360592784
Access: Thu Mar 21 07:45:50 2024 Timestamp: 1711003550.107999647
 Birth: Thu Mar 21 07:45:50 2024 Timestamp: 1711003550.107999647
# path="/eos/jeodpp/proc/recycle/uid:47000/2024/03/21/0/...#:#MSK_DETFOO_B01.jp2.00000000a7ea552c" fxid="a7ea552c" size="0" nrep="2" checksumtype="adler" checksum="0000000100000000000000000000000000000000000000000000000000000000"
nrep="00" fsid="2963" host="fst-018.cidsn.jrc.it:1095" fstpath="/data41/00044c73/a7ea552c" size="0" statsize="0" checksum="0000000100000000000000000000000000000000000000000000000000000000" diskchecksum="0000000000000000000000000000000000000000000000000000000000000000" error_label="none"
nrep="01" fsid="1043" host="s-jrciprjeos234p.cidsn.jrc.it:1095" fstpath="/data41/00044c73/a7ea552c" size="0" statsize="0" checksum="0000000100000000000000000000000000000000000000000000000000000000" diskchecksum="0000000000000000000000000000000000000000000000000000000000000000" error_label="none"

# eos attr ls fxid:a7ea552c
sys.eos.btime="1711003550.107999647"
sys.fusex.state="Cs|c|"

This happens on one specific host that continuously downloads many products. They are shipped as zip files, which are copied locally, extracted locally, and then copied to EOS with rsync, whose exit code is checked before the local data is removed. However, from time to time something seems to go wrong in the client, or with the FST, and the file is not written correctly. The process uses a user eosxd mount. The failure rate is quite low, around 3 files out of 10000, which the subsequent sanity check flags and which then have to be re-downloaded. But it would be good to be sure that a file is actually written correctly when the OS reports no error, without having to check afterwards.
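For context, the copy and sanity-check step is conceptually nothing more than an rsync into the eosxd mount followed by a size comparison against the local extraction directory. A minimal sketch of what we mean (paths and directory names below are placeholders, not our actual production script):

# Sketch of the copy + sanity-check step; SRC and DST are hypothetical paths.
SRC=/scratch/extracted_product        # local extraction directory (placeholder)
DST=/eos/some/target/dir              # target directory on the eosxd mount (placeholder)

rsync -a "$SRC/" "$DST/" || exit 1    # exit code is already checked today

# afterwards, compare sizes to catch files that ended up empty or truncated
find "$SRC" -type f | while read -r f; do
  rel=${f#"$SRC/"}
  src_size=$(stat -c %s "$f")
  dst_size=$(stat -c %s "$DST/$rel" 2>/dev/null || echo missing)
  [ "$src_size" = "$dst_size" ] || echo "MISMATCH $rel local=$src_size eos=$dst_size"
done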

The version in use is 5.1.30. It seems the same was already happening with the previous version, 5.1.21, but the error message was slightly different: 'unable to read file for recovery from local cache file' instead of 'failed to open local stagefile'.

We are planning to upgrade this host to a newer 5.2.x version; we'll see whether this still happens.