Hi Experts,
We are seeing the error “Server responded with an error: [3009] Unable to get free physical space /eos/alicekolkata/grid/07/00488/aa6fb000-db59-11ea-ab1b-0242ec98ab37; No space left on device (destination)” in MonALISA Web. We investigated and found that one of our 7 FSTs, eos05, keeps going into “ro” mode.
Here is a snapshot:
==================
[root@eos-mgm ~]# eos -b fs ls| grep ro
│host │port│ id│ path│ schedgroup│ geotag│ boot│ configstatus│ drain│ active│ health│
eos05.tier2-kol.res.in 1095 2 /xdata0 default.0 Kolkata::EOS2 booted ro nodrain online N/A
eos05.tier2-kol.res.in 1095 10 /xdata1 default.1 Kolkata::EOS2 booted ro nodrain online N/A
eos05.tier2-kol.res.in 1095 17 /xdata10 default.2 Kolkata::EOS2 booted ro nodrain online N/A
eos05.tier2-kol.res.in 1095 24 /xdata11 default.3 Kolkata::EOS2 booted ro nodrain online N/A
eos05.tier2-kol.res.in 1095 31 /xdata12 default.4 Kolkata::EOS2 booted ro nodrain online N/A
eos05.tier2-kol.res.in 1095 38 /xdata13 default.5 Kolkata::EOS2 booted ro nodrain online N/A
eos05.tier2-kol.res.in 1095 45 /xdata14 default.6 Kolkata::EOS2 booted ro nodrain online N/A
eos05.tier2-kol.res.in 1095 52 /xdata15 default.7 Kolkata::EOS2 booted ro nodrain online N/A
eos05.tier2-kol.res.in 1095 59 /xdata2 default.8 Kolkata::EOS2 booted ro nodrain online N/A
eos05.tier2-kol.res.in 1095 66 /xdata3 default.9 Kolkata::EOS2 booted ro nodrain online N/A
eos05.tier2-kol.res.in 1095 73 /xdata4 default.10 Kolkata::EOS2 booted ro nodrain online N/A
eos05.tier2-kol.res.in 1095 80 /xdata5 default.11 Kolkata::EOS2 booted ro nodrain online N/A
eos05.tier2-kol.res.in 1095 87 /xdata6 default.12 Kolkata::EOS2 booted ro nodrain online N/A
eos05.tier2-kol.res.in 1095 94 /xdata7 default.13 Kolkata::EOS2 booted ro nodrain online N/A
eos05.tier2-kol.res.in 1095 101 /xdata8 default.14 Kolkata::EOS2 booted ro nodrain online N/A
eos05.tier2-kol.res.in 1095 108 /xdata9 default.15 Kolkata::EOS2 booted ro nodrain online N/A
[root@eos-mgm ~]#
===============
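To avoid grepping by hand each time, a rough Python sketch like the one below could summarise which hosts have filesystems in “ro”. It assumes the column order shown in the snapshot above (host, port, id, path, schedgroup, geotag, boot, configstatus, ...). The function name `ro_filesystems` and the eos04 “rw” row in the sample are made up for illustration and are not part of EOS or of our real output.

```python
from collections import defaultdict

# Two rows copied from the snapshot above, plus one hypothetical "rw" row
# (the eos04 line is invented here purely for contrast).
FS_LS_SAMPLE = """\
eos05.tier2-kol.res.in 1095 2 /xdata0 default.0 Kolkata::EOS2 booted ro nodrain online N/A
eos05.tier2-kol.res.in 1095 10 /xdata1 default.1 Kolkata::EOS2 booted ro nodrain online N/A
eos04.tier2-kol.res.in 1095 1 /xdata0 default.0 Kolkata::EOS2 booted rw nodrain online N/A
"""


def ro_filesystems(fs_ls_text):
    """Map each host to the mount paths whose configstatus column is 'ro'."""
    result = defaultdict(list)
    for line in fs_ls_text.splitlines():
        cols = line.split()
        # Assumed layout: host port id path schedgroup geotag boot configstatus ...
        # Header, border and prompt lines have no numeric port in column 2.
        if len(cols) < 8 or not cols[1].isdigit():
            continue
        if cols[7] == "ro":
            result[cols[0]].append(cols[3])
    return dict(result)


if __name__ == "__main__":
    for host, paths in ro_filesystems(FS_LS_SAMPLE).items():
        print(f"{host}: {len(paths)} filesystem(s) in ro: {', '.join(paths)}")
```

The text passed in would be the captured output of “eos -b fs ls”; if the column layout differs in another EOS version, the indices would need adjusting.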
We tried to change the configstatus from ro to rw manually from the manager:
[root@eos-mgm ~]# eos -b node config eos05.tier2-kol.res.in:1095 configstatus=rw
But after a few hours it automatically reverts to ro mode. Because of this, transfers fail with an error like “xrdcp exited with exit code 54: [ERROR] Server responded with an error: [3009] Unable to get free physical space /eos/alicekolkata/grid/07/00488/aa6fb000-db59-11ea-ab1b-0242ec98ab37; No space left on device (destination)”.
We compared the config file with those of the other FSTs and found no differences. We also tried rebooting the FST and then restarting the EOS services on the FST and the MGM. This brings the FST back into “rw” mode, but after a few hours it goes into “ro” mode again.
In the output of “eos -b node ls --io”, we found that the values of “bw” and “iops” for eos05 are 0, unlike those of the other FSTs:
[root@eos-slave ~]# eos -b node ls --io
┌────────────────────────────────┬────────────────┬──────────┬────────────┬────────────┬──────────┬──────────┬──────────┬──────┬──────┬────────────┬────────────┬────────────┬───────────┬──────────┬──────────┬──────────┬──────┬─────────┐
│hostport │ geotag│ diskload│ diskr-MB/s│ diskw-MB/s│ eth-MiB/s│ ethi-MiB│ etho-MiB│ ropen│ wopen│ used-bytes│ max-bytes│ used-files│ max-files│ bal-shd│ drain-shd│ gw-queue│ iops│ bw│
└────────────────────────────────┴────────────────┴──────────┴────────────┴────────────┴──────────┴──────────┴──────────┴──────┴──────┴────────────┴────────────┴────────────┴───────────┴──────────┴──────────┴──────────┴──────┴─────────┘
eos04.tier2-kol.res.in:1095 Kolkata::EOS2 0.00 0 0 1192 283.036 317.334 318 0 29.40 TB 156.71 TB 1.45 M 15.31 G 0 0 0 1180 3765 MB
eos05.tier2-kol.res.in:1095 Kolkata::EOS2 0.00 0 0 1192 309.188 311.925 317 0 29.72 TB 156.71 TB 1.45 M 15.31 G 0 0 0 0 0 MB
eos06.tier2-kol.res.in:1095 Kolkata::EOS2 0.00 0 0 1192 221.185 303.258 299 0 28.28 TB 156.71 TB 1.44 M 15.31 G 0 0 0 1175 3790 MB
eos07.tier2-kol.res.in:1095 Kolkata::EOS2 0.00 2 0 1192 259.264 327.801 317 0 29.69 TB 156.71 TB 1.45 M 15.31 G 0 0 0 1183 3805 MB
eos08.tier2-kol.res.in:1095 Kolkata::EOS2 0.00 0 0 1192 345.311 305.18 314 0 29.58 TB 156.71 TB 1.45 M 15.31 G 0 0 0 1190 3798 MB
eos09.tier2-kol.res.in:1095 Kolkata::EOS2 0.00 0 0 1192 309.044 317.452 319 0 29.47 TB 156.71 TB 1.45 M 15.31 G 0 0 0 1198 3747 MB
eos10.tier2-kol.res.in:1095 Kolkata::EOS2 0.00 0 0 1192 323.402 308.584 313 0 29.21 TB 156.71 TB 1.45 M 15.31 G 0 0 0 1194 3787 MB
[root@eos-slave ~]#
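The same check can be done programmatically. The sketch below is only an illustration: it assumes that “iops” and “bw” are the last two columns of each data row, as in the output above, with the bw unit as the final token. The function name `nodes_with_zero_io` is hypothetical, and the sample rows are copied from our output.

```python
import re

# Two data rows copied from the "eos -b node ls --io" output above.
NODE_LS_SAMPLE = """\
eos04.tier2-kol.res.in:1095 Kolkata::EOS2 0.00 0 0 1192 283.036 317.334 318 0 29.40 TB 156.71 TB 1.45 M 15.31 G 0 0 0 1180 3765 MB
eos05.tier2-kol.res.in:1095 Kolkata::EOS2 0.00 0 0 1192 309.188 311.925 317 0 29.72 TB 156.71 TB 1.45 M 15.31 G 0 0 0 0 0 MB
"""

# Data rows start with "host:port"; headers, borders and prompts do not.
HOSTPORT = re.compile(r"[\w.-]+:\d+")


def nodes_with_zero_io(node_ls_text):
    """Return the host:port of every node whose iops and bw are both 0."""
    idle = []
    for line in node_ls_text.splitlines():
        cols = line.split()
        if len(cols) < 4 or not HOSTPORT.fullmatch(cols[0]):
            continue
        try:
            # Assumed layout: ... iops, bw value, bw unit (e.g. "MB").
            iops = float(cols[-3])
            bw = float(cols[-2])
        except ValueError:
            continue  # row does not end in the expected numeric columns
        if iops == 0 and bw == 0:
            idle.append(cols[0])
    return idle


if __name__ == "__main__":
    print(nodes_with_zero_io(NODE_LS_SAMPLE))
```

Counting columns from the end of the row avoids having to handle the size columns, which are split into a number and a unit (“29.72 TB”).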
The attributes and layout of the EOS instance are below:
[root@eos-slave ~]# eos -b attr ls /eos/alicekolkata/grid
sys.forced.blockchecksum="crc32c"
sys.forced.blocksize="1M"
sys.forced.checksum="adler"
sys.forced.layout="raid6"
sys.forced.nstripes="7"
sys.forced.space="default"
sys.forced.stripes="7"
sys.lru.expire.empty="12h"
[root@eos-slave ~]#
(Our EOS version is 4.7.7 (2019); EOS instance: Kolkata::EOS2)
Kindly help us to solve this problem.
Regards
Prasun, Kolkata, India