Dear Elvin,
Sorry for the late reply.
Following your suggestion to use a RAIN (raid6) layout of 7 stripes with 8 FSTs, we added another FST (eos11.tier2-kol.res.in), which has the same number of filesystems and the same filesystem size as the existing ones. We left sys.forced.nstripes="7" in the EOS attributes untouched. We also changed groupsize from 7 to 8; groupmod is unchanged at 24. The EOS packages were updated to EOS 4.8.46 (2020) on all EOS systems.
Since then, the earlier errors (Missing and Unable to Restore/Repair Replica in RAIN6: "Unable to schedule stripes for reconstruction", "could not place new replica", "replica inconsistency repair failed" and "No space left on device") no longer appear.
After making the above changes, we restarted EOS. However, we noticed that balancing has now started on all 16 groups, whereas earlier it was running on only 4 groups. In converter.log and xrdlog.mgm there are multiple errors such as "[ERROR] Server responded with an error: [3010] Unable to open file", "Operation not permitted", "ERROR ConversionJob", etc.
When we investigated xrdlog.mgm and Converter.log thoroughly, we found that the TPC copy of files from /eos/alicekolkata/grid/* to /eos/alicekolkata/proc/conversion/* is reported as successful (prepare_msg=[SUCCESS]). But during the conversion, opening those files under /eos/alicekolkata/proc/conversion/ fails with "Unable to open file" and "Operation not permitted". An example of such a file is given below:
===================
[root@eos-mgm ~]# zcat /var/log/eos/mgm/xrdlog.mgm-20210528.gz | grep cb7c3183b75f
210527 17:09:08 time=1622115548.469267 func=DoIt level=INFO logid=static… unit=mgm@eos-mgm.tier2-kol.res.in:1094 tid=00007f7817fed700 source=ConversionJob:235 tident= sec=(null) uid=99 gid=99 name=- geo="" [tpc]: root@eoskolkata.tier2-kol.res.in:1094@root://eoskolkata.tier2-kol.res.in:1094//eos/alicekolkata/grid/08/29291/77bc70a0-1d1f-11eb-aabb-cb7c3183b75f => root@eoskolkata.tier2-kol.res.in:1094@root://eoskolkata.tier2-kol.res.in:1094//eos/alicekolkata/proc/conversion/00000000024f69c4:default.5#20640642 prepare_msg=[SUCCESS]
210527 17:09:08 time=1622115548.470326 func=open level=INFO logid=26044cf2-bee0-11eb-980a-e4434b664554 unit=mgm@eos-mgm.tier2-kol.res.in:1094 tid=00007f78b7df6700 source=XrdMgmOfsFile:499 tident=root.21413:436@eos-mgm sec=sss uid=0 gid=0 name=daemon geo="" op=read path=/eos/alicekolkata/grid/08/29291/77bc70a0-1d1f-11eb-aabb-cb7c3183b75f info=eos.app=eos/converter&eos.rgid=0&eos.ruid=0&tpc.stage=placement
210527 17:09:08 time=1622115548.472068 func=open level=INFO logid=26044cf2-bee0-11eb-980a-e4434b664554 unit=mgm@eos-mgm.tier2-kol.res.in:1094 tid=00007f78b7df6700 source=XrdMgmOfsFile:2938 tident=root.21413:436@eos-mgm sec=sss uid=0 gid=0 name=daemon geo="" op=read path=/eos/alicekolkata/grid/08/29291/77bc70a0-1d1f-11eb-aabb-cb7c3183b75f info=eos.app=eos/converter&eos.rgid=0&eos.ruid=0&tpc.stage=placement target[0]=(eos04.tier2-kol.res.in,57) target[1]=(eos09.tier2-kol.res.in,63) target[2]=(eos05.tier2-kol.res.in,59) target[3]=(eos10.tier2-kol.res.in,62) target[4]=(eos06.tier2-kol.res.in,61) target[5]=(eos07.tier2-kol.res.in,60) target[6]=(eos08.tier2-kol.res.in,58) redirection=eos06.tier2-kol.res.in?&cap.sym=<…>&cap.msg=<…>&mgm.logid=26044cf2-bee0-11eb-980a-e4434b664554&mgm.replicaindex=4&mgm.replicahead=4&mgm.id=024f69c4&mgm.mtime=1604330663 xrd_port=1095 http_port=8001
210527 17:09:08 time=1622115548.606344 func=open level=INFO logid=26191100-bee0-11eb-980a-e4434b664554 unit=mgm@eos-mgm.tier2-kol.res.in:1094 tid=00007f78b7df6700 source=XrdMgmOfsFile:497 tident=root.21413:436@eos-mgm sec=sss uid=99 gid=99 name=daemon geo="" op=write trunc=512 path=/eos/alicekolkata/proc/conversion/00000000024f69c4:default.5#20640642 info=eos.app=eos/converter&eos.checksum=d02d0ca2&eos.excludefsid=57,63,59,62,61,60,58&eos.group=5&eos.layout.blockchecksum=crc32c&eos.layout.blocksize=1M&eos.layout.checksum=adler&eos.layout.nstripes=7&eos.layout.type=raid6&eos.rgid=2&eos.ruid=2&eos.space=default&eos.targetsize=604041&oss.asize=604041&tpc.dlg=root@eoskolkata.tier2-kol.res.in:1094&tpc.dlgon=0&tpc.key=2416177a000153a560af84dc&tpc.lfn=/eos/alicekolkata/grid/08/29291/77bc70a0-1d1f-11eb-aabb-cb7c3183b75f&tpc.spr=root&tpc.src=root@eos06.tier2-kol.res.in:1095&tpc.stage=copy&tpc.str=1&tpc.tpr=root
210527 17:09:08 time=1622115548.607728 func=HandleError level=ERROR logid=static… unit=mgm@eos-mgm.tier2-kol.res.in:1094 tid=00007f7817fed700 source=ConversionJob:378 tident= sec=(null) uid=99 gid=99 name=- geo="" msg="[ERROR] Server responded with an error: [3010] Unable to open file /eos/alicekolkata/proc/conversion/00000000024f69c4:default.5#20640642; Operation not permitted" tpc_src=root://eoskolkata.tier2-kol.res.in:1094//eos/alicekolkata/grid/08/29291/77bc70a0-1d1f-11eb-aabb-cb7c3183b75f tpc_dst=root://eoskolkata.tier2-kol.res.in:1094//eos/alicekolkata/proc/conversion/00000000024f69c4:default.5#20640642 conversion_id=00000000024f69c4:default.5#20640642
[root@eos-mgm ~]#
[root@eos-mgm ~]# zcat /var/log/eos/mgm/Converter.log-20210528.gz |grep cb7c3183b75f
210527 17:09:08 INFO ConversionJob:235 [tpc]: root@eoskolkata.tier2-kol.res.in:1094@root://eoskolkata.tier2-kol.res.in:1094//eos/alicekolkata/grid/08/29291/77bc70a0-1d1f-11eb-aabb-cb7c3183b75f => root@eoskolkata.tier2-kol.res.in:1094@root://eoskolkata.tier2-kol.res.in:1094//eos/alicekolkata/proc/conversion/00000000024f69c4:default.5#20640642 prepare_msg=[SUCCESS]
210527 17:09:08 ERROR ConversionJob:378 msg="[ERROR] Server responded with an error: [3010] Unable to open file /eos/alicekolkata/proc/conversion/00000000024f69c4:default.5#20640642; Operation not permitted" tpc_src=root://eoskolkata.tier2-kol.res.in:1094//eos/alicekolkata/grid/08/29291/77bc70a0-1d1f-11eb-aabb-cb7c3183b75f tpc_dst=root://eoskolkata.tier2-kol.res.in:1094//eos/alicekolkata/proc/conversion/00000000024f69c4:default.5#20640642 conversion_id=00000000024f69c4:default.5#20640642
[root@eos-mgm ~]#
===============================
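To make the two open calls in the excerpt easier to compare, the operation, identity and path fields can be pulled out of the MGM log lines. This is only a minimal sketch, assuming the whitespace-separated key=value layout shown above; the function name summarize_opens is hypothetical and not part of EOS.

```shell
#!/bin/sh
# summarize_opens (hypothetical helper, not an EOS tool):
# read MGM log lines on stdin and, for every func=open line,
# print the operation, the logged uid/gid and the path.
summarize_opens() {
    awk '/func=open/ {
        op = uid = gid = path = "-"
        # Fields are separated by whitespace; the long info=... field is a
        # single token, so eos.ruid/eos.rgid inside it are not matched here.
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^op=/)   op   = substr($i, 4)
            if ($i ~ /^uid=/)  uid  = substr($i, 5)
            if ($i ~ /^gid=/)  gid  = substr($i, 5)
            if ($i ~ /^path=/) path = substr($i, 6)
        }
        print op, "uid=" uid, "gid=" gid, path
    }'
}
```

It could be used as: zcat /var/log/eos/mgm/xrdlog.mgm-20210528.gz | grep cb7c3183b75f | summarize_opens. In the excerpt above, the read of the source file is logged with uid=0 gid=0, while the write to /eos/alicekolkata/proc/conversion/ is logged with uid=99 gid=99.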
The output above clearly shows that files inside /eos/alicekolkata/proc/conversion cannot be opened, which leads to the conversion errors.
We also read the suggestions in the threads "TPC setup with token authentication - #21 by gbiro" and "File base manual conversion issue - #9 by ebirngru". The post "File base manual conversion issue - #14 by esindril" suggested removing the space policy. Accordingly, we removed the space policy parameters, i.e. eos space config default space.policy.[layout, nstripes, checksum, blocksize and blockchecksum]=remove.
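Written out per key, the bracket shorthand above would correspond to something like the following sketch. It assumes the command syntax exactly as quoted in this post and the space name "default"; the function name remove_space_policies and the RUN switch are hypothetical, and by default the commands are only printed, not executed.

```shell
#!/bin/sh
# remove_space_policies (hypothetical helper): emit one "eos space config"
# call per policy key listed in the post. Set RUN=1 to execute the commands
# instead of printing them (assumes the eos CLI is on PATH on the MGM).
remove_space_policies() {
    space="${1:-default}"
    for key in layout nstripes checksum blocksize blockchecksum; do
        cmd="eos space config $space space.policy.$key=remove"
        if [ "${RUN:-0}" = "1" ]; then
            $cmd
        else
            echo "$cmd"
        fi
    done
}
```

Calling remove_space_policies default prints the five commands so they can be checked against the help text of the installed EOS version before being run.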
Also, the permissions of /eos/alicekolkata/proc and /eos/alicekolkata/proc/conversion are:
drwxr-xr-x 1 root root 20480 Jul 30 2019 proc
drwxrwx--- 1 daemon daemon 0 May 28 19:27 conversion
But the conversion still fails with the permission error.
Please advise how to proceed.
Regards
Prasun.