I have an EOS instance used to store the files from a DAQ system.
In the instance there are 20 FSTs with 2 FSs (2 groups) per FST (eos-server-4.4.10-1 + xrootd-server-4.8.4-1).
The main workflow is writing raw-data files continuously at maximum speed.
I am trying to find a way to distribute the network load across the FSTs more uniformly than what I currently see.
I ran tests with 24 to 128 flows (from 4 DAQ hosts) writing 3 GB files to the instance (1 replica, no raid6/raiddp).
At the end of the tests the files (~10k in total) are well distributed across the FSTs, but while the test is running, the 'wopen' column in the output of 'eos fs ls --io' or 'eos node ls --io' ranges from 0 to x across the nodes or filesystems.
Some examples:
Group balancing (well balanced):
[root@np02eos1 ~]# eos group ls --io
┌────────────────┬──────────┬────────────┬────────────┬──────────┬──────────┬──────────┬──────┬──────┬────────────┬────────────┬────────────┬───────────┬──────────┬──────────┐
│name │ diskload│ diskr-MB/s│ diskw-MB/s│ eth-MiB/s│ ethi-MiB│ etho-MiB│ ropen│ wopen│ used-bytes│ max-bytes│ used-files│ max-files│ bal-shd│ drain-shd│
└────────────────┴──────────┴────────────┴────────────┴──────────┴──────────┴──────────┴──────┴──────┴────────────┴────────────┴────────────┴───────────┴──────────┴──────────┘
default.1 0.00 0 0 22648 0 0 0 32 35.18 TB 438.66 TB 10.67 K 42.84 G 0 0
default.2 0.00 0 0 22648 0 0 0 30 35.16 TB 438.66 TB 10.66 K 42.84 G 0 0
Files balanced across FST nodes (not well balanced: 1 to 9 concurrent writes per node):
[root@np02eos1 ~]# eos node ls --io
┌────────────────────────────────┬────────────────┬──────────┬────────────┬────────────┬──────────┬──────────┬──────────┬──────┬──────┬────────────┬────────────┬────────────┬───────────┬──────────┬──────────┬──────────┬──────┬─────────┐
│hostport │ geotag│ diskload│ diskr-MB/s│ diskw-MB/s│ eth-MiB/s│ ethi-MiB│ etho-MiB│ ropen│ wopen│ used-bytes│ max-bytes│ used-files│ max-files│ bal-shd│ drain-shd│ gw-queue│ iops│ bw│
└────────────────────────────────┴────────────────┴──────────┴────────────┴────────────┴──────────┴──────────┴──────────┴──────┴──────┴────────────┴────────────┴────────────┴───────────┴──────────┴──────────┴──────────┴──────┴─────────┘
np02ss00.cern.ch:1095 np02-daq 0.00 0 0 1192 0 0 0 1 3.15 TB 46.18 TB 955 4.51 G 0 0 0 118 647 MB
np02ss01.cern.ch:1095 np02-daq 0.00 0 0 1192 0 0 0 1 3.89 TB 46.18 TB 1.18 K 4.51 G 0 0 0 76 696 MB
np02ss02.cern.ch:1095 np02-daq 0.00 0 0 1192 0 0 0 2 3.95 TB 46.18 TB 1.20 K 4.51 G 0 0 0 92 668 MB
np02ss03.cern.ch:1095 np02-daq 0.00 0 0 1192 0 0 0 4 3.72 TB 46.18 TB 1.13 K 4.51 G 0 0 0 98 314 MB
np02ss04.cern.ch:1095 np02-daq 0.00 0 0 1192 0 0 0 3 3.89 TB 46.18 TB 1.18 K 4.51 G 0 0 0 114 458 MB
np02ss05.cern.ch:1095 np02-daq 0.00 0 0 1192 0 0 0 2 3.87 TB 46.18 TB 1.17 K 4.51 G 0 0 0 110 394 MB
np02ss06.cern.ch:1095 np02-daq 0.00 0 0 1192 0 0 0 5 3.83 TB 46.18 TB 1.16 K 4.51 G 0 0 0 118 382 MB
np02ss07.cern.ch:1095 np02-daq 0.00 0 0 1192 0 0 0 3 3.94 TB 46.18 TB 1.20 K 4.51 G 0 0 0 119 1109 MB
np02ss08.cern.ch:1095 np02-daq 0.00 0 0 1192 0 0 0 2 3.67 TB 46.18 TB 1.11 K 4.51 G 0 0 0 109 379 MB
np02ss09.cern.ch:1095 np02-daq 0.00 0 0 1192 0 0 0 2 3.86 TB 46.18 TB 1.17 K 4.51 G 0 0 0 57 313 MB
np02ss10.cern.ch:1095 np02-daq 0.00 0 0 1192 0 0 0 2 3.89 TB 46.18 TB 1.18 K 4.51 G 0 0 0 116 392 MB
np02ss11.cern.ch:1095 np02-daq 0.00 0 0 1192 0 0 0 3 3.72 TB 46.18 TB 1.13 K 4.51 G 0 0 0 110 482 MB
np02ss12.cern.ch:1095 np02-daq 0.00 0 0 1192 0 0 0 9 4.08 TB 46.18 TB 1.24 K 4.51 G 0 0 0 120 432 MB
np02ss13.cern.ch:1095 np02-daq 0.00 0 0 1192 0 0 0 4 3.80 TB 46.18 TB 1.15 K 4.51 G 0 0 0 114 451 MB
np02ss14.cern.ch:1095 np02-daq 0.00 0 0 1192 0 0 0 3 3.88 TB 46.18 TB 1.18 K 4.51 G 0 0 0 118 412 MB
np02ss15.cern.ch:1095 np02-daq 0.00 0 0 1192 0 0 0 4 3.13 TB 46.18 TB 950 4.51 G 0 0 0 110 377 MB
np02ss16.cern.ch:1095 np02-daq 0.00 0 0 1192 0 0 0 4 3.90 TB 46.18 TB 1.18 K 4.51 G 0 0 0 107 390 MB
np02ss17.cern.ch:1095 np02-daq 0.00 0 0 1192 0 0 0 5 3.80 TB 46.18 TB 1.15 K 4.51 G 0 0 0 113 370 MB
np02ss18.cern.ch:1095 np02-daq 0.00 0 0 1192 0 0 0 4 3.74 TB 46.18 TB 1.14 K 4.51 G 0 0 0 85 538 MB
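To quantify the imbalance I am talking about, here is a small Python sketch that computes the mean and standard deviation of the 'wopen' values taken from the 'eos node ls --io' output above:

```python
# Quantify the write-open imbalance across FST nodes.
# The wopen values below are copied from the `eos node ls --io`
# output above (np02ss00 .. np02ss18).
from statistics import mean, pstdev

wopen = [1, 1, 2, 4, 3, 2, 5, 3, 2, 2, 2, 3, 9, 4, 3, 4, 4, 5, 4]

avg = mean(wopen)       # average concurrent writes per node
spread = pstdev(wopen)  # population standard deviation

print(f"nodes={len(wopen)} total={sum(wopen)} "
      f"mean={avg:.2f} stddev={spread:.2f}")
# → nodes=19 total=63 mean=3.32 stddev=1.78
```

So with 63 concurrent writes over 19 nodes, the per-node load swings roughly between mean ± 2 stddev, i.e. about 0 to 7, which matches what the table shows.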
Files balanced across filesystems (not well balanced: 0 to 6 concurrent writes per FS):
[root@np02eos1 ~]# eos fs ls --io
┌────────────────────────────────┬──────┬────────────────┬────────────────┬──────────┬────────────┬────────────┬──────────┬──────────┬──────────┬──────┬──────┬────────────┬────────────┬────────────┬───────────┬──────────┬──────────────┬────────────┬──────┬─────────┐
│hostport │ id│ schedgroup│ geotag│ diskload│ diskr-MB/s│ diskw-MB/s│ eth-MiB/s│ ethi-MiB│ etho-MiB│ ropen│ wopen│ used-bytes│ max-bytes│ used-files│ max-files│ bal-shd│ drain-shd│ drainpull│ iops│ bw│
└────────────────────────────────┴──────┴────────────────┴────────────────┴──────────┴────────────┴────────────┴──────────┴──────────┴──────────┴──────┴──────┴────────────┴────────────┴────────────┴───────────┴──────────┴──────────────┴────────────┴──────┴─────────┘
np02ss00.cern.ch:1095 152 default.1 np02-daq 0.00 0.00 0.00 1192 0 0 0 0 1.38 TB 23.09 TB 415 2.25 G 0 0 off 60 270 MB
np02ss00.cern.ch:1095 153 default.2 np02-daq 0.00 0.00 0.00 1192 0 0 0 0 1.51 TB 23.09 TB 458 2.25 G 0 0 off 58 377 MB
np02ss01.cern.ch:1095 154 default.1 np02-daq 0.00 0.00 0.00 1192 0 0 0 0 1.68 TB 23.09 TB 509 2.25 G 0 0 off 19 249 MB
np02ss01.cern.ch:1095 155 default.2 np02-daq 0.00 0.00 0.00 1192 0 0 0 0 1.80 TB 23.09 TB 545 2.25 G 0 0 off 57 447 MB
np02ss02.cern.ch:1095 156 default.1 np02-daq 0.00 0.00 0.00 1192 0 0 0 1 1.70 TB 23.09 TB 514 2.25 G 0 0 off 56 419 MB
np02ss02.cern.ch:1095 157 default.2 np02-daq 0.00 0.00 0.00 1192 0 0 0 0 1.69 TB 23.09 TB 511 2.25 G 0 0 off 36 249 MB
np02ss03.cern.ch:1095 158 default.1 np02-daq 0.00 0.00 0.00 1192 0 0 0 1 1.61 TB 23.09 TB 486 2.25 G 0 0 off 53 120 MB
np02ss03.cern.ch:1095 159 default.2 np02-daq 0.00 0.00 0.00 1192 0 0 0 2 1.67 TB 23.09 TB 506 2.25 G 0 0 off 45 194 MB
np02ss04.cern.ch:1095 160 default.1 np02-daq 0.00 0.00 0.00 1192 0 0 0 0 1.69 TB 23.09 TB 510 2.25 G 0 0 off 59 288 MB
np02ss04.cern.ch:1095 161 default.2 np02-daq 0.00 0.00 0.00 1192 0 0 0 2 1.70 TB 23.09 TB 514 2.25 G 0 0 off 55 170 MB
np02ss05.cern.ch:1095 162 default.1 np02-daq 0.00 0.00 0.00 1192 0 0 0 4 1.70 TB 23.09 TB 514 2.25 G 0 0 off 58 243 MB
np02ss05.cern.ch:1095 163 default.2 np02-daq 0.00 0.00 0.00 1192 0 0 0 2 1.67 TB 23.09 TB 506 2.25 G 0 0 off 52 151 MB
np02ss06.cern.ch:1095 164 default.1 np02-daq 0.00 0.00 0.00 1192 0 0 0 2 1.68 TB 23.09 TB 509 2.25 G 0 0 off 57 186 MB
np02ss06.cern.ch:1095 165 default.2 np02-daq 0.00 0.00 0.00 1192 0 0 0 2 1.68 TB 23.09 TB 507 2.25 G 0 0 off 61 196 MB
np02ss07.cern.ch:1095 166 default.1 np02-daq 0.00 0.00 0.00 1192 0 0 0 2 1.72 TB 23.09 TB 520 2.25 G 0 0 off 65 945 MB
np02ss07.cern.ch:1095 167 default.2 np02-daq 0.00 0.00 0.00 1192 0 0 0 3 1.71 TB 23.09 TB 518 2.25 G 0 0 off 54 164 MB
np02ss08.cern.ch:1095 168 default.1 np02-daq 0.00 0.00 0.00 1192 0 0 0 2 1.71 TB 23.09 TB 517 2.25 G 0 0 off 54 178 MB
np02ss08.cern.ch:1095 169 default.2 np02-daq 0.00 0.00 0.00 1192 0 0 0 2 1.63 TB 23.09 TB 494 2.25 G 0 0 off 55 201 MB
np02ss09.cern.ch:1095 170 default.1 np02-daq 0.00 0.00 0.00 1192 0 0 0 4 1.78 TB 23.09 TB 540 2.25 G 0 0 off 20 134 MB
np02ss09.cern.ch:1095 171 default.2 np02-daq 0.00 0.00 0.00 1192 0 0 0 3 1.72 TB 23.09 TB 522 2.25 G 0 0 off 37 179 MB
np02ss10.cern.ch:1095 172 default.1 np02-daq 0.00 0.00 0.00 1192 0 0 0 5 1.73 TB 23.09 TB 522 2.25 G 0 0 57 157 MB
np02ss10.cern.ch:1095 173 default.2 np02-daq 0.00 0.00 0.00 1192 0 0 0 2 1.71 TB 23.09 TB 518 2.25 G 0 0 59 235 MB
np02ss11.cern.ch:1095 174 default.1 np02-daq 0.00 0.00 0.00 1192 0 0 0 2 1.70 TB 23.09 TB 514 2.25 G 0 0 58 326 MB
np02ss11.cern.ch:1095 175 default.2 np02-daq 0.00 0.00 0.00 1192 0 0 0 3 1.65 TB 23.09 TB 500 2.25 G 0 0 52 156 MB
np02ss12.cern.ch:1095 176 default.1 np02-daq 0.00 0.00 0.00 1192 0 0 0 2 1.80 TB 23.09 TB 545 2.25 G 0 0 59 157 MB
np02ss12.cern.ch:1095 177 default.2 np02-daq 0.00 0.00 0.00 1192 0 0 0 3 1.82 TB 23.09 TB 551 2.25 G 0 0 61 275 MB
np02ss13.cern.ch:1095 178 default.1 np02-daq 0.00 0.00 0.00 1192 0 0 0 5 1.66 TB 23.09 TB 503 2.25 G 0 0 55 182 MB
np02ss13.cern.ch:1095 179 default.2 np02-daq 0.00 0.00 0.00 1192 0 0 0 3 1.70 TB 23.09 TB 515 2.25 G 0 0 59 269 MB
np02ss14.cern.ch:1095 180 default.1 np02-daq 0.00 0.00 0.00 1192 0 0 0 1 1.73 TB 23.09 TB 523 2.25 G 0 0 60 210 MB
np02ss14.cern.ch:1095 181 default.2 np02-daq 0.00 0.00 0.00 1192 0 0 0 2 1.72 TB 23.09 TB 519 2.25 G 0 0 58 202 MB
np02ss15.cern.ch:1095 182 default.1 np02-daq 0.00 0.00 0.00 1192 0 0 0 1 1.37 TB 23.09 TB 414 2.25 G 0 0 57 179 MB
np02ss15.cern.ch:1095 183 default.2 np02-daq 0.00 0.00 0.00 1192 0 0 0 1 1.40 TB 23.09 TB 424 2.25 G 0 0 53 198 MB
np02ss16.cern.ch:1095 184 default.1 np02-daq 0.00 0.00 0.00 1192 0 0 0 4 1.69 TB 23.09 TB 511 2.25 G 0 0 56 195 MB
np02ss16.cern.ch:1095 185 default.2 np02-daq 0.00 0.00 0.00 1192 0 0 0 0 1.75 TB 23.09 TB 529 2.25 G 0 0 51 195 MB
np02ss17.cern.ch:1095 186 default.1 np02-daq 0.00 0.00 0.00 1192 0 0 0 3 1.79 TB 23.09 TB 540 2.25 G 0 0 55 199 MB
np02ss17.cern.ch:1095 187 default.2 np02-daq 0.00 0.00 0.00 1192 0 0 0 6 1.70 TB 23.09 TB 516 2.25 G 0 0 58 171 MB
np02ss18.cern.ch:1095 188 default.1 np02-daq 0.00 0.00 0.00 1192 0 0 0 0 1.73 TB 23.09 TB 523 2.25 G 0 0 56 293 MB
np02ss18.cern.ch:1095 189 default.2 np02-daq 0.00 0.00 0.00 1192 0 0 0 4 1.58 TB 23.09 TB 477 2.25 G 0 0 29 245 MB
Here are the geosched parameters :
[root@np02eos1 ~]# eos geosched show param
### GeoTreeEngine parameters :
skipSaturatedPlct = 1
skipSaturatedAccess = 1
skipSaturatedDrnAccess = 1
skipSaturatedBlcAccess = 1
skipSaturatedDrnPlct = 0
skipSaturatedBlcPlct = 0
proxyCloseToFs = 1
penaltyUpdateRate = 1
plctDlScorePenalty = 10(default) | 10(1Gbps) | 10(10Gbps) | 10(100Gbps) | 10(1000Gbps)
plctUlScorePenalty = 10(default) | 10(1Gbps) | 10(10Gbps) | 10(100Gbps) | 10(1000Gbps)
accessDlScorePenalty = 10(default) | 10(1Gbps) | 10(10Gbps) | 10(100Gbps) | 10(1000Gbps)
accessUlScorePenalty = 10(default) | 10(1Gbps) | 10(10Gbps) | 10(100Gbps) | 10(1000Gbps)
fillRatioLimit = 80
fillRatioCompTol = 100
saturationThres = 10
timeFrameDurationMs = 1000
### GeoTreeEngine list of groups :
default.1 , default.2 ,
How can I reduce the standard deviation of wopen as much as possible?
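For comparison, note that some spread is expected even under perfectly uniform random placement: with N independent writers choosing among M filesystems, the wopen counts follow a binomial distribution, whereas a strict round-robin assignment would give a standard deviation close to zero. The following simulation (hypothetical numbers at the scale of my test, not the actual GeoTreeEngine logic) illustrates the difference:

```python
# Illustration only (NOT the EOS scheduler): compare the wopen spread of
# uniform random placement vs. round-robin, for 64 concurrent writers
# over 38 filesystems (roughly the scale of the test described above).
import random
from statistics import pstdev

random.seed(42)  # fixed seed so the run is reproducible
writers, filesystems = 64, 38

# Uniform random placement: each write picks a filesystem independently.
rand_wopen = [0] * filesystems
for _ in range(writers):
    rand_wopen[random.randrange(filesystems)] += 1

# Round-robin placement: writes are dealt out in order.
rr_wopen = [0] * filesystems
for i in range(writers):
    rr_wopen[i % filesystems] += 1

print(f"random placement stddev:      {pstdev(rand_wopen):.2f}")
print(f"round-robin placement stddev: {pstdev(rr_wopen):.2f}")
```

Random placement lands near the binomial stddev sqrt(N*p*(1-p)) ≈ 1.28 for p = 1/38, while round-robin stays at ~0.46 (the residue of 64 not dividing evenly by 38). So a nonzero wopen stddev is inherent to randomized placement, and the question is whether the scheduler can be pushed closer to the round-robin end.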