iagocanalejas
Cadet · Joined Feb 4, 2024 · Messages: 9
I have a pool of 5x18TB drives in RAIDZ2 that was working fine until a system reboot a couple of days ago. Now I can't manage to read from it.
- I have an SMB share to access it. I can copy files to it at about 80 MiB/s (same as before) and I can see all the files, but copying files from the NAS to my local system fails.
- With SCP I can also copy into the NAS, but copying to my local system the speed drops from 20 MiB/s to 1 KiB/s and even 0, taking hours to copy a single 8 GB file.
- rsyncing the files to another dataset is also very slow (estimated 48 days for 7 TB).
- iperf shows the network is not the problem; it reports the expected throughput in both directions. (Rough versions of these transfer commands are sketched after the fio output below.)
- I ran a couple of fio tests, attached below.
Code:
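# Mixed sequential read/write test: 128 KiB blocks, 50 GB file, direct I/O, capped at 120 s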
sudo fio --name TEST --eta-newline=5s --filename=fio-tempfile.dat --rw=rw --size=50g --io_size=1500g --blocksize=128k --iodepth=16 --direct=1 --numjobs=1 --runtime=120 --group_reporting --output=/mnt/BigVault/media/fiotest.txt
TEST: (g=0): rw=rw, bs=(R) 128KiB-128KiB, (W) 128KiB-128KiB, (T) 128KiB-128KiB, ioengine=psync, iodepth=16
fio-3.33
Starting 1 process
TEST: (groupid=0, jobs=1): err= 0: pid=43094: Sun Feb 4 18:03:56 2024
read: IOPS=73, BW=9375KiB/s (9600kB/s)(1183MiB/129245msec)
clat (usec): min=7, max=66847k, avg=13605.20, stdev=818138.08
lat (usec): min=8, max=66847k, avg=13606.42, stdev=818138.07
clat percentiles (usec):
| 1.00th=[ 12], 5.00th=[ 13], 10.00th=[ 14],
| 20.00th=[ 15], 30.00th=[ 16], 40.00th=[ 17],
| 50.00th=[ 18], 60.00th=[ 20], 70.00th=[ 21],
| 80.00th=[ 23], 90.00th=[ 29], 95.00th=[ 56],
| 99.00th=[ 241], 99.50th=[ 816], 99.90th=[ 10814],
| 99.95th=[ 173016], 99.99th=[17112761]
bw ( KiB/s): min=50432, max=1908736, per=100.00%, avg=484608.00, stdev=802302.84, samples=5
iops : min= 394, max=14912, avg=3786.00, stdev=6267.99, samples=5
write: IOPS=73, BW=9369KiB/s (9594kB/s)(1183MiB/129245msec); 0 zone resets
clat (usec): min=14, max=528, avg=33.33, stdev=23.64
lat (usec): min=16, max=537, avg=35.73, stdev=24.02
clat percentiles (usec):
| 1.00th=[ 20], 5.00th=[ 21], 10.00th=[ 21], 20.00th=[ 22],
| 30.00th=[ 23], 40.00th=[ 24], 50.00th=[ 25], 60.00th=[ 28],
| 70.00th=[ 33], 80.00th=[ 39], 90.00th=[ 55], 95.00th=[ 79],
| 99.00th=[ 137], 99.50th=[ 141], 99.90th=[ 227], 99.95th=[ 355],
| 99.99th=[ 529]
bw ( KiB/s): min=51968, max=1895680, per=100.00%, avg=484352.00, stdev=795820.21, samples=5
iops : min= 406, max=14810, avg=3784.00, stdev=6217.35, samples=5
lat (usec) : 10=0.07%, 20=35.65%, 50=55.61%, 100=6.55%, 250=1.57%
lat (usec) : 500=0.18%, 750=0.09%, 1000=0.04%
lat (msec) : 2=0.08%, 4=0.04%, 10=0.05%, 20=0.03%, 50=0.01%
lat (msec) : 250=0.01%, >=2000=0.02%
cpu : usr=0.04%, sys=0.44%, ctx=1354, majf=0, minf=14
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=9466,9460,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=16
Run status group 0 (all jobs):
READ: bw=9375KiB/s (9600kB/s), 9375KiB/s-9375KiB/s (9600kB/s-9600kB/s), io=1183MiB (1241MB), run=129245-129245msec
WRITE: bw=9369KiB/s (9594kB/s), 9369KiB/s-9369KiB/s (9594kB/s-9594kB/s), io=1183MiB (1240MB), run=129245-129245msec
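In short, the fio run shows reads averaging only ~9 MiB/s at 73 IOPS, with completion latencies spiking past 60 seconds, while writes look healthy (sub-millisecond latencies), so the problem seems to be on the read path.

For completeness, the transfer tests above were along these lines (hostnames and paths here are placeholders, not the exact commands I ran):
Code:
# local -> NAS over SCP: works at the expected speed
scp ./test-8g.bin admin@nas:/mnt/BigVault/media/

# NAS -> local over SCP: starts at ~20 MiB/s, then drops to ~1 KiB/s or 0
scp admin@nas:/mnt/BigVault/media/test-8g.bin .

# dataset -> dataset rsync on the NAS itself: ETA ~48 days for 7 TB
rsync -a --progress /mnt/BigVault/media/ /mnt/BigVault/other-dataset/

# iperf3 in both directions: expected throughput both ways
iperf3 -c nas
iperf3 -c nas -R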