Slow Transfer speeds

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
I can take one drive off and plug it into the single spare onboard SATA port and create a new pool. I saw an option for trim, I'll see if it lets me do it. Are you thinking that once they have finished reshingling themselves, speeds will improve?

If the drives were completely erased with zeroes and allowed to reshingle, they would regain their speed - at least until fragmentation or overwrites required them to do it again.

SMR disks are divided into a number of (usually) 256MB "zones" which can only be written to in a sequential "append-only" fashion, or erased entirely. It's similar to the "program" and "erase" behaviour of SSD NAND - you can often write in small pages of 4K or 8K at a time, but you have to erase at the 1MB or larger level. In order to erase a 1MB block, the SSD needs to quickly sweep up the pages that are still useful and rewrite them into another, empty block. This is fast because NAND is fast. Now imagine doing this on spinning disks, and at the 256MB "erase" level - it becomes painfully slow very quickly, which is why it gets hidden behind those couple dozen GB of "conventional" disk space and a bunch of firmware trickery.
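If you're curious to see that cache exhaustion directly, a long sustained fio write against a raw SMR disk will usually run at full speed for the first few dozen GB and then fall off a cliff once the conventional region fills and the firmware starts reshingling. A rough sketch only - /dev/da1 is a placeholder, and this is destructive to whatever is on that disk:

# Sustained sequential write; print a status line every 30s so the drop-off is visible
fio --name=smr-test --filename=/dev/da1 --rw=write --bs=1M --size=200G --status-interval=30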

I'll post both test results of the single hdd on the onboard later on. Unless a basic pcie to 4 port sata is any good? I have a spare of those

Likely not. Budget SATA controllers often don't have sufficient PCIe lanes to handle the demands ZFS puts on them, and/or have immature FreeBSD drivers. There's a reason the short answer for HBAs is often the rhyming "Just Buy LSI".

Use the onboard SATA ports, and verify in the BIOS that they're set to "AHCI" mode.
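If you'd rather double-check from the shell than reboot into the BIOS, something along these lines works on TrueNAS CORE (FreeBSD) - device names will differ on your system:

# The ahci driver should claim the controller at boot
dmesg | grep -i ahci
# Disks attached through it then show up as adaX entries
camcontrol devlist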
 

Anthony-1

Dabbler
Joined
Sep 30, 2022
Messages
17
If the drives were completely erased with zeroes and allowed to reshingle, they would regain their speed - at least until fragmentation or overwrites required them to do it again.

SMR disks are divided into a number of (usually) 256MB "zones" which can only be written to in a sequential "append-only" fashion, or erased entirely. It's similar to the "program" and "erase" behaviour of SSD NAND - you can often write in small pages of 4K or 8K at a time, but you have to erase at the 1MB or larger level. In order to erase a 1MB block, the SSD needs to quickly sweep up the pages that are still useful and rewrite them into another, empty block. This is fast because NAND is fast. Now imagine doing this on spinning disks, and at the 256MB "erase" level - it becomes painfully slow very quickly, which is why it gets hidden behind those couple dozen GB of "conventional" disk space and a bunch of firmware trickery.



Likely not. Budget SATA controllers often don't have sufficient PCIe lanes to handle the demands ZFS puts on them, and/or have immature FreeBSD drivers. There's a reason the short answer for HBAs is often the rhyming "Just Buy LSI".

Use the onboard SATA ports, and verify in the BIOS that they're set to "AHCI" mode.

Here is the correct pool fio check:

Run status group 0 (all jobs):
WRITE: bw=1695MiB/s (1777MB/s), 1695MiB/s-1695MiB/s (1777MB/s-1777MB/s), io=49.7GiB (53.3GB), run=30001-30001msec
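For anyone reading along later: a command roughly like this produces that kind of 30-second aggregate write summary. The path, job count, and sizes are placeholders, not necessarily what was actually run above:

fio --name=pooltest --directory=/mnt/tank/test --rw=write --bs=1M --size=16G --numjobs=4 --runtime=30 --time_based --group_reporting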

I've fixed the 10 gigabit network - who knew, the MTU had defaulted to 1500 :/
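In case anyone else hits the same thing, the MTU and end-to-end jumbo frame support can be sanity-checked from the TrueNAS CORE shell - the interface name and target IP below are placeholders:

# Confirm the interface is actually running at MTU 9000
ifconfig ix0 | grep mtu
# 8972 bytes of payload + 28 bytes of headers = 9000; with the don't-fragment bit set this only succeeds if every hop passes jumbo frames
ping -D -s 8972 192.168.1.10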

I've also got 8 CMR drives arriving tomorrow so I can see their performance.

You mentioned before moving my system dataset to the boot pool. Is this the ZIL? If so, and I lose my boot device, is my pool at risk, or just the last 5 seconds of the data transfer?

Thank you for your continued support.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
The fio testing seems to show that your system itself has no issues with generating the data, and ZFS has no issues sending it along to disks - it's now down to something in the storage subsystem that's bottlenecking at that 170MB/s before falling down to the 25-30MB/s level, oddly even with an SSD.
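If you want to watch where that bottleneck lands while a copy is running, the per-disk and per-vdev views from the shell are handy (the pool name below is a placeholder):

# Per-physical-disk throughput and %busy, refreshed continuously (FreeBSD / CORE)
gstat -p
# Per-vdev view from ZFS itself, one sample per second
zpool iostat -v tank 1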

You mentioned before moving my system dataset to the boot pool. Is this the ZIL? If so, and I lose my boot device, is my pool at risk, or just the last 5 seconds of the data transfer?
No, the "system dataset" is different from the ZFS Intent Log (ZIL) - the system dataset is where the TrueNAS system and middleware writes logs, configuration, and other tidbits. Because it's a lot of small writes, this could potentially cause your SMR drives to decide they need to "re-organize" their data internally - moving it to the boot SSD would mean less chance of that.

It would slightly reduce the redundancy of that dataset as it's no longer mirrored, but taking routine configuration backups can mitigate that.
 

Anthony-1

Dabbler
Joined
Sep 30, 2022
Messages
17
Excellent, OK. Thanks for all your help. Since the 10Gb network started to work correctly, the SMR drives got more consistent at around 80MB/s. Not sure why it would make any difference, as I was still getting 2Gb/s on the network before, but upping the MTU took it to 10Gb/s.

CMR drives: copying from an SSD, I got consistent speeds of 460MB/s to the pool with sync off. With sync on, I got a consistent 220MB/s from the SSD to the CMR pool.

So just to confirm my understanding: there is no cache, and the drop in speed is simply because data was being written to a ZIL with sync on?
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
CMR drives: copying from an SSD, I got consistent speeds of 460MB/s to the pool with sync off. With sync on, I got a consistent 220MB/s from the SSD to the CMR pool.

So just to confirm my understanding: there is no cache, and the drop in speed is simply because data was being written to a ZIL with sync on?

Terminology correction - data is being written to a "pool" with sync=on. The pool needs to place the incoming writes onto stable storage before acknowledging them, which causes the drop in speeds.
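For completeness, sync behaviour is a per-dataset ZFS property, so you can check and change it from the shell as well as the GUI - the dataset name here is a placeholder:

# Show the current setting for the dataset backing the share
zfs get sync tank/share
# standard = honour what the client asks for, always = force every write through the ZIL, disabled = treat everything as async
zfs set sync=standard tank/share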
 

Anthony-1

Dabbler
Joined
Sep 30, 2022
Messages
17
Terminology correction - data is being written to a "pool" with sync=on. The pool needs to place the incoming writes onto stable storage before acknowledging them, which causes the drop in speeds.
OK, thanks. So where is it stored with sync off? It can't be RAM, as that would fill too quickly.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
OK, thanks. So where is it stored with sync off? It can't be RAM, as that would fill too quickly.
Async writes (and even sync writes) do indeed collect in RAM before being flushed to the pool; the advantage is that ZFS can collect a large number of small writes into a transaction group and then push them to disk as a small number of large writes, which is easier for spinning disks to handle.
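RAM doesn't fill up because those in-flight writes are bounded: a transaction group gets flushed to disk every few seconds, or earlier once a dirty-data limit is hit. On TrueNAS CORE the relevant knobs are visible as sysctls - exact names can shift a little between OpenZFS versions, so treat these as an illustration:

# How often a transaction group is closed and written out (seconds)
sysctl vfs.zfs.txg.timeout
# Upper bound on dirty (not-yet-flushed) data held in RAM
sysctl vfs.zfs.dirty_data_max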
 

Anthony-1

Dabbler
Joined
Sep 30, 2022
Messages
17
So what's the difference then? 440MB/s on 80GB+ files (the amount of RAM I have) with sync off; I turn sync on and it drops to 220MB/s. If they both write to RAM first, surely it should be the same speed, since RAM would fill quickly?
Async writes (and even sync writes) do indeed collect in RAM before being flushed to the pool; the advantage is that ZFS can collect a large number of small writes into a transaction group and then push them to disk as a small number of large writes, which is easier for spinning disks to handle.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
So what's the difference then? 440MB/s on 80GB+ files (the amount of RAM I have) with sync off; I turn sync on and it drops to 220MB/s. If they both write to RAM first, surely it should be the same speed, since RAM would fill quickly?

Async writes to RAM only - sync writes write to RAM, but also "cc" the writes to a section of stable storage called the ZFS Intent Log (ZIL) that serves as a record of the pending writes. This can either be in your pool, or on a Separate Log Device (SLOG). Since you have no SLOG device, your pool has to "write twice" in a sense, so it makes sense that throughput is halved.[1]

[1] I'm trying to comb through the commits to see if this is still true; I know things got changed with zil_slog_limit being replaced by zil_slog_bulk.
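For reference, adding a dedicated SLOG later is a one-line operation - on TrueNAS you'd normally do it through the GUI so the partitioning follows the usual conventions, but the underlying command looks roughly like this (pool and device names are placeholders):

# Attach a separate log device to an existing pool
zpool add tank log /dev/ada4
# It then shows up under a "logs" section here
zpool status tank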
 

Anthony-1

Dabbler
Joined
Sep 30, 2022
Messages
17
Async writes to RAM only - sync writes write to RAM, but also "cc" the writes to a section of stable storage called the ZFS Intent Log (ZIL) that serves as a record of the pending writes. This can either be in your pool, or on a Separate Log Device (SLOG). Since you have no SLOG device, your pool has to "write twice" in a sense, so it makes sense that throughput is halved.[1]

[1] I'm trying to comb through the commits to see if this is still true; I know things got changed with zil_slog_limit being replaced by zil_slog_bulk.
OK, thanks. So with the ZIL on an independent SSD or the boot SSD, I should see the speeds stabilise higher? Then, if my understanding is correct, if the boot SSD or independent SSD fails at the wrong time, I'd only have up to five seconds of data impacted?
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
OK, thanks. So with the ZIL on an independent SSD or the boot SSD, I should see the speeds stabilise higher? Then, if my understanding is correct, if the boot SSD or independent SSD fails at the wrong time, I'd only have up to five seconds of data impacted?
Let's step back a bit here. The ZIL is only involved for synchronous writes, and synchronous writes are designed for situations where the data can't be replaced or "replayed" (such as copying the file again). SMB file share usage for file copies doesn't typically require it, unless there's a process that's writing live to the share.

See the resource here - while it's written from the perspective of ESXi/virtualization, that's because it's one of the most common scenarios where sync writes come up.


Async writes are already "as fast as possible" - adding an SLOG will do nothing to those. Adding an SLOG can help restore speed lost by enabling sync=always.
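If you want to see that gap for yourself without touching the real share, a throwaway dataset makes the comparison easy. Names and sizes below are placeholders, and this is only a rough illustration:

# Scratch dataset, compression off so the zeroes aren't compressed away
zfs create tank/synctest
zfs set compression=off tank/synctest
# Force every write through the ZIL, then write a few GB
zfs set sync=always tank/synctest
dd if=/dev/zero of=/mnt/tank/synctest/testfile bs=1M count=4096
# Same write with everything treated as async
zfs set sync=disabled tank/synctest
dd if=/dev/zero of=/mnt/tank/synctest/testfile bs=1M count=4096
# Clean up
zfs destroy tank/synctest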
 