Curious - Fluctuations in Throughput During Replication?

Phase

Explorer
Joined
Sep 30, 2020
Messages
63
Hi, just curious. I'm replicating a 35 TiB pool from one server to another and I see the pattern below. What are these gradual dips followed by recoveries?

Each 6-hour block corresponds roughly to 11.3 TiB transferred.
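
As a rough sanity check on that figure, the average rate implied by 11.3 TiB per 6 hours can be worked out as below. This is only arithmetic on the numbers in this post; the helper name is made up for illustration.

```python
TIB = 2**40  # bytes in one tebibyte
GIB = 2**30  # used here to express bits as gibibits


def average_rate_gibit_per_s(tib_transferred: float, hours: float) -> float:
    """Average throughput in Gibit/s for a given volume and duration."""
    total_bits = tib_transferred * TIB * 8
    seconds = hours * 3600
    return total_bits / seconds / GIB


rate = average_rate_gibit_per_s(11.3, 6)
print(f"{rate:.2f} Gibit/s average over one 6-hour block")
```

That works out to roughly 4.3 Gib/s averaged over a block, dips included.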

Is it GC? Does it come from the OS, ZFS, or the replication engine? Thoughts?

These are the disk graphs for the target and the source, plus the network graph for the source:

target-disk.png
source-disk.png


source-network.png

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
This looks like classic TCP sawtooth behavior. Try changing the TCP congestion control algorithm on both sides of the replication.
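
For anyone who hasn't seen it, here is a toy sketch of the additive-increase/multiplicative-decrease (AIMD) loop behind that sawtooth shape. It is a simplified Reno-style model, not what any real TCP stack runs, and the capacity, step sizes, and function name are made-up values for illustration.

```python
def aimd_trace(capacity: float, rounds: int, increase: float = 1.0,
               decrease: float = 0.5, start: float = 1.0) -> list[float]:
    """Toy AIMD model: return the congestion window after each round trip.

    The window grows by `increase` per round; once it exceeds `capacity`
    (standing in for a loss event) it is multiplied by `decrease`.
    """
    window = start
    trace = []
    for _ in range(rounds):
        if window > capacity:
            window *= decrease   # multiplicative decrease on "loss"
        else:
            window += increase   # additive increase otherwise
        trace.append(window)
    return trace


trace = aimd_trace(capacity=20, rounds=60)

# Crude text plot: one row per round, bar length = window size.
for w in trace:
    print("#" * int(w))
```

In this toy model the window keeps climbing to the cap and then halving, so the long-run average sits well below the peaks.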

 

Phase

That is an interesting thought. However, the network bandwidth is around 27 Gib/s as tested with multiple threads of iperf3 between these 2 machines, probably being limited by 4x PCIe 3.0 sockets. The replication is consuming only 6 Gib/s, and that seems to max out the target disks, which are all on the same controller.
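
For what it's worth, here is the back-of-the-envelope arithmetic behind that guess. It assumes "4x PCIe 3.0" means a PCIe 3.0 x4 link (8 GT/s per lane with 128b/130b encoding) and ignores PCIe packet overhead, so treat it as a rough ceiling rather than a measurement.

```python
GT_PER_S_PER_LANE = 8e9    # PCIe 3.0 raw signalling rate per lane
ENCODING = 128 / 130       # 128b/130b line encoding used by PCIe 3.0
GIBIT = 2**30


def pcie3_link_bits_per_s(lanes: int) -> float:
    """Raw data rate of a PCIe 3.0 link after line encoding."""
    return GT_PER_S_PER_LANE * lanes * ENCODING


# Assumption: the NIC sits in an x4 link.
ceiling_gibit = pcie3_link_bits_per_s(4) / GIBIT

# Share of the measured 27 Gib/s that the 6 Gib/s replication uses.
utilization = 6 / 27

print(f"x4 link ceiling: {ceiling_gibit:.1f} Gibit/s")
print(f"replication uses {utilization:.0%} of the measured link rate")
```

So under that assumption the measured 27 Gib/s is close to what the slot can carry, and the replication is using a bit under a quarter of it.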

Would the TCP sawtooth behavior still be present if the network is underutilized? (This is a card-to-card connection with no routers or switches.)
 

Samuel Tai

TCP is notorious for falling into a sawtooth pattern well short of the network's full bandwidth.
If this is a direct link, you should also enable jumbo frames on both ends to help mitigate the onset of sawtooth.
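
To give a feel for what jumbo frames change, here is a small sketch comparing a 1500-byte MTU with a 9000-byte MTU at the 6 Gib/s mentioned above. It assumes plain IPv4 and TCP headers with no options and no VLAN tag; the function names are made up for illustration.

```python
ETH_OVERHEAD = 8 + 14 + 4 + 12   # preamble+SFD, header, FCS, inter-frame gap
IP_TCP_HEADERS = 20 + 20         # IPv4 + TCP headers, no options (assumption)


def payload_efficiency(mtu: int) -> float:
    """Fraction of on-wire bytes that carry TCP payload."""
    return (mtu - IP_TCP_HEADERS) / (mtu + ETH_OVERHEAD)


def packets_per_second(payload_bits_per_s: float, mtu: int) -> float:
    """Full-size packets needed per second to carry the given payload rate."""
    return payload_bits_per_s / 8 / (mtu - IP_TCP_HEADERS)


rate = 6 * 2**30  # 6 Gibit/s of payload, from the post above

for mtu in (1500, 9000):
    print(f"MTU {mtu}: {payload_efficiency(mtu):.1%} efficient, "
          f"{packets_per_second(rate, mtu):,.0f} packets/s")
```

Under those assumptions the header savings are modest (about 95% versus 99% efficiency), but the packet rate drops from roughly 550,000 to roughly 90,000 per second, which means far fewer packets for both hosts to process.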