Abysmally slow ZFS replication to local USB drive

stefan.

Cadet
Joined
Jan 9, 2024
Messages
3
I moved over to TrueNAS-SCALE-23.10.1 over the past week and am noticing a very bad performance regression.

I have a replication task to back up my main pool to an external USB 3.0 drive. I rotate through USB drives and store one off site for disaster recovery. I'm attempting to perform a fresh replication from scratch (empty dataset) and the performance is incredibly slow, thrashing between 70 MiB/s and as little as 10 MiB/s. As a result, replicating about 3 TiB takes days.
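For a sense of scale, here is a rough back-of-the-envelope sketch of how long a 3 TiB replication takes at the rates quoted in this post. The function name is made up for illustration, and the estimate ignores compression, metadata overhead, and the fact that the real rate fluctuates.

```shell
#!/bin/sh
# Estimate the hours needed to write SIZE_TIB at a sustained RATE_MIB_PER_S.
# Usage: replication_hours SIZE_TIB RATE_MIB_PER_S
replication_hours() {
    awk -v tib="$1" -v rate="$2" 'BEGIN {
        mib = tib * 1024 * 1024              # 1 TiB = 1024 * 1024 MiB
        printf "%.1f\n", mib / rate / 3600   # seconds -> hours
    }'
}

# Rates mentioned in the post: worst case, best case during replication,
# and the sequential write speed the drive is capable of.
for rate in 10 70 150; do
    printf '%3d MiB/s: %s hours\n' "$rate" "$(replication_hours 3 "$rate")"
done
```

At a steady 10 MiB/s the transfer needs roughly 87 hours, compared with about 6 hours at the drive's sequential write speed, which is why a run that mostly sits near the low end takes days.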

The USB drive/pool is a 1 x DISK | 1 wide | 4.55 TiB configuration. The USB pool is encrypted at the root. The source pool is 1 x RAIDZ2 | 6 wide | 2.73 TiB with no encryption.

The USB drive itself can do sustained sequential writes of 150 MiB/s or more, so that's not the bottleneck. I'm trying to figure out what is occurring before reverting to TrueNAS CORE (which didn't have this problem).

Graph of drive throughput below. The first part shows a sequential write test, followed by a replication task start around 08:03. The replication task seems to work OK for about 4 minutes (90 MiB/s sustained), and then it starts to thrash.

Note that advanced power management is disabled for all drives.

1704809128808.png


Replication task configuration below:
1704809149180.png


System info:
RAM: 32 GiB ECC
CPU: i3-4160 CPU @ 3.60GHz
Motherboard: Supermicro X10SL7-F

USB dataset settings:
Code:
root@nas[~]# zfs get all usb
NAME  PROPERTY              VALUE                  SOURCE
usb   type                  filesystem             -
usb   creation              Tue Jan  9  8:26 2024  -
usb   used                  75.5G                  -
usb   available             4.33T                  -
usb   referenced            192K                   -
usb   compressratio         1.09x                  -
usb   mounted               yes                    -
usb   quota                 none                   default
usb   reservation           none                   default
usb   recordsize            128K                   default
usb   mountpoint            /mnt/usb               default
usb   sharenfs              off                    default
usb   checksum              on                     default
usb   compression           lz4                    local
usb   atime                 off                    local
usb   devices               on                     default
usb   exec                  on                     default
usb   setuid                on                     default
usb   readonly              off                    default
usb   zoned                 off                    default
usb   snapdir               hidden                 default
usb   aclmode               discard                local
usb   aclinherit            passthrough            local
usb   createtxg             1                      -
usb   canmount              on                     default
usb   xattr                 on                     default
usb   copies                1                      default
usb   version               5                      -
usb   utf8only              off                    -
usb   normalization         none                   -
usb   casesensitivity       sensitive              -
usb   vscan                 off                    default
usb   nbmand                off                    default
usb   sharesmb              off                    default
usb   refquota              none                   default
usb   refreservation        none                   default
usb   guid                  xxxxxxxxx    -
usb   primarycache          all                    default
usb   secondarycache        all                    default
usb   usedbysnapshots       0B                     -
usb   usedbydataset         192K                   -
usb   usedbychildren        75.5G                  -
usb   usedbyrefreservation  0B                     -
usb   logbias               latency                default
usb   objsetid              54                     -
usb   dedup                 off                    default
usb   mlslabel              none                   default
usb   sync                  standard               default
usb   dnodesize             legacy                 default
usb   refcompressratio      1.00x                  -
usb   written               192K                   -
usb   logicalused           81.9G                  -
usb   logicalreferenced     69K                    -
usb   volmode               default                default
usb   filesystem_limit      none                   default
usb   snapshot_limit        none                   default
usb   filesystem_count      none                   default
usb   snapshot_count        none                   default
usb   snapdev               hidden                 default
usb   acltype               posix                  local
usb   context               none                   default
usb   fscontext             none                   default
usb   defcontext            none                   default
usb   rootcontext           none                   default
usb   relatime              on                     default
usb   redundant_metadata    all                    default
usb   overlay               on                     default
usb   encryption            aes-256-gcm            -
usb   keylocation           prompt                 local
usb   keyformat             hex                    -
usb   pbkdf2iters           0                      default
usb   encryptionroot        usb                    -
usb   keystatus             available              -
usb   special_small_blocks  0                      default


Source dataset settings:
Code:
root@nas[~]# zfs get all master
NAME    PROPERTY                 VALUE                     SOURCE
master  type                     filesystem                -
master  creation                 Sat Feb  1  8:49 2020     -
master  used                     3.04T                     -
master  available                7.64T                     -
master  referenced               122M                      -
master  compressratio            1.34x                     -
master  mounted                  yes                       -
master  quota                    none                      default
master  reservation              none                      default
master  recordsize               128K                      local
master  mountpoint               /mnt/master               default
master  sharenfs                 off                       local
master  checksum                 on                        local
master  compression              lz4                       local
master  atime                    off                       local
master  devices                  on                        default
master  exec                     off                       local
master  setuid                   on                        default
master  readonly                 off                       local
master  zoned                    off                       default
master  snapdir                  hidden                    local
master  aclmode                  passthrough               local
master  aclinherit               passthrough               local
master  createtxg                1                         -
master  canmount                 on                        default
master  xattr                    on                        default
master  copies                   1                         local
master  version                  5                         -
master  utf8only                 off                       -
master  normalization            none                      -
master  casesensitivity          sensitive                 -
master  vscan                    off                       default
master  nbmand                   off                       default
master  sharesmb                 off                       local
master  refquota                 none                      default
master  refreservation           none                      default
master  guid                     xxxxxxxxx      -
master  primarycache             all                       default
master  secondarycache           all                       default
master  usedbysnapshots          559K                      -
master  usedbydataset            122M                      -
master  usedbychildren           3.04T                     -
master  usedbyrefreservation     0B                        -
master  logbias                  latency                   default
master  objsetid                 21                        -
master  dedup                    off                       local
master  mlslabel                 none                      default
master  sync                     standard                  local
master  dnodesize                legacy                    default
master  refcompressratio         1.00x                     -
master  written                  0                         -
master  logicalused              3.63T                     -
master  logicalreferenced        122M                      -
master  volmode                  default                   default
master  filesystem_limit         none                      default
master  snapshot_limit           none                      default
master  filesystem_count         none                      default
master  snapshot_count           none                      default
master  snapdev                  hidden                    local
master  acltype                  nfsv4                     default
master  context                  none                      default
master  fscontext                none                      default
master  defcontext               none                      default
master  rootcontext              none                      default
master  relatime                 on                        default
master  redundant_metadata       all                       default
master  overlay                  on                        default
master  encryption               off                       default
master  keylocation              none                      default
master  keyformat                none                      default
master  pbkdf2iters              0                         default
master  special_small_blocks     0                         local
master  snapshots_changed        Tue Jan  9  8:17:17 2024  -
master  org.freenas:description                            local
master  org.freebsd.ioc:active   yes                       local
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Hey @stefan.

Can you identify your USB HDD? I suspect that at ~5TB in size it's likely a Seagate SMR drive.

This behavior would be what I would expect to see from an SMR drive that's re-shingling itself under a rewrite workload.
 

stefan.

Cadet
Joined
Jan 9, 2024
Messages
3
Oof, great point, how did I miss that. It's a STEB5000100. Based on what I read online, it's most likely SMR.

Is there any way to work around this? It appears that there are parallel zfs send jobs running, which would easily exacerbate this problem. I'm assuming that a single send / receive pair would allow for better performance on an SMR drive, but perhaps I'm being naive.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Yep, those are definitely SMR.

A single stream will likely mitigate some of the issues going forward, as the drive should have more luck seeing it as a stream of sequential writes, but you may still hit a re-shingling pass in the middle of things. Blanking the drive may help it "reset" the internal SMR translation table, but I don't know if it will do that immediately or if it will have to re-shingle anyway. I know we have a few community members who are backing up to USB drives, and a few using SMR, but I can't recall any offhand who are using both together.
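As a rough illustration of what a single send/receive pair looks like outside the replication task, here is a minimal dry-run sketch. The snapshot name `master/data@manual-2024-01-09` and the target `usb/data` are hypothetical placeholders, not names from this thread, and the helper only prints the pipeline so it can be reviewed before anything is run. It assumes a plain (non-raw, non-recursive) send, in which case the received dataset should inherit encryption from the `usb` encryption root.

```shell
#!/bin/sh
# Build a single zfs send | zfs receive pipeline and print it (dry run).
# Usage: single_stream_cmd SOURCE_SNAPSHOT DESTINATION_DATASET
single_stream_cmd() {
    src_snap="$1"   # hypothetical example: master/data@manual-2024-01-09
    dst="$2"        # hypothetical example: usb/data
    # -u on the receive side leaves the received dataset unmounted.
    printf 'zfs send %s | zfs receive -u %s\n' "$src_snap" "$dst"
}

cmd=$(single_stream_cmd "master/data@manual-2024-01-09" "usb/data")
echo "$cmd"
# After reviewing the printed command, it could be run with:
#   sh -c "$cmd"
```

Sending one dataset at a time this way keeps only one writer on the USB pool, which is the condition described above as most likely to look sequential to the drive.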
 

trennseite

Cadet
Joined
Jan 14, 2024
Messages
1
Hello!
My 1st post to this forum :smile:
Stefan, did you check the pool for write errors? These can also lead to performance issues. My storage is all external, attached via USB. When copying large amounts of data to those ZFS pools via USB, write errors occurred very soon, and disks were shown as degraded or, in mirror vdevs, even faulted. In my case, disabling UAS helped.
See:
(description on why you would like to disable UAS for your USB device in question and how this can be accomplished):
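A minimal sketch of how one might check whether a USB disk is currently bound to the `uas` driver, by filtering the tree output of `lsusb -t`. The helper name and the sample output are illustrative only and not taken from any system in this thread. The Linux kernel parameter `usb-storage.quirks=VVVV:PPPP:u` tells the kernel not to use UAS for the device with that vendor:product ID; how to set kernel parameters persistently on TrueNAS SCALE is not covered here.

```shell
#!/bin/sh
# Read `lsusb -t` output on stdin and print only the interfaces
# that are bound to the uas driver. Prints nothing if there are none.
uas_devices() {
    grep 'Driver=uas' || true
}

# Illustrative sample input; on a real system use:  lsusb -t | uas_devices
uas_devices <<'EOF'
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 5000M
    |__ Port 1: Dev 2, If 0, Class=Mass Storage, Driver=uas, 5000M
    |__ Port 2: Dev 3, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
EOF

# Plain `lsusb` lists the vendor:product ID (VVVV:PPPP) needed for the
# quirk:  usb-storage.quirks=VVVV:PPPP:u
```

A device that shows `Driver=usb-storage` instead of `Driver=uas` is already using the older bulk-only transport.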

CU,
 
Joined
Oct 22, 2019
Messages
3,641
You can purchase a Western Digital 8 TiB external USB drive, which will de facto not contain an SMR HDD.

Interestingly, they are priced lower than your 5 TiB Seagate. (At least in the U.S.)

* The reason I specifically mention 8 TiB is because from this capacity and higher, WD does not use SMR drives. If you try to save money by purchasing a 6 TiB instead, you run the risk of an SMR HDD inside the enclosure.
 

stefan.

Cadet
Joined
Jan 9, 2024
Messages
3
Thanks for the link for a good replacement. The original drives I bought many years ago for less than half of what they're listed for now, oddly. Comparable to the 6TB option today for those WD drives. I might just get those 8TB WD drives or larger as I'm running out of space for the backup drive.

Some things I also tried over the past week or so:
1. Replace the drive: Same performance issue occurs on two different drives (same model).
2. Recreate the pool without a swap partition (originally had 2 GiB): No impact on performance.
3. Wipe the drive with all zeros to hopefully reset the SMR state: No impact on performance.
4. Tried to read the SMART info, but error counters aren't available. Health status is OK.
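When SMART error counters are not readable through the USB bridge, ZFS still keeps its own per-device read, write, and checksum error counts, shown by `zpool status`. Below is a minimal sketch that filters that output for devices with any non-zero counter. The helper name and the sample output (including the device name `sdg2`) are made up for illustration.

```shell
#!/bin/sh
# Read `zpool status` output on stdin and print every pool/vdev/device
# line whose READ, WRITE or CKSUM counter is not 0.
vdev_errors() {
    awk '
        $1 == "NAME" && $2 == "STATE" { in_cfg = 1; next }   # config table header
        in_cfg && NF == 0             { in_cfg = 0 }         # blank line ends the table
        in_cfg && NF >= 5 && ($3 != "0" || $4 != "0" || $5 != "0") {
            printf "%s read=%s write=%s cksum=%s\n", $1, $3, $4, $5
        }
    '
}

# Illustrative sample input; on a real system use:  zpool status usb | vdev_errors
vdev_errors <<'EOF'
  pool: usb
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        usb         ONLINE       0     0     0
          sdg2      ONLINE       0    12     0

errors: No known data errors
EOF

# Some USB bridges only pass SMART data with an explicit device type, e.g.:
#   smartctl -a -d sat /dev/sdX
```

An empty result means ZFS has not recorded any I/O or checksum errors for the pool since the counters were last cleared.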
 