Good day:
I've been away from the forums for years, but have run into a perplexing problem. Let's start with the system:
Supermicro X11SSH-LN4F motherboard
E3-1275v6 CPU
64GB ECC RAM
This system has been iteratively upgraded over the years, starting on TrueNAS CORE and now moving to SCALE. The only pool that currently exists on the system consists of two RAIDZ2 VDEVs... one 6x3TB, one 6x4TB. One 6-drive set is connected to the onboard SATA controller; the other is connected to the venerable SAS2008 HBA, properly flashed to IT mode. This hardware has been dead-nuts reliable for many years.
One of the 3TB drives started throwing a couple of SMART errors and, given those drives are 10 years old with more than 80,000 hours each, I figured it might be smart to get a little proactive and swap them. Plus, hey, more space, right? So, I picked up six 20TB Seagate Exos enterprise drives. Each drive went through a proper break-in process... short/conveyance SMART tests, a badblocks run (geez, that took for-effing-ever), and a SMART long test. Fast forward almost 10 days and I've got six minty, proven drives ready to go.
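For anyone curious, the per-drive burn-in was essentially the following (device name hypothetical; the badblocks pass is destructive, so blank drives only):

Code:
smartctl -t short /dev/sdX
smartctl -t conveyance /dev/sdX
badblocks -b 8192 -ws /dev/sdX   # destructive write test; the larger block size keeps the block count within badblocks' 32-bit limit on a 20TB drive
smartctl -t long /dev/sdX
smartctl -a /dev/sdX             # review the results once the long test finishes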
I go through the standard process of carefully swapping one drive at a time, doing the replace, and letting the resilver complete. The process goes swimmingly - no errors, no downtime - but *err?* the pool doesn't grow once the 6th drive is replaced. Autoexpand is on. I try a manual expand from the GUI, which fails with an error (which I didn't write down and can't reproduce without another 24-hour process) about the partitions not being written, the kernel not being informed, and a reboot being recommended. The pool then drops to a degraded state, with one drive showing unavailable.
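For reference, my understanding is that the GUI expand boils down to roughly this at the CLI (hypothetical /dev/sdh; the partuuid is one of my actual devices from the output below):

Code:
parted /dev/sdh unit GB print        # sanity check: does the ZFS partition actually span the whole 20TB disk?
blockdev --rereadpt /dev/sdh         # nudge the kernel to re-read the partition table (partprobe /dev/sdh works too)
zpool online -e Tier3 056e938d-b5b9-449f-a6a6-acac72b82a87   # grow ZFS onto the enlarged partition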
I figure something's weird, because the very same gptid shows as online. And, of course, the drive now has ZFS data on it, so TN really doesn't want to make it part of the pool. So, I did a complete wipe to zero from the GUI. This is after the wipe, before replacement:
Code:
root@filestore[~]# zpool status
  pool: Tier3
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: scrub repaired 0B in 07:57:12 with 0 errors on Fri Jan 12 11:57:17 2024
config:

        NAME                                      STATE     READ WRITE CKSUM
        Tier3                                     DEGRADED     0     0     0
          raidz2-0                                ONLINE       0     0     0
            26d2563c-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            277b82e6-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            28279512-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            28d8c104-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            298897cc-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            2a3c8e32-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
          raidz2-1                                DEGRADED     0     0     0
            12098450062948849965                  OFFLINE      0     0     0  was /dev/disk/by-partuuid/67f1318c-713b-4b68-abbe-0c55a7c04d77
            056e938d-b5b9-449f-a6a6-acac72b82a87  ONLINE       0     0     0
            87a9889c-557c-4692-b583-21b648dc7d7d  ONLINE       0     0     0
            478c82ba-c501-4a7c-b2c7-d2fdc631efe0  ONLINE       0     0     0
            35818415-8885-4671-bee1-d3d5a30970e3  ONLINE       0     0     0
            e63df09f-e344-46a2-8653-f8d266886cd1  ONLINE       0     0     0

errors: No known data errors

root@filestore[~]# lsblk -o name,partuuid,fstype,size
NAME   PARTUUID                              FSTYPE       SIZE
sda                                                       3.6T
├─sda1 26c0769e-a9f6-11e5-b3e9-002590869c3c                 2G
└─sda2 26d2563c-a9f6-11e5-b3e9-002590869c3c zfs_member   3.6T
sdb                                                       3.6T
├─sdb1 2976efe0-a9f6-11e5-b3e9-002590869c3c                 2G
└─sdb2 298897cc-a9f6-11e5-b3e9-002590869c3c zfs_member   3.6T
sdc                                                       3.6T
├─sdc1 2a2bb3de-a9f6-11e5-b3e9-002590869c3c                 2G
└─sdc2 2a3c8e32-a9f6-11e5-b3e9-002590869c3c zfs_member   3.6T
sdd                                                       3.6T
├─sdd1 28c5f5d1-a9f6-11e5-b3e9-002590869c3c                 2G
└─sdd2 28d8c104-a9f6-11e5-b3e9-002590869c3c zfs_member   3.6T
sde                                                       3.6T
├─sde1 2815200e-a9f6-11e5-b3e9-002590869c3c                 2G
└─sde2 28279512-a9f6-11e5-b3e9-002590869c3c zfs_member   3.6T
sdf                                                      18.2T
sdg                                                       3.6T
├─sdg1 2769ac8f-a9f6-11e5-b3e9-002590869c3c                 2G
└─sdg2 277b82e6-a9f6-11e5-b3e9-002590869c3c zfs_member   3.6T
sdh                                                      18.2T
└─sdh1 056e938d-b5b9-449f-a6a6-acac72b82a87 zfs_member   2.7T
sdi                                                      18.2T
└─sdi1 87a9889c-557c-4692-b583-21b648dc7d7d zfs_member   2.7T
sdj                                                      18.2T
└─sdj1 35818415-8885-4671-bee1-d3d5a30970e3 zfs_member   2.7T
sdk                                                      18.2T
└─sdk1 e63df09f-e344-46a2-8653-f8d266886cd1 zfs_member   2.7T
sdl                                                      18.2T
└─sdl1 478c82ba-c501-4a7c-b2c7-d2fdc631efe0 zfs_member   2.7T
I start the replace/resilver process...
Code:
root@filestore[~]# zpool status
  pool: Tier3
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Jan 12 13:53:00 2024
        14.1T / 34.8T scanned at 6.78G/s, 1.80T / 26.8T issued at 886M/s
        303G resilvered, 6.72% done, 08:13:14 to go
config:

        NAME                                        STATE     READ WRITE CKSUM
        Tier3                                       DEGRADED     0     0     0
          raidz2-0                                  ONLINE       0     0     0
            26d2563c-a9f6-11e5-b3e9-002590869c3c    ONLINE       0     0     0
            277b82e6-a9f6-11e5-b3e9-002590869c3c    ONLINE       0     0     0
            28279512-a9f6-11e5-b3e9-002590869c3c    ONLINE       0     0     0
            28d8c104-a9f6-11e5-b3e9-002590869c3c    ONLINE       0     0     0
            298897cc-a9f6-11e5-b3e9-002590869c3c    ONLINE       0     0     0
            2a3c8e32-a9f6-11e5-b3e9-002590869c3c    ONLINE       0     0     0
          raidz2-1                                  DEGRADED     0     0     0
            replacing-0                             DEGRADED     0     0     0
              12098450062948849965                  OFFLINE      0     0     0  was /dev/disk/by-partuuid/67f1318c-713b-4b68-abbe-0c55a7c04d77
              4e1cf4fe-3853-4952-a40d-a6b76dbbff81  ONLINE       0     0     0  (resilvering)
            056e938d-b5b9-449f-a6a6-acac72b82a87    ONLINE       0     0     0
            87a9889c-557c-4692-b583-21b648dc7d7d    ONLINE       0     0     0
            478c82ba-c501-4a7c-b2c7-d2fdc631efe0    ONLINE       0     0     0
            35818415-8885-4671-bee1-d3d5a30970e3    ONLINE       0     0     0
            e63df09f-e344-46a2-8653-f8d266886cd1    ONLINE       0     0     0

errors: No known data errors
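(The GUI drives all of this, but as I understand it the underlying operation is roughly the following, give or take the middleware partitioning the new disk first:)

Code:
zpool replace Tier3 12098450062948849965 /dev/disk/by-partuuid/4e1cf4fe-3853-4952-a40d-a6b76dbbff81
zpool status Tier3   # watch the resilver tick along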
That completes and all is well... buuut, have we entered a temporal causality loop?
Code:
root@filestore[~]# zpool status
  pool: Tier3
 state: ONLINE
  scan: resilvered 2.55T in 04:03:34 with 0 errors on Fri Jan 12 17:56:34 2024
config:

        NAME                                      STATE     READ WRITE CKSUM
        Tier3                                     ONLINE       0     0     0
          raidz2-0                                ONLINE       0     0     0
            26d2563c-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            277b82e6-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            28279512-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            28d8c104-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            298897cc-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            2a3c8e32-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
          raidz2-1                                ONLINE       0     0     0
            4e1cf4fe-3853-4952-a40d-a6b76dbbff81  ONLINE       0     0     0
            056e938d-b5b9-449f-a6a6-acac72b82a87  ONLINE       0     0     0
            87a9889c-557c-4692-b583-21b648dc7d7d  ONLINE       0     0     0
            478c82ba-c501-4a7c-b2c7-d2fdc631efe0  ONLINE       0     0     0
            35818415-8885-4671-bee1-d3d5a30970e3  ONLINE       0     0     0
            e63df09f-e344-46a2-8653-f8d266886cd1  ONLINE       0     0     0

errors: No known data errors

root@filestore[~]# zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
Tier3    38T  34.8T  3.21T        -         -     9%    91%  1.00x  ONLINE  /mnt
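Sanity-checking that SIZE figure: 38T is exactly the old raw capacity... 6 x 2.7TiB (the former 3TB vdev) + 6 x 3.6TiB (the 4TB vdev) ≈ 16.2T + 21.6T ≈ 38T. Had raidz2-1 actually expanded, that vdev alone would report roughly 6 x 18.2TiB ≈ 109T, and the pool about 131T.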
No autoexpand! And yes, autoexpand is on:
Code:
root@filestore[~]# zpool get autoexpand Tier3
NAME   PROPERTY    VALUE   SOURCE
Tier3  autoexpand  on      local
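Another check that might help with diagnosis: the per-vdev view shows whether ZFS even sees expandable space (EXPANDSZ stays "-" unless the partition underneath actually grew):

Code:
zpool list -v Tier3           # per-vdev SIZE and EXPANDSZ
lsblk -o name,size /dev/sdh   # partition size vs. whole-disk size (hypothetical device)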
Go to the GUI, tell it to expand, and... yep:
Code:
root@filestore[~]# zpool status
  pool: Tier3
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
  scan: resilvered 2.55T in 04:03:34 with 0 errors on Fri Jan 12 17:56:34 2024
config:

        NAME                                      STATE     READ WRITE CKSUM
        Tier3                                     DEGRADED     0     0     0
          raidz2-0                                ONLINE       0     0     0
            26d2563c-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            277b82e6-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            28279512-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            28d8c104-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            298897cc-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            2a3c8e32-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
          raidz2-1                                DEGRADED     0     0     0
            4e1cf4fe-3853-4952-a40d-a6b76dbbff81  UNAVAIL      0     0     0
            056e938d-b5b9-449f-a6a6-acac72b82a87  ONLINE       0     0     0
            87a9889c-557c-4692-b583-21b648dc7d7d  ONLINE       0     0     0
            478c82ba-c501-4a7c-b2c7-d2fdc631efe0  ONLINE       0     0     0
            35818415-8885-4671-bee1-d3d5a30970e3  ONLINE       0     0     0
            e63df09f-e344-46a2-8653-f8d266886cd1  ONLINE       0     0     0

errors: No known data errors
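Next time around I'll grab the logs before rebooting... something like this (device name in the grep is hypothetical):

Code:
zpool events -v | tail -n 50                        # recent ZFS events around the failure
dmesg | grep -iE 'sdh|gpt|partition' | tail -n 50   # the kernel's take on the relabel attempt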
I'm kinda out of things to try on this one. I highly doubt I have a hardware issue - this stuff has been very reliable for many years. And yes, I've been through this same process multiple times with the same result.
The only possible thought I came up with was pool utilization... I was bad and let things get to 91%. Would that prevent the vdev from expanding? And why would it cause a drive to drop offline every time?
Unless someone has a better option, my next thought (which is expensive) is to procure a 16-port SAS HBA and 6 more drives: light up the 6 new drives in a new pool, replicate everything over from the old pool (which would still fit comfortably within a 6x20TB vdev), nuke all the old drives, then add the original 6x20TB drives back (after wiping them) as a second VDEV. On the plus side, even more free space! On the minus side, that's a $2.5K proposition that I'd kinda like to avoid.
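The replication step itself would at least be simple... a minimal sketch, assuming the new pool were named Tier4 and using a hypothetical @migrate snapshot:

Code:
zfs snapshot -r Tier3@migrate                      # recursive snapshot of every dataset
zfs send -R Tier3@migrate | zfs receive -F Tier4   # -R carries the dataset tree, snapshots, and properties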
Does anyone see something I'm missing?