Good day:
I've been away from the forums for years, but have run into a perplexing problem. Let's start with the system:
Supermicro X11SSH-LN4F motherboard
E3-1275v6 CPU
64GB ECC RAM
This system has been iteratively upgraded over the years, starting on TrueNAS CORE and now moving to SCALE. The only pool that currently exists on the system consists of two RAIDZ2 VDEVs... one 6x3TB, one 6x4TB. One 6-drive set is connected to the onboard SATA controller; the other is connected to the venerable SAS2008 HBA, properly flashed to IT mode. This hardware has been dead-nuts reliable for many years.
One of the 3TB drives started throwing a couple of SMART errors and, given those drives are 10 years old with more than 80,000 hours each, I figured it might be smart to get a little proactive and swap them. Plus, hey, more space, right? So, I picked up six 20TB Seagate Exos enterprise drives. Each drive went through a proper break-in process... short/conveyance SMART tests, a badblocks run (geez, that took for-effing-ever), and a SMART long test. Fast forward almost 10 days and I've got six minty, proven drives ready to go.
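For anyone curious, the per-drive burn-in was essentially the following (device name hypothetical; the badblocks pass is destructive, so blank drives only):

Code:
smartctl -t short /dev/sdX
smartctl -t conveyance /dev/sdX
badblocks -b 8192 -ws /dev/sdX   # destructive write test; the larger block size keeps the block count within badblocks' 32-bit limit on a 20TB drive
smartctl -t long /dev/sdX
smartctl -a /dev/sdX             # review the results once the long test finishes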
I go through the standard process of carefully swapping one drive at a time, doing the replace, and letting the resilver complete. The process goes swimmingly - no errors, no downtime - but *err?* the pool doesn't grow once the 6th drive is replaced. Autoexpand is on. I try a manual expand from the GUI, which fails with an error (which I didn't write down and can't reproduce without another 24-hour process) about the partitions not being written, the kernel not being informed, and a reboot being recommended. The pool then drops to a degraded state, with one drive showing unavailable.
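For reference, my understanding is that the GUI expand boils down to roughly this at the CLI (hypothetical /dev/sdh; the partuuid is one of my actual devices from the output below):

Code:
parted /dev/sdh unit GB print        # sanity check: does the ZFS partition actually span the whole 20TB disk?
blockdev --rereadpt /dev/sdh         # nudge the kernel to re-read the partition table (partprobe /dev/sdh works too)
zpool online -e Tier3 056e938d-b5b9-449f-a6a6-acac72b82a87   # grow ZFS onto the enlarged partition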
I figure something's weird, because the very same gptid shows as online. And, of course, the drive now has ZFS data on it, so TN really doesn't want to make it part of the pool. So, I did a complete wipe to zero from the GUI. This is after the wipe, before replacement:
Code:
root@filestore[~]# zpool status
  pool: Tier3
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: scrub repaired 0B in 07:57:12 with 0 errors on Fri Jan 12 11:57:17 2024
config:

        NAME                                      STATE     READ WRITE CKSUM
        Tier3                                     DEGRADED     0     0     0
          raidz2-0                                ONLINE       0     0     0
            26d2563c-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            277b82e6-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            28279512-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            28d8c104-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            298897cc-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            2a3c8e32-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
          raidz2-1                                DEGRADED     0     0     0
            12098450062948849965                  OFFLINE      0     0     0  was /dev/disk/by-partuuid/67f1318c-713b-4b68-abbe-0c55a7c04d77
            056e938d-b5b9-449f-a6a6-acac72b82a87  ONLINE       0     0     0
            87a9889c-557c-4692-b583-21b648dc7d7d  ONLINE       0     0     0
            478c82ba-c501-4a7c-b2c7-d2fdc631efe0  ONLINE       0     0     0
            35818415-8885-4671-bee1-d3d5a30970e3  ONLINE       0     0     0
            e63df09f-e344-46a2-8653-f8d266886cd1  ONLINE       0     0     0

errors: No known data errors

root@filestore[~]# lsblk -o name,partuuid,fstype,size
NAME   PARTUUID                              FSTYPE       SIZE
sda                                                       3.6T
├─sda1 26c0769e-a9f6-11e5-b3e9-002590869c3c                 2G
└─sda2 26d2563c-a9f6-11e5-b3e9-002590869c3c zfs_member   3.6T
sdb                                                       3.6T
├─sdb1 2976efe0-a9f6-11e5-b3e9-002590869c3c                 2G
└─sdb2 298897cc-a9f6-11e5-b3e9-002590869c3c zfs_member   3.6T
sdc                                                       3.6T
├─sdc1 2a2bb3de-a9f6-11e5-b3e9-002590869c3c                 2G
└─sdc2 2a3c8e32-a9f6-11e5-b3e9-002590869c3c zfs_member   3.6T
sdd                                                       3.6T
├─sdd1 28c5f5d1-a9f6-11e5-b3e9-002590869c3c                 2G
└─sdd2 28d8c104-a9f6-11e5-b3e9-002590869c3c zfs_member   3.6T
sde                                                       3.6T
├─sde1 2815200e-a9f6-11e5-b3e9-002590869c3c                 2G
└─sde2 28279512-a9f6-11e5-b3e9-002590869c3c zfs_member   3.6T
sdf                                                      18.2T
sdg                                                       3.6T
├─sdg1 2769ac8f-a9f6-11e5-b3e9-002590869c3c                 2G
└─sdg2 277b82e6-a9f6-11e5-b3e9-002590869c3c zfs_member   3.6T
sdh                                                      18.2T
└─sdh1 056e938d-b5b9-449f-a6a6-acac72b82a87 zfs_member   2.7T
sdi                                                      18.2T
└─sdi1 87a9889c-557c-4692-b583-21b648dc7d7d zfs_member   2.7T
sdj                                                      18.2T
└─sdj1 35818415-8885-4671-bee1-d3d5a30970e3 zfs_member   2.7T
sdk                                                      18.2T
└─sdk1 e63df09f-e344-46a2-8653-f8d266886cd1 zfs_member   2.7T
sdl                                                      18.2T
└─sdl1 478c82ba-c501-4a7c-b2c7-d2fdc631efe0 zfs_member   2.7T
I start the replace/resilver process...
Code:
root@filestore[~]# zpool status
  pool: Tier3
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Jan 12 13:53:00 2024
        14.1T / 34.8T scanned at 6.78G/s, 1.80T / 26.8T issued at 886M/s
        303G resilvered, 6.72% done, 08:13:14 to go
config:

        NAME                                        STATE     READ WRITE CKSUM
        Tier3                                       DEGRADED     0     0     0
          raidz2-0                                  ONLINE       0     0     0
            26d2563c-a9f6-11e5-b3e9-002590869c3c    ONLINE       0     0     0
            277b82e6-a9f6-11e5-b3e9-002590869c3c    ONLINE       0     0     0
            28279512-a9f6-11e5-b3e9-002590869c3c    ONLINE       0     0     0
            28d8c104-a9f6-11e5-b3e9-002590869c3c    ONLINE       0     0     0
            298897cc-a9f6-11e5-b3e9-002590869c3c    ONLINE       0     0     0
            2a3c8e32-a9f6-11e5-b3e9-002590869c3c    ONLINE       0     0     0
          raidz2-1                                  DEGRADED     0     0     0
            replacing-0                             DEGRADED     0     0     0
              12098450062948849965                  OFFLINE      0     0     0  was /dev/disk/by-partuuid/67f1318c-713b-4b68-abbe-0c55a7c04d77
              4e1cf4fe-3853-4952-a40d-a6b76dbbff81  ONLINE       0     0     0  (resilvering)
            056e938d-b5b9-449f-a6a6-acac72b82a87    ONLINE       0     0     0
            87a9889c-557c-4692-b583-21b648dc7d7d    ONLINE       0     0     0
            478c82ba-c501-4a7c-b2c7-d2fdc631efe0    ONLINE       0     0     0
            35818415-8885-4671-bee1-d3d5a30970e3    ONLINE       0     0     0
            e63df09f-e344-46a2-8653-f8d266886cd1    ONLINE       0     0     0

errors: No known data errors
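(The GUI drives all of this, but as I understand it the underlying operation is roughly the following, give or take the middleware partitioning the new disk first:)

Code:
zpool replace Tier3 12098450062948849965 /dev/disk/by-partuuid/4e1cf4fe-3853-4952-a40d-a6b76dbbff81
zpool status Tier3   # watch the resilver tick along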
That completes and all is well... buuut, have we entered a temporal causality loop?
Code:
root@filestore[~]# zpool status
  pool: Tier3
 state: ONLINE
  scan: resilvered 2.55T in 04:03:34 with 0 errors on Fri Jan 12 17:56:34 2024
config:

        NAME                                      STATE     READ WRITE CKSUM
        Tier3                                     ONLINE       0     0     0
          raidz2-0                                ONLINE       0     0     0
            26d2563c-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            277b82e6-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            28279512-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            28d8c104-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            298897cc-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            2a3c8e32-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
          raidz2-1                                ONLINE       0     0     0
            4e1cf4fe-3853-4952-a40d-a6b76dbbff81  ONLINE       0     0     0
            056e938d-b5b9-449f-a6a6-acac72b82a87  ONLINE       0     0     0
            87a9889c-557c-4692-b583-21b648dc7d7d  ONLINE       0     0     0
            478c82ba-c501-4a7c-b2c7-d2fdc631efe0  ONLINE       0     0     0
            35818415-8885-4671-bee1-d3d5a30970e3  ONLINE       0     0     0
            e63df09f-e344-46a2-8653-f8d266886cd1  ONLINE       0     0     0

errors: No known data errors

root@filestore[~]# zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
Tier3    38T  34.8T  3.21T        -         -     9%    91%  1.00x  ONLINE  /mnt
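Sanity-checking that SIZE figure: 38T is exactly the old raw capacity... 6 x 2.7TiB (the former 3TB vdev) + 6 x 3.6TiB (the 4TB vdev) ≈ 16.2T + 21.6T ≈ 38T. Had raidz2-1 actually expanded, that vdev alone would report roughly 6 x 18.2TiB ≈ 109T, and the pool about 131T.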
No autoexpand! And yes, autoexpand is on:
Code:
root@filestore[~]# zpool get autoexpand Tier3
NAME   PROPERTY    VALUE   SOURCE
Tier3  autoexpand  on      local
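Another check that might help with diagnosis: the per-vdev view shows whether ZFS even sees expandable space (EXPANDSZ stays "-" unless the partition underneath actually grew):

Code:
zpool list -v Tier3           # per-vdev SIZE and EXPANDSZ
lsblk -o name,size /dev/sdh   # partition size vs. whole-disk size (hypothetical device)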
Go to the GUI, tell it to expand, and... yep:
Code:
root@filestore[~]# zpool status
  pool: Tier3
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
  scan: resilvered 2.55T in 04:03:34 with 0 errors on Fri Jan 12 17:56:34 2024
config:

        NAME                                      STATE     READ WRITE CKSUM
        Tier3                                     DEGRADED     0     0     0
          raidz2-0                                ONLINE       0     0     0
            26d2563c-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            277b82e6-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            28279512-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            28d8c104-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            298897cc-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
            2a3c8e32-a9f6-11e5-b3e9-002590869c3c  ONLINE       0     0     0
          raidz2-1                                DEGRADED     0     0     0
            4e1cf4fe-3853-4952-a40d-a6b76dbbff81  UNAVAIL      0     0     0
            056e938d-b5b9-449f-a6a6-acac72b82a87  ONLINE       0     0     0
            87a9889c-557c-4692-b583-21b648dc7d7d  ONLINE       0     0     0
            478c82ba-c501-4a7c-b2c7-d2fdc631efe0  ONLINE       0     0     0
            35818415-8885-4671-bee1-d3d5a30970e3  ONLINE       0     0     0
            e63df09f-e344-46a2-8653-f8d266886cd1  ONLINE       0     0     0

errors: No known data errors
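Next time around I'll grab the logs before rebooting... something like this (device name in the grep is hypothetical):

Code:
zpool events -v | tail -n 50                        # recent ZFS events around the failure
dmesg | grep -iE 'sdh|gpt|partition' | tail -n 50   # the kernel's take on the relabel attempt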
I'm kinda out of things to try on this one. I highly doubt I have a hardware issue - this stuff has been very reliable for many years. And yes, I've been through this same process multiple times with the same result.
The only possible thought I came up with was pool utilization... I was bad and let things get to 91%. Would that prevent the vdev from expanding? And why would it cause a drive to drop offline every time?
Unless someone has a better option, my next thought (which is expensive) is to procure a 16-port SAS HBA and 6 more drives: light up the 6 new drives in a new pool, replicate everything over from the old pool (which would still fit comfortably within a 6x20TB vdev), nuke all the old drives, then add the original 6x20TB drives back (after wiping them) as a second VDEV. On the plus side, even more free space! On the minus side, that's a $2.5K proposition that I'd kinda like to avoid.
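The replication step itself would at least be simple... a minimal sketch, assuming the new pool were named Tier4 and using a hypothetical @migrate snapshot:

Code:
zfs snapshot -r Tier3@migrate                      # recursive snapshot of every dataset
zfs send -R Tier3@migrate | zfs receive -F Tier4   # -R carries the dataset tree, snapshots, and properties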
Does anyone see something I'm missing?