Sanity check - Just got an alert about a failing drive that was replaced over a month ago.

EvilRSA

Cadet
Joined
Sep 21, 2023
Messages
3
Quick backstory: we inherited two TrueNAS systems for one of our clients because the other IT company was not supporting them to the degree our client expected. Both TrueNAS systems serve VMware storage via iSCSI. When we started digging into these systems, we saw that their pools were nearly maxed out, and one of them had a pool of 8 "off the shelf" PNY SSDs that were all failing. We got the details and a quote in front of the client for 8 new drives, and the client signed right away. The new drives came in, and I offlined one drive at a time, installed a new drive, selected Replace, and let the system resilver each time. All eight PNY drives were replaced with Micron 5400s, and the system has been running without errors for about 45 days now.

This weekend we got the following alert:
New alert:
* Pool SSD state is DEGRADED: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
The following devices are not healthy:

* Disk PNY 1TB SATA SSD PNF16222322780900258 is DEGRADED

The part I can't figure out is where did it pull the old "Disk PNY 1TB SATA..." name from?

I checked the pool status, and see that ada4 is showing as DEGRADED

I pulled the stats for the pool and for ada4:

Code:
Last login: Thu Sep 21 17:32:03 on pts/25
FreeBSD 13.1-RELEASE-p7 n245428-4dfb91682c1 TRUENAS

        TrueNAS (c) 2009-2023, iXsystems, Inc.
        All rights reserved.
        TrueNAS code is released under the modified BSD license with some
        files copyrighted by (c) iXsystems, Inc.

        For more information, documentation, help or support, go here:
        http://truenas.com
Welcome to TrueNAS

Warning: the supported mechanisms for making configuration changes
are the TrueNAS WebUI and API exclusively. ALL OTHERS ARE
NOT SUPPORTED AND WILL RESULT IN UNDEFINED BEHAVIOR AND MAY
RESULT IN SYSTEM FAILURE.

root@truenas[~]# zpool status -v
  pool: Pool1
 state: ONLINE
  scan: scrub repaired 0B in 00:18:00 with 0 errors on Sun Sep 24 00:18:00 2023
config:

        NAME                                          STATE     READ WRITE CKSUM
        Pool1                                         ONLINE       0     0     0
          gptid/21b9ac1b-e8b0-11ec-9a8c-008cfaf09b08  ONLINE       0     0     0

errors: No known data errors

  pool: SSD
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 01:06:46 with 4 errors on Sun Sep 24 01:06:46 2023
config:

        NAME                                            STATE     READ WRITE CKSUM
        SSD                                             DEGRADED     0     0     0
          raidz1-0                                      DEGRADED     0     0     0
            gptid/525f5ed8-3b5e-11ee-bac2-008cfaf09b08  ONLINE       0     0    32
            gptid/d6e03b8a-3cee-11ee-bac2-008cfaf09b08  ONLINE       0     0    32
            gptid/67814ed4-3777-11ee-bac2-008cfaf09b08  ONLINE       0     0    32
            gptid/5a658ccc-3d17-11ee-bac2-008cfaf09b08  ONLINE       0     0    32
            gptid/a549fa34-3842-11ee-bac2-008cfaf09b08  DEGRADED     0     0    32  too many errors
            gptid/29fa84fd-3ab6-11ee-bac2-008cfaf09b08  ONLINE       0     0    32
            gptid/d2ddef69-3c27-11ee-bac2-008cfaf09b08  ONLINE       0     0    32
            gptid/27f7de11-3dc3-11ee-bac2-008cfaf09b08  ONLINE       0     0    32

errors: Permanent errors have been detected in the following files:

        SSD/Bkup-Zvol:<0x1>

  pool: boot-pool
 state: ONLINE
  scan: scrub repaired 0B in 00:01:47 with 0 errors on Wed Sep 20 03:46:47 2023
config:

        NAME        STATE     READ WRITE CKSUM
        boot-pool   ONLINE       0     0     0
          da0p2     ONLINE       0     0     0

errors: No known data errors
root@truenas[~]#
root@truenas[~]# smartctl -a /dev/ada4
smartctl 7.2 2021-09-14 r5236 [FreeBSD 13.1-RELEASE-p7 amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     Micron_5400_MTFDDAK1T9TGA
Serial Number:    22443C3B5571
LU WWN Device Id: 5 00a075 13c3b5571
Firmware Version: D4MU002
User Capacity:    1,920,383,410,176 bytes [1.92 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
TRIM Command:     Available, deterministic, zeroed
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-4 (minor revision not indicated)
SATA Version is:  SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue Sep 26 13:27:12 2023 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 ( 3403) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  14) minutes.
Conveyance self-test routine
recommended polling time:        (   3) minutes.
SCT capabilities:              (0x0035) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   050    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0032   100   100   001    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       1108
 12 Power_Cycle_Count       0x0032   100   100   001    Old_age   Always       -       3
170 Unknown_Attribute       0x0033   100   100   010    Pre-fail  Always       -       0
171 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
172 Unknown_Attribute       0x0032   100   100   001    Old_age   Always       -       0
173 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       3
174 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       1
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   000    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   078   067   000    Old_age   Always       -       22 (Min/Max 15/33)
195 Hardware_ECC_Recovered  0x0032   100   100   000    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       0
202 Unknown_SSD_Attribute   0x0030   100   100   001    Old_age   Offline      -       0
206 Unknown_SSD_Attribute   0x000e   100   100   000    Old_age   Always       -       0
246 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       2747849698
247 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       85868231
248 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       25816908
180 Unused_Rsvd_Blk_Cnt_Tot 0x0033   100   100   000    Pre-fail  Always       -       11980
210 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
211 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       4
212 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       318

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      1094         -
# 2  Short offline       Completed without error       00%      1070         -
# 3  Short offline       Completed without error       00%      1046         -
# 4  Short offline       Completed without error       00%      1022         -
# 5  Short offline       Completed without error       00%       998         -
# 6  Short offline       Completed without error       00%       974         -
# 7  Short offline       Completed without error       00%       950         -
# 8  Short offline       Completed without error       00%       926         -
# 9  Short offline       Completed without error       00%       902         -
#10  Short offline       Completed without error       00%       878         -
#11  Short offline       Completed without error       00%       854         -
#12  Short offline       Completed without error       00%       830         -
#13  Short offline       Completed without error       00%       806         -
#14  Short offline       Completed without error       00%       782         -
#15  Short offline       Completed without error       00%       758         -
#16  Short offline       Completed without error       00%       734         -
#17  Short offline       Completed without error       00%       710         -
#18  Short offline       Completed without error       00%       686         -
#19  Short offline       Completed without error       00%       662         -
#20  Short offline       Completed without error       00%       638         -
#21  Short offline       Completed without error       00%       614         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Completed [00% left] (96631296-96696831)
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

root@truenas[~]#

So I don't know if I missed a step in drive replacement that clears out old names or old status conditions, but I just don't know where it pulled the old drive name from. I honestly got worried that there was still another one of the failing PNY drives in the system, but looking at the disks, I don't see one.
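As a quick cross-check that ada4 really is one of the new drives, the Power_On_Hours raw value in the smartctl output (1108) can be converted to days and compared with the "about 45 days" since the swap. This is only a minimal Python sketch of that arithmetic; the function name is made up for illustration.

```python
# Power_On_Hours raw value reported by smartctl for ada4 in the post above.
POWER_ON_HOURS = 1108


def hours_to_days(hours):
    """Convert a SMART Power_On_Hours raw value to days of powered-on time."""
    return hours / 24


days = hours_to_days(POWER_ON_HOURS)
print(f"ada4 has been powered on for about {days:.1f} days")
```

A result in the mid-40s of days fits a drive installed during the replacement round rather than one of the original PNY units.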

Would love some guidance on how to best handle this.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
You seem to have a load of chksum errors - which quite often indicates a cabling issue or an HBA problem. Given it's a client machine, I guess it isn't the machine in your sig, which mentions WD Pros - which means that we have no idea what we are dealing with. Is this a proper server, or something made out of gaming gear?

In short please post the full hardware, including case & PSU.

Minor comment - which model of Micron 5400 are these? Depending on the model, I would consider them unsuitable for use as an iSCSI volume. These seem to be the PRO model - which IMHO falls into the not terribly suitable category (not that that would cause this issue).


Oh - and "SSD/Bkup-Zvol:<0x1>" - that's a metadata issue that will need the zvol trashing and setting up again. Given it's a backup zvol I doubt this will cause a major issue - however, please make sure your backups are good.

Also - iSCSI and RAIDZ - that's a recipe for a very underperforming pool. That coupled with Pro drives - it's not a recipe for success IMHO.
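To make the checksum observation concrete: every member of raidz1-0 in the posted `zpool status -v` output shows the same CKSUM count, not just the DEGRADED one. A rough Python sketch of pulling those counts out of saved status text follows. The sample is a two-row excerpt of the posted output as I read its columns (READ 0, WRITE 0, CKSUM 32); the function name and regular expression are my own assumptions, not part of any ZFS tooling.

```python
import re

# Two member rows excerpted from the `zpool status -v` output posted above.
SAMPLE = """\
  NAME                                            STATE     READ WRITE CKSUM
  SSD                                             DEGRADED     0     0     0
    raidz1-0                                      DEGRADED     0     0     0
      gptid/525f5ed8-3b5e-11ee-bac2-008cfaf09b08  ONLINE       0     0    32
      gptid/a549fa34-3842-11ee-bac2-008cfaf09b08  DEGRADED     0     0    32  too many errors
"""

# name, state, READ, WRITE, CKSUM. Plain integers only: abbreviated counts
# such as "1.2K" are deliberately not handled in this sketch.
ROW = re.compile(
    r"^\s*(\S+)\s+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)"
    r"\s+(\d+)\s+(\d+)\s+(\d+)"
)


def devices_with_cksum_errors(status_text):
    """Return {device name: CKSUM count} for rows with a non-zero CKSUM column."""
    result = {}
    for line in status_text.splitlines():
        match = ROW.match(line)
        if match and int(match.group(5)) > 0:
            result[match.group(1)] = int(match.group(5))
    return result


for name, count in devices_with_cksum_errors(SAMPLE).items():
    print(f"{name}: {count} checksum errors")
```

Identical counts across all members point toward something shared (controller, backplane, or already-damaged data) rather than one bad drive.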
 

EvilRSA

Cadet
Joined
Sep 21, 2023
Messages
3
I can't believe I posted that without the system information, sorry about that.

The server is:
TrueNAS core: 13.0-U5.3
Chassis: 8-bay IBM Lenovo ThinkServer RD450
PSU: Dual hot-swappable 550-watt PSUs.
CPU: Intel(R) Xeon(R) CPU E5-2603 v3 @ 1.60GHz
RAM: 16GB
Boot: 32 GB SanDisk 3.2 Gen1 microSD in a microSD to USB adapter. (I wish I was joking)
Pool "SSD": Micron 5400 PRO - SSD - 1.92 TB - SATA 6Gb/s x8
Pool "Pool1": Samsung 3.2TB NVME PCIe 3.0 x8 PM1725 x1
Log: not present
NIC: Dual 1 Gb NIC on server board and PCIe 10 Gb SFP for iSCSI

You are correct, the Micron 5400s are the 1.92 TB PRO variants.

I don't know the history of the "Bkup-Zvol" volume, but somewhere along the line, the previous company moved two live VMware guests onto this pool. I'm sure that there is corruption: when I replaced the first drive, I chose the one that seemed closest to death, and that resilver did not complete without errors. Each drive after that seemed to yield the same error count, and I just told myself, let's get these failing drives replaced before we lose the whole pool, and future me will deal with the errors. Maybe I was too panicky, but seeing one drive already reporting 101 offline uncorrectable sectors and others with error counts of 42,646 and 27,452, I felt this was a house of cards.

I don't typically mess with iSCSI, so I'm pretty inexperienced with it and will do some searching in the forums and Googling for best or better configurations, but it is good to know that we're already in a bad configuration with iSCSI running in a RAIDZ.

My plan is to see if it's possible to free up enough space on their other pools to move both of the servers stored on the "Bkup-Zvol" volume to different pools and then delete the "Bkup-Zvol" pool and recreate a new pool. In the meantime, I need to nurse this pool along until I can make that happen.

Please ask any additional questions about hardware and setup as needed.


I welcome any criticism as I didn't build these NASs, but I get to support them now. (Yay me.) There are definitely things I would do differently with these boxes and possible pitfalls I would have also landed in. Pointing out any additional things that are "wrong" or not "best practice" won't hurt my feelings, as it just helps me support this client better if I know about them and I can address them. :)

Thanks for any and all help from everyone.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
OK - things to do (IMO)
1. Bump RAM to 64GB
2. Move all VMs off this machine onto other storage / other servers, then redo the pool completely as mirrors. Note that this will significantly reduce available disk space. After that, do not use > 50% of the pool as zvol space (or anything else). VMs need IOPS; RAIDZn does not supply IOPS
3. Replace the Samsung PM1725 with an Optane drive (yes, you can still get them - but get a few of them as spares; 900P or better, 1600x is another option), or an RMS300/16G or similar. Use this drive as a SLOG. You might be OK using the Samsung PM1725 as a SLOG, but it's not ideal (and a waste)
4. Honestly - those CPUs (is it a dual or a single?) are a little slow for my liking
5. Longer term, replace those SSDs with the MAX variant (or similar). Something that's either mixed-mode endurance or better (thus more expensive, but lasts longer)
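To put rough numbers on point 2, here is a back-of-the-envelope Python sketch comparing the current 8-wide RAIDZ1 with four striped 2-way mirrors, using the 1.92 TB drive size from the hardware list. It ignores ZFS metadata, padding, and TB/TiB differences, so treat the output as an approximation; the function names are illustrative only.

```python
DRIVE_TB = 1.92  # Micron 5400 PRO capacity from the hardware list
DRIVES = 8


def raidz1_usable_tb(drives, size_tb):
    """Rough RAIDZ1 capacity: one drive's worth of space goes to parity."""
    return (drives - 1) * size_tb


def mirror_usable_tb(drives, size_tb):
    """Rough capacity of striped 2-way mirrors: half the drives hold copies."""
    return (drives // 2) * size_tb


raidz1 = raidz1_usable_tb(DRIVES, DRIVE_TB)
mirrors = mirror_usable_tb(DRIVES, DRIVE_TB)
zvol_budget = mirrors * 0.5  # the "do not use > 50%" guideline above

print(f"RAIDZ1 (current layout): ~{raidz1:.2f} TB")
print(f"4 x 2-way mirrors:       ~{mirrors:.2f} TB")
print(f"50% budget on mirrors:   ~{zvol_budget:.2f} TB")
```

That is the "significantly reduce available disk space" part in figures: the same eight drives go from roughly 13 TB to under 4 TB of space intended for zvols.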

However, you should NOT be getting chksum errors - that's really bad (whilst not immediately fatal). You have no HBA, meaning that the drives are plugged into the motherboard. You could try replacing the cables - but honestly I doubt that's the problem - not 8 separate cables. My opinion is that you need to move the VMs elsewhere whilst you work on this server to find out what's wrong. It might be that the SATA controller is starting to fail. Actually, reading the spec, the drive connectivity is SAS - meaning that there is some sort of controller. I quote:
  • The RD450 provides Lenovo AnyRAID technology, a midplane RAID adapter that connects directly to the drive backplane without consuming a PCIe slot.
This is a red flag warning - this machine may be entirely*** unsuitable to use with TrueNAS. I/You need to know a lot more about how the drives are being connected to the motherboard / server and what this AnyRAID is. You might do better buying a 2nd hand LSI HBA 93xx, not a MegaRAID, and putting that into a slot and attaching the drives to that after flashing it with the right firmware. That may or may not be possible

Sorry to be the bearer of bad news. But at first glance, just about everything that could be done wrong has been done wrong. The SSD model is a relatively minor issue - you are just likely to burn these out relatively quickly rather than cause issues, and even that depends on how much actual writing is done to the pool (you have a DWPD of 1.5) - but remember you are writing to every disk at the same time, so you do not have 8 drives' worth of endurance, just one. [It's still approx 2.9 TB a day - so still a lot; they won't burn out quickly]
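The daily write figure is simply the DWPD rating multiplied by drive capacity. A minimal Python sketch of that calculation, using the 1.5 DWPD and 1.92 TB values mentioned in this thread (the function name is hypothetical):

```python
DWPD = 1.5          # drive writes per day, as quoted above
CAPACITY_TB = 1.92  # per-drive capacity from the hardware list


def daily_write_budget_tb(dwpd, capacity_tb):
    """Rated write volume per day for one drive, in TB."""
    return dwpd * capacity_tb


budget = daily_write_budget_tb(DWPD, CAPACITY_TB)
print(f"Rated write budget: ~{budget:.2f} TB per day")
```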

[edit] ***entirely may be a little unfair - but I suspect that the current configuration of the machine is both poor in general and potentially data fatal in its current configuration
 

EvilRSA

Cadet
Joined
Sep 21, 2023
Messages
3
While trying to find more information on this "AnyRAID" controller, I found that this server is considered "Config 1" as it has 8 3.5-inch drive bays. The AnyRAID controller only applies to Config 2 through 5 which are all 2.5-inch drive bay configurations. Reading further I see that this server shipped with the following configuration of Storage controllers:
Models with 3.5-inch drives support either an embedded controller or a PCIe RAID adapter.
Supported controllers are as follows:
  • Embedded RAID 110i controller in the Intel C610 PCH routed via cable to the backplane. SATA drives only. RAID arrays can include at most six drives.
  • RAID 500 adapter supporting RAID 0/1/10 with optional RAID 5/50
  • RAID 710 adapter supporting RAID 0/1/5/6/10/50/60 with 1 GB cache standard. Optional CacheVault, CacheCade, and FastPath support.
So, I then ran cat /var/run/dmesg.boot to see if I could find anything that points to one of the above, and I found "Intel Wellsburg AHCI SATA controller" and a quick Google search shows that is also known as the Intel C610 Series Chipset AHCI Controller.
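For anyone repeating this check, a small Python sketch of filtering saved dmesg.boot text for storage controller attach lines is below. The sample lines are abbreviated copies of lines from the dmesg output in this post. The driver prefix list (ahci, nvme, mps, mpr, mrsas, mfi) is my assumption about which FreeBSD storage drivers are worth looking for, and the function name is illustrative.

```python
import re

# Abbreviated lines from the dmesg.boot output in this post.
SAMPLE = """\
nvme0: <Samsung PM1725> mem 0xfb500000-0xfb503fff irq 32 at device 0.0 numa-domain 0 on pci3
ahci0: <Intel Wellsburg AHCI SATA controller> port 0xe110-0xe117 irq 16 at device 17.4 numa-domain 0 on pci1
ahci0: AHCI v1.30 with 4 6Gbps ports, Port Multiplier not supported
igb0: <Intel(R) I210 (Copper)> port 0xd000-0xd01f irq 16 at device 0.0 numa-domain 0 on pci8
"""

# Assumed list of storage driver names; extend as needed for other hardware.
ATTACH = re.compile(r"^((?:ahci|nvme|mps|mpr|mrsas|mfi)\d+): <([^>]+)>")


def storage_controllers(dmesg_text):
    """Return (device, description) pairs for storage controller attach lines."""
    found = []
    for line in dmesg_text.splitlines():
        match = ATTACH.match(line)
        if match:
            found.append((match.group(1), match.group(2)))
    return found


for device, description in storage_controllers(SAMPLE):
    print(f"{device}: {description}")
```

On a live system the same function could be fed the contents of /var/run/dmesg.boot instead of the embedded sample.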

I also saw all the PNY drives are still listed. I'm assuming that is where TrueNAS pulled the old drive name from...is fixing this part just as simple as rebooting the server and letting it detect that ada0 through ada7 are now the Micron PRO drives?

Huge thank you to @NugentS for the help and guidance. This is server 1 of 2 that the other company cobbled together for this client, and I'm going to take a look at the other server, an HP Proliant SE326M1 that is only using 12 of its 25 2.5-inch bays, leaving 13 bays open. If this unit turns out to be a better chassis, I think we'll look into purchasing 8 drive caddies, moving the Micron drives to this server, and building a new pool. I know you said to build it as a mirror to make sure IOPS don't suffer, so I'll have to do some reading on best practices for that. I'm old school and generally just build RAID 10 for most stuff. I got burned hard on a RAID 5 once, years ago, and had to use RAID Reconstructor by Runtime Software to recover the RAID. It worked in the end, but took about 6 days to recover all the data.

I'll post the output here in case there's anything else interesting or problematic that I didn't see.

Code:
login as: root
root@10.1.10.98's password:
Last login: Tue Sep 26 13:26:12 2023
FreeBSD 13.1-RELEASE-p7 n245428-4dfb91682c1 TRUENAS

        TrueNAS (c) 2009-2023, iXsystems, Inc.
        All rights reserved.
        TrueNAS code is released under the modified BSD license with some
        files copyrighted by (c) iXsystems, Inc.

        For more information, documentation, help or support, go here:
        http://truenas.com
Welcome to TrueNAS

Warning: the supported mechanisms for making configuration changes
are the TrueNAS WebUI and API exclusively. ALL OTHERS ARE
NOT SUPPORTED AND WILL RESULT IN UNDEFINED BEHAVIOR AND MAY
RESULT IN SYSTEM FAILURE.

root@truenas[~]# cat /var/run/dmesg.boot
Copyright (c) 1992-2021 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 13.1-RELEASE-p7 n245428-4dfb91682c1 TRUENAS amd64
FreeBSD clang version 13.0.0 (git@github.com:llvm/llvm-project.git llvmorg-13.0.0-0-gd7b669b3a303)
VT(vga): text 80x25
CPU: Intel(R) Xeon(R) CPU E5-2603 v3 @ 1.60GHz (1596.35-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x306f2  Family=0x6  Model=0x3f  Stepping=2
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x7ffefbff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x21<LAHF,ABM>
  Structured Extended Features=0x37ab<FSGSBASE,TSCADJ,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,PQM,NFPUSG>
  XSAVE Features=0x1<XSAVEOPT>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
  TSC: P-state invariant, performance statistics
real memory  = 17179869184 (16384 MB)
avail memory = 16413581312 (15653 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <LENOVO SV-INT  >
FreeBSD/SMP: Multiprocessor System Detected: 6 CPUs
FreeBSD/SMP: 1 package(s) x 6 core(s)
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
random: unblocking device.
ioapic0 <Version 2.0> irqs 0-23
ioapic1 <Version 2.0> irqs 24-47
Launching APs: 4 3 2 5 1
random: entropy device external interface
kbd1 at kbdmux0
vtvga0: <VT VGA driver>
aesni0: <AES-CBC,AES-CCM,AES-GCM,AES-ICM,AES-XTS>
padlock0: No ACE support.
acpi0: <LENOVO SV-INT>
acpi0: Power Button (fixed)
cpu0: <ACPI CPU> on acpi0
atrtc0: <AT realtime clock> port 0x70-0x71,0x74-0x77 irq 8 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.000000s
Event timer "RTC" frequency 32768 Hz quality 0
attimer0: <AT timer> port 0x40-0x43,0x50-0x53 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 550
Event timer "HPET1" frequency 14318180 Hz quality 440
Event timer "HPET2" frequency 14318180 Hz quality 440
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
pcib0: <ACPI Host-PCI bridge> on acpi0
pci0: <ACPI PCI bus> on pcib0
pci0: <dasp, performance counters> at device 11.1 (no driver attached)
pci0: <dasp, performance counters> at device 11.2 (no driver attached)
pci0: <dasp, performance counters> at device 16.1 (no driver attached)
pci0: <dasp, performance counters> at device 16.6 (no driver attached)
pci0: <dasp, performance counters> at device 18.1 (no driver attached)
acpi_syscontainer0: <System Container> on acpi0
acpi_syscontainer1: <System Container> on acpi0
acpi_syscontainer2: <System Container> on acpi0
acpi_syscontainer3: <System Container> on acpi0
apei0: <ACPI Platform Error Interface> on acpi0
pcib1: <ACPI Host-PCI bridge> port 0xcf8-0xcff numa-domain 0 on acpi0
pci1: <ACPI PCI bus> numa-domain 0 on pcib1
pcib2: <ACPI PCI-PCI bridge> irq 26 at device 1.0 numa-domain 0 on pci1
pci2: <ACPI PCI bus> numa-domain 0 on pcib2
pcib3: <ACPI PCI-PCI bridge> irq 32 at device 2.0 numa-domain 0 on pci1
pci3: <ACPI PCI bus> numa-domain 0 on pcib3
nvme0: <Samsung PM1725> mem 0xfb500000-0xfb503fff irq 32 at device 0.0 numa-domain 0 on pci3
pcib4: <ACPI PCI-PCI bridge> irq 32 at device 2.2 numa-domain 0 on pci1
pci4: <ACPI PCI bus> numa-domain 0 on pcib4
pci4: <network, ethernet> at device 0.0 (no driver attached)
pcib5: <ACPI PCI-PCI bridge> irq 40 at device 3.0 numa-domain 0 on pci1
pci5: <ACPI PCI bus> numa-domain 0 on pcib5
pcib6: <ACPI PCI-PCI bridge> irq 40 at device 3.2 numa-domain 0 on pci1
pci6: <ACPI PCI bus> numa-domain 0 on pcib6
ioat0: <HSW IOAT Ch0> mem 0x13ffff2c000-0x13ffff2ffff irq 31 at device 4.0 numa-domain 0 on pci1
ioat0: Capabilities: 2f7<PQ,Extended_APIC_ID,Block_Fill,Move_CRC,DCA,Marker_Skipping,CRC,Page_Break>
ioat1: <HSW IOAT Ch1> mem 0x13ffff28000-0x13ffff2bfff irq 39 at device 4.1 numa-domain 0 on pci1
ioat1: Capabilities: 2f7<PQ,Extended_APIC_ID,Block_Fill,Move_CRC,DCA,Marker_Skipping,CRC,Page_Break>
ioat2: <HSW IOAT Ch2> mem 0x13ffff24000-0x13ffff27fff irq 31 at device 4.2 numa-domain 0 on pci1
ioat2: Capabilities: f7<Extended_APIC_ID,Block_Fill,Move_CRC,DCA,Marker_Skipping,CRC,Page_Break>
ioat3: <HSW IOAT Ch3> mem 0x13ffff20000-0x13ffff23fff irq 39 at device 4.3 numa-domain 0 on pci1
ioat3: Capabilities: f7<Extended_APIC_ID,Block_Fill,Move_CRC,DCA,Marker_Skipping,CRC,Page_Break>
ioat4: <HSW IOAT Ch4> mem 0x13ffff1c000-0x13ffff1ffff irq 31 at device 4.4 numa-domain 0 on pci1
ioat4: Capabilities: f7<Extended_APIC_ID,Block_Fill,Move_CRC,DCA,Marker_Skipping,CRC,Page_Break>
ioat5: <HSW IOAT Ch5> mem 0x13ffff18000-0x13ffff1bfff irq 39 at device 4.5 numa-domain 0 on pci1
ioat5: Capabilities: f7<Extended_APIC_ID,Block_Fill,Move_CRC,DCA,Marker_Skipping,CRC,Page_Break>
ioat6: <HSW IOAT Ch6> mem 0x13ffff14000-0x13ffff17fff irq 31 at device 4.6 numa-domain 0 on pci1
ioat6: Capabilities: f7<Extended_APIC_ID,Block_Fill,Move_CRC,DCA,Marker_Skipping,CRC,Page_Break>
ioat7: <HSW IOAT Ch7> mem 0x13ffff10000-0x13ffff13fff irq 39 at device 4.7 numa-domain 0 on pci1
ioat7: Capabilities: f7<Extended_APIC_ID,Block_Fill,Move_CRC,DCA,Marker_Skipping,CRC,Page_Break>
pci1: <unknown> at device 17.0 (no driver attached)
ahci0: <Intel Wellsburg AHCI SATA controller> port 0xe110-0xe117,0xe100-0xe103,0xe0f0-0xe0f7,0xe0e0-0xe0e3,0xe020-0xe03f mem 0xfb604000-0xfb6047ff irq 16 at device 17.4 numa-domain 0 on pci1
ahci0: AHCI v1.30 with 4 6Gbps ports, Port Multiplier not supported
ahcich0: <AHCI channel> at channel 0 on ahci0
ahcich1: <AHCI channel> at channel 1 on ahci0
ahcich2: <AHCI channel> at channel 2 on ahci0
ahciem0: <AHCI enclosure management bridge> on ahci0
xhci0: <Intel Wellsburg USB 3.0 controller> mem 0x13ffff00000-0x13ffff0ffff irq 19 at device 20.0 numa-domain 0 on pci1
xhci0: 32 bytes context size, 64-bit DMA
usbus0 numa-domain 0 on xhci0
usbus0: 5.0Gbps Super Speed USB v3.0
pci1: <simple comms> at device 22.0 (no driver attached)
pci1: <simple comms> at device 22.1 (no driver attached)
ehci0: <Intel Wellsburg USB 2.0 controller> mem 0xfb602000-0xfb6023ff irq 18 at device 26.0 numa-domain 0 on pci1
usbus1: EHCI version 1.0
usbus1 numa-domain 0 on ehci0
usbus1: 480Mbps High Speed USB v2.0
pcib7: <ACPI PCI-PCI bridge> irq 16 at device 28.0 numa-domain 0 on pci1
pci7: <ACPI PCI bus> numa-domain 0 on pcib7
pcib8: <ACPI PCI-PCI bridge> irq 16 at device 28.4 numa-domain 0 on pci1
pci8: <ACPI PCI bus> numa-domain 0 on pcib8
igb0: <Intel(R) I210 (Copper)> port 0xd000-0xd01f mem 0xfb400000-0xfb47ffff,0xfb480000-0xfb483fff irq 16 at device 0.0 numa-domain 0 on pci8
igb0: EEPROM V3.31-0 eTrack 0x800005cc
igb0: Using 1024 TX descriptors and 1024 RX descriptors
igb0: Using 4 RX queues 4 TX queues
igb0: Using MSI-X interrupts with 5 vectors
igb0: Ethernet address: 00:8c:fa:f0:9b:08
pcib9: <ACPI PCI-PCI bridge> irq 17 at device 28.5 numa-domain 0 on pci1
pci9: <ACPI PCI bus> numa-domain 0 on pcib9
igb1: <Intel(R) I210 (Copper)> port 0xc000-0xc01f mem 0xfb300000-0xfb37ffff,0xfb380000-0xfb383fff irq 16 at device 0.0 numa-domain 0 on pci9
igb1: EEPROM V3.32-0 eTrack 0x800005d0
igb1: Using 1024 TX descriptors and 1024 RX descriptors
igb1: Using 4 RX queues 4 TX queues
igb1: Using MSI-X interrupts with 5 vectors
igb1: Ethernet address: 00:8c:fa:f0:9b:09
pcib10: <ACPI PCI-PCI bridge> irq 19 at device 28.7 numa-domain 0 on pci1
pci10: <ACPI PCI bus> numa-domain 0 on pcib10
pcib11: <ACPI PCI-PCI bridge> at device 0.0 numa-domain 0 on pci10
pci11: <ACPI PCI bus> numa-domain 0 on pcib11
vgapci0: <VGA-compatible display> port 0xb000-0xb07f mem 0xfa000000-0xfaffffff,0xfb000000-0xfb01ffff irq 16 at device 0.0 numa-domain 0 on pci11
vgapci0: Boot video device
ehci1: <Intel Wellsburg USB 2.0 controller> mem 0xfb601000-0xfb6013ff irq 18 at device 29.0 numa-domain 0 on pci1
usbus2: EHCI version 1.0
usbus2 numa-domain 0 on ehci1
usbus2: 480Mbps High Speed USB v2.0
isab0: <PCI-ISA bridge> at device 31.0 numa-domain 0 on pci1
isa0: <ISA bus> numa-domain 0 on isab0
ahci1: <Intel Wellsburg AHCI SATA controller> port 0xe070-0xe077,0xe060-0xe063,0xe050-0xe057,0xe040-0xe043,0xe000-0xe01f mem 0xfb600000-0xfb6007ff irq 16 at device 31.2 numa-domain 0 on pci1
ahci1: AHCI v1.30 with 6 6Gbps ports, Port Multiplier not supported
ahcich4: <AHCI channel> at channel 0 on ahci1
ahcich5: <AHCI channel> at channel 1 on ahci1
ahcich6: <AHCI channel> at channel 2 on ahci1
ahcich7: <AHCI channel> at channel 3 on ahci1
ahcich8: <AHCI channel> at channel 4 on ahci1
ahcich9: <AHCI channel> at channel 5 on ahci1
ahciem1: <AHCI enclosure management bridge> on ahci1
acpi_button0: <Power Button> on acpi0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
ipmi0: <IPMI System Interface> port 0xca2,0xca3 on acpi0
ipmi0: KCS mode found at io 0xca2 on acpi
ichwd0: <Intel Wellsburg watchdog timer> on isa0
ichwd0: ICH WDT present but disabled in BIOS or hardware
device_attach: ichwd0 attach returned 6
ichwd0: <Intel Wellsburg watchdog timer> at port 0x430-0x437,0x460-0x47f on isa0
ichwd0: ICH WDT present but disabled in BIOS or hardware
device_attach: ichwd0 attach returned 6
orm0: <ISA Option ROM> at iomem 0xc0000-0xcffff pnpid ORM0000 on isa0
coretemp0: <CPU On-Die Thermal Sensors> on cpu0
est0: <Enhanced SpeedStep Frequency Control> on cpu0
Timecounter "TSC" frequency 1596303615 Hz quality 1000
Timecounters tick every 1.000 msec
ZFS filesystem version: 5
ZFS storage pool version: features support (5000)
ipfw2 (+ipv6) initialized, divert enabled, nat enabled, default to accept, logging disabled
ugen0.1: <Intel XHCI root HUB> at usbus0
ugen2.1: <Intel EHCI root HUB> at usbus2
ugen1.1: <Intel EHCI root HUB> at usbus1
uhub0 numa-domain 0 on usbus0
uhub0: <Intel XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0
uhub1 numa-domain 0 on usbus1
uhub1: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
uhub2 numa-domain 0 on usbus2
uhub2: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus2
nvd0: <MS1PC5ED3ORA3.2T> NVMe namespace
nvme0: ASYNC EVENT REQUEST (0c) sqid:0 cid:14 nsid:0 cdw10:00000000 cdw11:00000000
nvd0: 3052360MB (6251233968 512 byte sectors)
nvme0: ASYNC LIMIT EXCEEDED (01/05) sqid:0 cid:14 cdw0:0
nvme0: ASYNC EVENT REQUEST (0c) sqid:0 cid:13 nsid:0 cdw10:00000000 cdw11:00000000
nvme0: ASYNC LIMIT EXCEEDED (01/05) sqid:0 cid:13 cdw0:0
nvme0: ASYNC EVENT REQUEST (0c) sqid:0 cid:12 nsid:0 cdw10:00000000 cdw11:00000000
nvme0: ASYNC LIMIT EXCEEDED (01/05) sqid:0 cid:12 cdw0:0
nvme0: ASYNC EVENT REQUEST (0c) sqid:0 cid:11 nsid:0 cdw10:00000000 cdw11:00000000
nvme0: ASYNC LIMIT EXCEEDED (01/05) sqid:0 cid:11 cdw0:0
nvme0: ASYNC EVENT REQUEST (0c) sqid:0 cid:10 nsid:0 cdw10:00000000 cdw11:00000000
nvme0: ASYNC LIMIT EXCEEDED (01/05) sqid:0 cid:10 cdw0:0
nvme0: ASYNC EVENT REQUEST (0c) sqid:0 cid:9 nsid:0 cdw10:00000000 cdw11:00000000
nvme0: ASYNC LIMIT EXCEEDED (01/05) sqid:0 cid:9 cdw0:0
nvme0: ASYNC EVENT REQUEST (0c) sqid:0 cid:8 nsid:0 cdw10:00000000 cdw11:00000000
nvme0: ASYNC LIMIT EXCEEDED (01/05) sqid:0 cid:8 cdw0:0
nvme0: ASYNC EVENT REQUEST (0c) sqid:0 cid:7 nsid:0 cdw10:00000000 cdw11:00000000
nvme0: ASYNC LIMIT EXCEEDED (01/05) sqid:0 cid:7 cdw0:0
ipmi0: IPMI device rev. 1, firmware rev. 1.34, version 2.0, device support mask 0xbf
ipmi0: Number of channels 3
ipmi0: Attached watchdog
ipmi0: Establishing power cycle handler
ses0 at ahciem0 bus 0 scbus3 target 0 lun 0
ses0: <AHCI SGPIO Enclosure 2.00 0001> SEMB S-E-S 2.00 device
ses0: SEMB SES Device
ses1 at ahciem1 bus 0 scbus10 target 0 lun 0
ses1: <AHCI SGPIO Enclosure 2.00 0001> SEMB S-E-S 2.00 device
ses1: SEMB SES Device
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: <PNY 1TB SATA SSD V0414A0> ACS-3 ATA SATA 3.x device
ada0: Serial Number PNF16222322780900295
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
ada0: Command Queueing enabled
ada0: 953869MB (1953525168 512 byte sectors)
ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
ada1: <PNY 1TB SATA SSD V0414A0> ACS-3 ATA SATA 3.x device
ada1: Serial Number PNF16222322780900298
ada1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
ada1: Command Queueing enabled
ada1: 953869MB (1953525168 512 byte sectors)
ada2 at ahcich4 bus 0 scbus4 target 0 lun 0
ada2: <PNY 1TB SATA SSD V0414A0> ACS-3 ATA SATA 3.x device
ada2: Serial Number PNF16222322780900297
ada2: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
ada2: Command Queueing enabled
ada2: 953869MB (1953525168 512 byte sectors)
ses0: pass0,ada0 in 'Slot 00', SATA Slot: scbus0 target 0
ada3 at ahcich5 bus 0 scbus5 target 0 lun 0
ada3: <PNY 1TB SATA SSD V0414A0> ACS-3 ATA SATA 3.x device
ada3: Serial Number PNF16222322780900293
ada3: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
ada3: Command Queueing enabled
ada3: 953869MB (1953525168 512 byte sectors)
ses0: pass1,ada1 in 'Slot 01', SATA Slot: scbus1 target 0
ada4 at ahcich6 bus 0 scbus6 target 0 lun 0
ada4: <PNY 1TB SATA SSD V0414A0> ACS-3 ATA SATA 3.x device
ada4: Serial Number PNF16222322780900258
ada4: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
ada4: Command Queueing enabled
ada4: 953869MB (1953525168 512 byte sectors)
ses0: pass2,cd0 in 'Slot 02', SATA Slot: scbus2 target 0
ada5 at ahcich7 bus 0 scbus7 target 0 lun 0
ada5: <PNY 1TB SATA SSD V0414A0> ACS-3 ATA SATA 3.x device
ada5: Serial Number PNF16222322780900278
ada5: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
ada5: Command Queueing enabled
ada5: 953869MB (1953525168 512 byte sectors)
ada6 at ahcich8 bus 0 scbus8 target 0 lun 0
ada6: <PNY 1TB SATA SSD V0414A0> ACS-3 ATA SATA 3.x device
ada6: Serial Number PNF16222322780900218
ada6: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
ada6: Command Queueing enabled
ada6: 953869MB (1953525168 512 byte sectors)
ada7 at ahcich9 bus 0 scbus9 target 0 lun 0
ada7: <PNY CS900 1TB SSD CS9006B5> ACS-4 ATA SATA 3.x device
ada7: Serial Number PNY2152211227010111D
ada7: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada7: Command Queueing enabled
ada7: 953869MB (1953525168 512 byte sectors)
ses1: pass4,ada2 in 'Slot 00', SATA Slot: scbus4 target 0
ses1: pass5,ada3 in 'Slot 01', SATA Slot: scbus5 target 0
ses1: pass6,ada4 in 'Slot 02', SATA Slot: scbus6 target 0
ses1: pass7,ada5 in 'Slot 03', SATA Slot: scbus7 target 0
ses1: pass8,ada6 in 'Slot 04', SATA Slot: scbus8 target 0
ses1: pass9,ada7 in 'Slot 05', SATA Slot: scbus9 target 0
cd0 at ahcich2 bus 0 scbus2 target 0 lun 0
cd0: <PLDS DVD-RW DU8A6SH DL61> Removable CD-ROM SCSI device
cd0: Serial Number DX0F84995L1CB53007GF
cd0: 150.000MB/s transfers (SATA 1.x, UDMA6, ATAPI 12bytes, PIO 8192bytes)
cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed
mlx4_core0: <mlx4_core> mem 0xfb200000-0xfb2fffff,0x13fff000000-0x13fff7fffff irq 34 at device 0.0 numa-domain 0 on pci4
mlx4_core: Mellanox ConnectX core driver v3.7.1 (November 2021)
mlx4_core: Initializing mlx4_core
uhub1: 2 ports with 2 removable, self powered
uhub2: 2 ports with 2 removable, self powered
uhub0: 21 ports with 21 removable, self powered
mlx4_core0: Unable to determine PCI device chain minimum BW
mlx4_en mlx4_core0: Activating port:1
mlxen0: Ethernet address: 00:02:c9:52:90:f4
mlx4_en: mlx4_core0: Port 1: Using 6 TX rings
mlxen0: link state changed to DOWN
mlx4_en: mlx4_core0: Port 1: Using 4 RX rings
mlx4_en: mlxen0: Using 6 TX rings
mlx4_en: mlxen0: Using 4 RX rings
mlx4_en: mlxen0: Initializing port
Trying to mount root from zfs:boot-pool/ROOT/13.0-U5.3 []...
ugen1.2: <vendor 0x8087 product 0x800a> at usbus1
uhub3 numa-domain 0 on uhub1
uhub3: <vendor 0x8087 product 0x800a, class 9/0, rev 2.00/0.05, addr 2> on usbus1
ugen2.2: <vendor 0x8087 product 0x8002> at usbus2
uhub4 numa-domain 0 on uhub2
uhub4: <vendor 0x8087 product 0x8002, class 9/0, rev 2.00/0.05, addr 2> on usbus2
uhub3: 6 ports with 6 removable, self powered
uhub4: 8 ports with 8 removable, self powered
usb_msc_auto_quirk: UQ_MSC_NO_GETMAXLUN set for USB mass storage device USB SanDisk 3.2Gen1 (0x0781:0x5583)
usb_msc_auto_quirk: UQ_MSC_NO_PREVENT_ALLOW set for USB mass storage device USB SanDisk 3.2Gen1 (0x0781:0x5583)
usb_msc_auto_quirk: UQ_MSC_NO_SYNC_CACHE set for USB mass storage device USB SanDisk 3.2Gen1 (0x0781:0x5583)
ugen0.2: <USB SanDisk 3.2Gen1> at usbus0
umass0 numa-domain 0 on uhub0
umass0: <USB SanDisk 3.2Gen1, class 0/0, rev 2.10/1.00, addr 1> on usbus0
umass0:  SCSI over Bulk-Only; quirks = 0xc100
umass0:11:0: Attached to scbus11
Root mount waiting for: CAM
da0 at umass-sim0 bus 0 scbus11 target 0 lun 0
da0: <USB SanDisk 3.2Gen1 1.00> Removable Direct Access SPC-4 SCSI device
da0: Serial Number 0401465b905ca3c566391704ebf508431d610cecf1f4085246173eeb1e47
da0: 40.000MB/s transfers
da0: 29358MB (60125184 512 byte sectors)
da0: quirks=0x2<NO_6_BYTE>
ichsmb0: <Intel Wellsburg SMBus controller> port 0x580-0x59f mem 0x13ffff31000-0x13ffff310ff irq 18 at device 31.3 numa-domain 0 on pci1
smbus0: <System Management Bus> numa-domain 0 on ichsmb0
lo0: link state changed to UP
igb0: link state changed to UP
debugnet_any_ifnet_update: Bad dn_init result from igb0 (ifp 0xfffff800051b9800), ignoring.
igb1: link state changed to UP
debugnet_any_ifnet_update: Bad dn_init result from igb1 (ifp 0xfffff800051b8800), ignoring.
GEOM_ELI: Device nvd0p1.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI:     Crypto: accelerated software
GEOM_MIRROR: Device mirror/swap0 launched (2/2).
GEOM_MIRROR: Device mirror/swap1 launched (2/2).
GEOM_MIRROR: Device mirror/swap2 launched (2/2).
GEOM_MIRROR: Device mirror/swap3 launched (2/2).
GEOM_ELI: Device mirror/swap0.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI:     Crypto: accelerated software
GEOM_ELI: Device mirror/swap1.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI:     Crypto: accelerated software
GEOM_ELI: Device mirror/swap2.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI:     Crypto: accelerated software
GEOM_ELI: Device mirror/swap3.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI:     Crypto: accelerated software
GEOM_ELI: Device mirror/swap3.eli destroyed.
GEOM_MIRROR: Device swap3: provider destroyed.
GEOM_MIRROR: Device swap3 destroyed.
GEOM_ELI: Device mirror/swap2.eli destroyed.
GEOM_MIRROR: Device swap2: provider destroyed.
GEOM_MIRROR: Device swap2 destroyed.
GEOM_ELI: Device mirror/swap1.eli destroyed.
GEOM_MIRROR: Device swap1: provider destroyed.
GEOM_MIRROR: Device swap1 destroyed.
GEOM_ELI: Device mirror/swap0.eli destroyed.
GEOM_MIRROR: Device swap0: provider destroyed.
GEOM_MIRROR: Device swap0 destroyed.
GEOM_ELI: Device nvd0p1.eli destroyed.
GEOM_MIRROR: Device mirror/swap0 launched (2/2).
GEOM_MIRROR: Device mirror/swap1 launched (2/2).
GEOM_MIRROR: Device mirror/swap2 launched (2/2).
GEOM_MIRROR: Device mirror/swap3 launched (2/2).
GEOM_ELI: Device mirror/swap0.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI:     Crypto: accelerated software
GEOM_ELI: Device mirror/swap1.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI:     Crypto: accelerated software
GEOM_ELI: Device mirror/swap2.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI:     Crypto: accelerated software
GEOM_ELI: Device mirror/swap3.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI:     Crypto: accelerated software
CPU: Intel(R) Xeon(R) CPU E5-2603 v3 @ 1.60GHz (1596.30-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x306f2  Family=0x6  Model=0x3f  Stepping=2
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x7ffefbff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x21<LAHF,ABM>
  Structured Extended Features=0x37ab<FSGSBASE,TSCADJ,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,PQM,NFPUSG>
  Structured Extended Features3=0x9c000400<MD_CLEAR,IBPB,STIBP,L1DFL,SSBD>
  XSAVE Features=0x1<XSAVEOPT>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
  TSC: P-state invariant, performance statistics
igb0: link state changed to DOWN
mlxen0: link state changed to UP
hwpmc: SOFT/16/64/0x67<INT,USR,SYS,REA,WRI> TSC/1/64/0x20<REA> IAP/8/48/0x3ff<INT,USR,SYS,EDG,THR,REA,WRI,INV,QUA,PRC> IAF/3/48/0x67<INT,USR,SYS,REA,WRI>
root@truenas[~]#
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Where you say RAID10, in ZFS that means building a pool from multiple mirror vdevs (lots of 2-disk (or wider) mirrors striped together). So we mean the same thing - just not the RAIDZ1 you have currently, which is approximately the ZFS equivalent of RAID5 and which sucks for iSCSI and IOPS.
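
To put rough numbers on the mirrors-vs-RAIDZ1 trade-off, here is a minimal sketch. It assumes 8 equal drives of 953869MB (the size dmesg reports for each SSD), assumes the current pool is a single 8-wide RAIDZ1 vdev, and ignores ZFS metadata, padding and reserved space, so real usable figures will be lower. The function names are made up for illustration. The "random IOPS scale roughly with the number of vdevs" line is only the usual rule of thumb, not a measurement.

```python
# Rough layout comparison for a pool of equal-sized drives.
# Assumptions (not from any tool output): no ZFS overhead is modelled,
# and vdev count is used as a crude proxy for random IOPS.

DRIVE_MB = 953869  # per-drive size as reported by dmesg for each SSD
DRIVES = 8


def striped_mirrors(drives, drive_mb, width=2):
    """Pool built from drives // width mirror vdevs striped together."""
    vdevs = drives // width
    return {
        "layout": f"{vdevs} x {width}-way mirror",
        "vdevs": vdevs,
        "usable_mb": vdevs * drive_mb,  # one drive's worth of space per mirror
    }


def raidz1(drives, drive_mb):
    """Single RAIDZ1 vdev: one drive's worth of space goes to parity."""
    return {
        "layout": f"1 x {drives}-wide RAIDZ1",
        "vdevs": 1,
        "usable_mb": (drives - 1) * drive_mb,
    }


for pool in (striped_mirrors(DRIVES, DRIVE_MB), raidz1(DRIVES, DRIVE_MB)):
    print(f"{pool['layout']}: ~{pool['usable_mb']} MB usable, "
          f"{pool['vdevs']} vdev(s) to spread random I/O across")
```

With these inputs the mirrors come out at roughly 3815476 MB across 4 vdevs and the RAIDZ1 at roughly 6677083 MB on a single vdev. So you give up a lot of space for the mirrors, and what you buy is more vdevs for the iSCSI workload to spread over.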

With the ProLiant, be very careful of its disk controller. These are usually RAID controllers, which as far as ZFS is concerned are probably e-waste. The good news is that you can probably rip one out (if found) and replace it with an LSI 93xx HBA (non-MegaRAID) flashed with the correct firmware.

Given that each disk on the Lenovo is generating checksum errors, there is probably a cabling issue or a backplane issue. Can you replace either? I have no idea how things are attached - it might be that a cable just needs reattaching / replacing, or it might be something more serious.
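
If you want to keep an eye on which devices are accumulating errors while you chase the cabling, something like the sketch below could pull the non-zero rows out of saved `zpool status` text. It is only a sketch: the sample text and the `gptid/...` names in it are hypothetical, not taken from your pool, and it assumes the usual NAME / STATE / READ / WRITE / CKSUM column layout. Counters are kept as strings because large values may be shown abbreviated.

```python
import re

# One device row of the `zpool status` config table:
# NAME  STATE  READ  WRITE  CKSUM
ROW = re.compile(
    r"^\s*(\S+)\s+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)"
    r"\s+(\S+)\s+(\S+)\s+(\S+)"
)


def devices_with_errors(status_text):
    """Return (name, state, read, write, cksum) for rows with any non-zero counter."""
    bad = []
    for line in status_text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header, blank line, or non-table text
        name, state, read, write, cksum = match.groups()
        if (read, write, cksum) != ("0", "0", "0"):
            bad.append((name, state, read, write, cksum))
    return bad


# Hypothetical sample, made up for illustration only.
SAMPLE = """\
        NAME            STATE     READ WRITE CKSUM
        SSD             DEGRADED     0     0     0
          raidz1-0      DEGRADED     0     0     0
            gptid/aaaa  ONLINE       0     0    12
            gptid/bbbb  DEGRADED     0     0   340
            gptid/cccc  ONLINE       0     0     0
"""

for row in devices_with_errors(SAMPLE):
    print(row)
```

If every disk behind the same controller or backplane shows up in that list, that points at the shared path (cable, backplane, controller) rather than at the individual drives.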

Incidentally, the HP ProLiant SE326M1 is a custom build - but according to one source it is in fact a ProLiant DL 180 with iLO2.

What is clear is that these machines are elderly (trying to be kind here). They are probably using gobs of power and, from the looks of things, could do with a refresh. The SSDs can be re-used, so the money isn't wasted (despite my reservations as to the drive model).

My suggestion is to build 2 new (ish) servers from Supermicro gear with LSI 93xx HBAs (you do not need, or want, the latest and greatest). My opinion is that you will always need two separate servers so you can swing the guests' storage from one to the other, so that maintenance can be done on the iSCSI host without downing the VMs. Then junk these existing servers - or use them in a lab / for testing purposes. I would build 1 mirrored (RAID10) pool of SSDs on each, along with a pool (again mirrored) of larger (but fewer - probably 2) HDDs. Then, in steady state, each server hosts half the VMs (but has enough capacity for all of them). Replicate the zvols between the SSD pools for short-duration snapshots and also replicate to the HDD pools for long-duration snapshots. Make sure you are using decent SLOG drives on the SSD pools (SLOG devices have very specific hardware requirements); without them you are either leaving a lot of performance on the table or running in a data-unsafe mode. Also make sure that you have a separate NIC for the iSCSI traffic, and one for everything else.

Of course, the client needs convincing as well - which can be the challenge.

Meantime - please make sure you have good backups. Also in my sig there is a link to Path to Success for Block Storage - please read and inwardly digest. I did not write it - it was written by our resident grinch - but he's been around so long he has probably:
1. Seen it all before - and appears to be willing to share his elderly^H^H^H^H^H^H^Hancient wisdom with us youngsters
2. Fossilized (sorry couldn't resist)

I have used his path to success for a couple of very happy clients (and my own ESX setup) - all of which have similar requirements and continue to work well.
 