issue with zpool mirrored vdev

boomboom69

Cadet
Joined
Jun 5, 2017
Messages
4
I've been running the latest TrueNAS SCALE 22.12.3.3 for a few weeks now and noticed my VMs weren't working and the UI was slow. When investigating, the alerts and UI said my ssd pool was degraded and one volume in my raidZ mirrored array was removed due to I/O errors. I was trying to poke around to figure out which one it was, and because of issues with the UI loading screens I decided to reboot the system. Everything came back up except the ssd pool. Now it's saying my vdev mirror-5 is unavailable and it can't import the pool. How do I force the drives back online and get the pool back up? I'm also trying to figure out which drive it is on the system, and which one physically in the box, so I can replace it.

Code:
root@truenas[~]# zpool status
  pool: boot-pool
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:00:19 with 0 errors on Sun Sep 10 03:45:20 2023
config:

        NAME        STATE     READ WRITE CKSUM
        boot-pool   ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sdh3    ONLINE       0     0     0
            sdg3    ONLINE       0     0     0

errors: No known data errors

  pool: hdd
 state: ONLINE
  scan: scrub repaired 0B in 01:19:53 with 0 errors on Sun Aug 27 01:19:54 2023
config:

        NAME                                      STATE     READ WRITE CKSUM
        hdd                                       ONLINE       0     0     0
          raidz2-0                                ONLINE       0     0     0
            9fcab424-c01e-442c-bff3-b802d2e2cc55  ONLINE       0     0     0
            50167970-656b-46ad-93f4-0784bf8695cd  ONLINE       0     0     0
            65d30cb8-e086-44d5-98ee-46b9693ccde8  ONLINE       0     0     0
            e82ddad9-19a9-4e71-a8bd-4af904552108  ONLINE       0     0     0
            b73a3020-cc9b-4803-9186-0ae76659378d  ONLINE       0     0     0
            204197ad-455c-4614-a79d-eaea84b6d184  ONLINE       0     0     0
          raidz2-1                                ONLINE       0     0     0
            9c0ef1c6-e0cb-427e-9a4b-973dfb099709  ONLINE       0     0     0
            5af9785c-15af-4be6-8977-b4b985a0b6e4  ONLINE       0     0     0
            c11470c0-92c9-45e9-ae69-d5a38a7009e9  ONLINE       0     0     0
            26dcc61f-3526-490b-b841-8f39a8c8e45a  ONLINE       0     0     0
            7366b079-3174-4678-85a6-9c4cd4c13862  ONLINE       0     0     0
            ec209d32-e39e-4430-ae0b-f3328134ae4d  ONLINE       0     0     0
        logs
          mirror-2                                ONLINE       0     0     0
            818f7f28-dab1-469a-8cc9-871f1f7b088d  ONLINE       0     0     0
            10be3f7a-e6d4-4f70-b839-c9ab095254fb  ONLINE       0     0     0
        cache
          ed0aff50-64a5-4f9c-a945-a93861eb3bb5    ONLINE       0     0     0
          db6d2c6e-0931-4ca0-9ffc-e207eef6e4fe    ONLINE       0     0     0

errors: No known data errors


Code:
root@truenas[~]# zpool import
   pool: ssd
     id: 3134556081092737569
  state: UNAVAIL
status: One or more devices contains corrupted data.
 action: The pool cannot be imported due to damaged devices or data.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-5E
 config:

        ssd                                       UNAVAIL  insufficient replicas
          mirror-0                                ONLINE
            92304142-710c-4661-9e02-6115f1ffb9fc  ONLINE
            5b3685de-aac8-43c4-b53d-eb38d6067488  ONLINE
          mirror-1                                ONLINE
            97690b6a-26ff-425d-a167-f7b2e0a54438  ONLINE
            e02af788-f585-40a1-936b-e554c32be84c  ONLINE
          mirror-2                                ONLINE
            97d566b7-bfab-438f-b7fa-d59fefb17844  ONLINE
            1273de4a-e522-4a32-a4c4-cd9daa8a5194  ONLINE
          mirror-3                                ONLINE
            d1d0b54c-dde9-4dd7-9f01-8f913fc2bcda  ONLINE
            161f2f75-548e-4b9f-99ee-f5f2c1f11d98  ONLINE
          mirror-4                                ONLINE
            fb7c8aac-de5c-49ea-8652-d8c6e5a0ebc0  ONLINE
            fbdf8339-25a2-43af-b7be-118f0e5c5369  ONLINE
          mirror-5                                UNAVAIL  insufficient replicas
            5a2c8a6d-20a0-4b21-9542-e76df6dc4f5a  UNAVAIL
            eea2bc41-ab3d-4e46-a83c-479009d7bfaa  UNAVAIL
          mirror-6                                ONLINE
            fd3e06cd-a936-4d1f-be8e-6e761deaf214  ONLINE
            db800d16-e2ff-4ec4-8b23-c49b74ac7390  ONLINE
          mirror-7                                ONLINE
            95f0f9cf-1e00-456c-9200-d264796c5ab8  ONLINE
            8d413c23-b50d-4b6a-8b46-675b1e634970  ONLINE
          mirror-8                                ONLINE
            37d9cffe-e800-4456-85d2-cd8624fb7b59  ONLINE
            956c4476-fdd0-4c4b-b8d7-c2b6591a60e5  ONLINE
          mirror-9                                ONLINE
            63549055-5935-4977-84bd-60d55e810fa7  ONLINE
            7b262821-46df-40db-aeef-ce18e60c6a60  ONLINE


Server specs:
Supermicro SuperStorage 6048R-E1CR36N W/ X10DRi-T4+
Processor: 2x Xeon E5-2697A v4 2.6GHz 16-Core Processors - $500.00
Memory: 256GB (8x 32GB) DDR4 Registered Memory - $288.00
Storage Controller: 12Gbps IT Mode PCIe Storage Controller (Flashed to IT Mode; No RAID, Pass-Thru Only) - $59.00

SSD pool drives:
(image attachment)

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
ZFS is saying that 2 of your SSDs, both in "mirror-5", are gone. Unless you can get ONE of them back, your "ssd" pool is gone. ZFS redundancy is per-vDev; the loss of any data vDev is fatal to the pool. (Cache, Spare & Log vDevs can be lost without pool loss...)

This can sometimes happen when using identical models of storage device. They have the same failure rate, and in the case of a Mirror vDev, get the same number of writes. (Though reads can be round-robined between the SSDs.)

You may be able to find which devices are which by using:
Code:
ls -l /dev/disk/by-partuuid/XYZ

where XYZ is replaced by the partition UUID as listed by zpool status or zpool import.
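
For example, with one of the "mirror-5" UUIDs from your zpool import output (just a sketch; if the disk has dropped off the bus entirely the symlink may not exist, and the sdX name below is hypothetical):
Code:
# map a pool member's partition UUID to a kernel device name
ls -l /dev/disk/by-partuuid/5a2c8a6d-20a0-4b21-9542-e76df6dc4f5a
# if it resolves to something like ../../sdX2, read the serial off the whole disk
smartctl -i /dev/sdX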

Some people go to great lengths when setting up their TrueNAS, matching the serial numbers of the drives and labeling them in the GUI and on the front of the drives. It might take an hour or more, but it will save time when repairing or replacing a faulty drive.


By the way, there is no such thing as a "raidZ mirrored array". That confused me for a moment, but fortunately you had the various outputs to clear it up. RAID-Z is a parity-based stripe, so you probably meant "striped mirrored array". Or just "mirrored array".

boomboom69

Cadet
Joined
Jun 5, 2017
Messages
4
Thanks for the clarification on that. I remember when ZFS first came out and I was researching it, people were calling it raidz10, which is the same as normal RAID 10 but with the ZFS filesystem benefits. It's been a while since I was able to get my own system, so I'm fairly new. I understand most of the high-level stuff like pools, vdevs, and the other raidz levels.

I tried the `ls -l` command, but the disks are not showing up. I also thought it was weird that the UI shows only 18 disks available instead of 20. I have been getting these alerts on a couple of drives for a while, but SMART checks have been passing.

Device: /dev/sdt, SMART Failure: WARNING: ascq=0x5.

2023-09-12 17:54:56 (America/Chicago)

I'm not sure how to track down which disk that is, physically or pool-wise. When I was originally building the system, on quite a few old SSDs I got from a previous employer that sold everything off, I literally sat in front of the monitor and pulled each drive to see which one came up in the console messages to find the bad drive that needed to be replaced. After several pool rebuilds I finally went out and bought 20 of these drives, for additional storage and to have newer SSDs that hopefully wouldn't fail as much but were also cheap to get.

The mirrored pool is probably overkill performance-wise. My other pool is HDD-based, with Zeus SSD drives for log and cache that I scavenged from an old TrueNAS Z20 backup system I got from a previous employer. If I have to rebuild the pool, what would be the recommended configuration for 20 1TB SATA 3 SSD drives?

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
people were calling it raidz10
Are those the same people who think "trueness" is a word worthy of being an autocorrect output? Because I have some choice words for them.

I'm not sure how to track down which disk that is, physically or pool-wise.
Identifying the disk is simple at that point: use the serial number as reported by SMART and look through the disks' labels for the one that matches.
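
For example, for the /dev/sdt in your alert (assuming the disk is still attached and responding):
Code:
# print identity info, including the serial number, for the alerted disk
smartctl -i /dev/sdt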
 

boomboom69

Cadet
Joined
Jun 5, 2017
Messages
4
Sounds like it might be very beneficial to spend a weekend pulling all the drives and making a spreadsheet of locations and serial numbers.
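
For the OS side of that spreadsheet, I'm guessing something like this would dump everything in one pass (assuming lsblk on SCALE exposes the SERIAL column):
Code:
# list each whole disk with its model and serial number
lsblk -d -o NAME,MODEL,SERIAL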
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
If you're lucky, yours have labels on the face opposite the connectors.
 

boomboom69

Cadet
Joined
Jun 5, 2017
Messages
4
Unfortunately not. Also, 90% of the labels are on the side facing the bottom of the drive carrier, so I have to completely remove them to get the info.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
On the subject of the pool rebuild, having more than one disk's worth of redundancy might be helpful. As you found out, a 2-way Mirror vDev can be lost when both disks fail. But if you have a 3-way Mirror, all 3 disks have to fail before data loss. Or RAID-Z2, like you have in your "hdd" pool.

Next, for RAID-Zx stripe width, more than about 10 to 12 disks is not recommended. You have 6 in your "hdd" pool, so that is fine. And with 20 SSDs, you could do 2 vDevs, each with 10 SSDs in a RAID-Z2, as sketched below.
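
From the shell, that layout would look something like this (purely illustrative; on TrueNAS you would build the pool through the GUI, and the sdX names are hypothetical):
Code:
# 2 x 10-wide RAID-Z2: each vDev survives the loss of any 2 of its 10 SSDs
zpool create ssd \
    raidz2 sda sdb sdc sdd sde sdf sdg sdh sdi sdj \
    raidz2 sdk sdl sdm sdn sdo sdp sdq sdr sds sdt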

Before you go and rebuild your pool, find the 2 failing disks. Those no longer show up in TrueNAS, so their serial numbers would not be in the GUI. Once you have them identified, check the power and data cables. On rare occasions a server gets built with the disks in order, aka right next to each other, so maybe some common cable came loose. Remember, you only need to get ONE of them back to restore your "ssd" pool to functionality.
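
If reseating brings even ONE of them back, checking and importing from the shell would look something like this (or use the pool import option in the GUI):
Code:
# rescan for importable pools and see whether a mirror-5 member reappeared
zpool import
# if mirror-5 now shows at least one ONLINE member, import the pool
zpool import ssd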
 