Weird pool topology after power failure.

Joined
Feb 21, 2022
Messages
5
Hi guys. This is my first post to this forum, so please be gentle with me :smile:.
I use TrueNAS Core 12.0-U8 in my home lab. I have never had a single issue with my NAS. However, we had a couple of blackouts recently, and I wasn't at home to react properly when the UPS finally gave up.

Long story short: one of my disks did not come up properly and TrueNAS used the spare. I received an alert that my pool state is Degraded and:

Code:
The following devices are not healthy:
Disk ATA ST5000LM000-2AN1 WCJ3EPBQ is UNAVAIL
Disk ATA ST5000LM000-2AN1 WCJ3RL7F is FAULTED


After a few days of resilvering, my pool came up as healthy. However, the topology looks strange and I am not sure how to make sense of it...

I've got 4 disks in the pool:
  • da0p2
  • da1p2
  • da2p2
  • da3p2
All happy and error free. All online.

But my pool status looks somewhat weird...
status.png

Can you guys advise on why there are two SPARE disks inside RAIDZ1? Why does it say da0p2 is both ONLINE and UNAVAILABLE?
Is this something I should be worried about?
What can/should I do to bring it back to my regular topology (3 disks in RAIDZ1 and one spare)?
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Can you guys advise on why there are two SPARE disks inside RAIDZ1? Why does it say da0p2 is both ONLINE and UNAVAILABLE?
Is this something I should be worried about?
What can/should I do to bring it back to my regular topology (3 disks in RAIDZ1 and one spare)?
It looks like da0 went offline at some point and the spare kicked in to take over... but later, da0 came back and is now (although maybe only temporarily) fine.

You could either detach da0, which leaves the spare in its place as a permanent member, or detach the spare to return it to being a spare. (I'm not sure you can do that in the GUI, so from the command line it would be zpool detach cryptohell /dev/da0p2, or da3p2 in place of da0p2 if you want to send the spare back.)
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Worst case it will be out of the pool and you can add it back as a spare again, but as I understand it, it should just go back to being a spare directly with that command.

See topic 4.4.6 here: https://illumos.org/books/zfs-admin/gavwn.html

Seems to confirm what I'm saying toward the end of that section.
 
Joined
Feb 21, 2022
Messages
5
Hmmm... a little confusion:
Code:
% sudo zpool detach cryptohell /dev/da0p2

cannot detach /dev/da0p2: no such device in pool


Code:
zpool status cryptohell

  pool: cryptohell
 state: ONLINE
  scan: scrub in progress since Tue Feb 22 07:56:43 2022
    4.08T scanned at 449M/s, 2.35T issued at 259M/s, 10.4T total
    0B repaired, 22.65% done, 09:01:44 to go
config:

    NAME                                                  STATE     READ WRITE CKSUM
    cryptohell                                            ONLINE       0     0     0
      raidz1-0                                            ONLINE       0     0     0
        gptid/6b4914f3-e1b3-11ea-9b92-002590c43598.eli    ONLINE       0     0     0
        spare-1                                           ONLINE       0     0     1
          gptid/c57f6251-ef2c-11ea-88f7-002590c43598.eli  ONLINE       0     0     0
          gptid/b5da369b-7737-11ec-b453-002590c43598.eli  ONLINE       0     0     0
        gptid/750c48ba-e1b3-11ea-9b92-002590c43598.eli    ONLINE       0     0     0
    logs    
      gptid/c0717c3a-4ef0-11ec-8314-002590c43598.eli      ONLINE       0     0     0
    spares
      gptid/b5da369b-7737-11ec-b453-002590c43598.eli      INUSE     currently in use

errors: No known data errors


Is there a way to map GPTIDs to short dev names?
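The detach fails because the pool members are registered under their gptid labels, as the status output shows, not under da0p2. A minimal sketch of picking out the in-use spare by the name ZFS itself reports; the helper name inuse_spares is made up for illustration, and it assumes the spares section is laid out like the output above:

```shell
# Hypothetical helper: reads `zpool status` output on stdin and prints the
# name of every hot spare marked INUSE in the "spares" section.
inuse_spares() {
    awk '$1 == "spares" { in_spares = 1; next }
         in_spares && $2 == "INUSE" { print $1 }'
}

# Usage on the NAS (not run here); check the printed name before detaching:
#   spare=$(zpool status cryptohell | inuse_spares)
#   echo "$spare"
#   zpool detach cryptohell "$spare"
```

With the status above this should print gptid/b5da369b-7737-11ec-b453-002590c43598.eli. Detaching that device should send it back to the spare list, while detaching the other member of spare-1 would instead keep the spare in the raidz1 permanently.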
 
Joined
Jun 2, 2019
Messages
591
@plague_doctor

For the future, make sure you enable UPS monitoring so that your NAS can shut down gracefully.
 
Joined
Feb 21, 2022
Messages
5
@elvisimprsntr To be honest, it is configured and (to my surprise) it worked every single time before. The last blackout was different somehow... They switched the electricity on and off a couple of times, and I think this did something weird to the NUT logic...
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
glabel status
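On FreeBSD-based TrueNAS Core, glabel status prints one row per label with the label name, a status column, and the underlying provider (for example da0p2). A rough sketch of looking up a single label from the pool output; the helper name gptid_to_dev is made up for illustration, and stripping the .eli suffix assumes it only marks the GELI layer on top of the labeled partition:

```shell
# Hypothetical helper: reads `glabel status` output on stdin and prints the
# provider (third column) for the gptid label given as the first argument.
gptid_to_dev() {
    label=${1%.eli}          # drop a trailing ".eli" if present
    label=${label#gptid/}    # drop a leading "gptid/" if present
    awk -v want="gptid/$label" '$1 == want { print $3 }'
}

# Usage on the NAS (not run here):
#   glabel status | gptid_to_dev gptid/c57f6251-ef2c-11ea-88f7-002590c43598.eli
```

Running it once for each gptid under spare-1 shows which short device name is the original disk and which is the spare.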
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
ST5000LM000

Just a note, you're using SMR (shingled) drives, which are known to have varying degrees of bad behaviour under ZFS.

 