Drive Name Changes

Joined
Apr 3, 2015
Messages
8
Recently I had two degraded drives on different arrays. I ordered the replacements. I'm going to list the steps I took for completeness.
First off I'm running TrueNAS-12.0-U6.

The first array is simple: five 4 TB drives in raidz2 with cache and log SSDs. One of the 4 TB data drives went bad. I offlined the failing drive, pulled it, and replaced it. After the resilver it was fine.

The second array was more complicated: four mirrored vdevs of various sizes strung together. It was basically a bunch of old drives I didn't want to go to waste. It looked like this:
Code:
        MAGVM01           ONLINE
          mirror-0        ONLINE
            da18p2.eli    ONLINE
            da18p2.eli    ONLINE
          mirror-1        ONLINE
            da1p2.eli     ONLINE
            da0p2.eli     ONLINE
          mirror-2        DEGRADED 
            spare-0       DEGRADED 
              da13p2.eli  DEGRADED
              da21p2.eli  ONLINE
            da12p2.eli    ONLINE  
          mirror-3        ONLINE
            da2p2.eli     ONLINE
            da3p2.eli     ONLINE


The setup in mirror-2 is a little strange. I wanted all mirrored vdevs, but I didn't have a matching drive, so I strung two smaller drives together to make spare-0. I figured spare-0 doesn't matter much because it is only one side of a mirror. I offlined da13 and da21; together they are a raid 1 vdev making up the other half of the mirror. Even if I completely messed up spare-0, the other part of the mirror would be fine. I replaced da13, then I needed to reboot because it is encrypted; you can't online encrypted drives. When TrueNAS came back online, it had renamed several of the drives. The array now read like this from the zpool command:

Code:
        MAGVM01           UNAVAIL  insufficient replicas
          mirror-0        ONLINE
            da18p2.eli    ONLINE
            da18p2.eli    ONLINE
          mirror-1        ONLINE
            da1p2.eli     ONLINE
            da0p2.eli     ONLINE
          mirror-2        UNAVAIL  insufficient replicas
            spare-0       OFFLINE  all children offline
              da13p2.eli  OFFLINE
              da21p2.eli  OFFLINE
            da12p2.eli    UNAVAIL  cannot open
          mirror-3        ONLINE
            da2p2.eli     ONLINE
            da3p2.eli     ONLINE


Through poking around I know that da13 is now da12. The drive names were reordered somehow. This caused TrueNAS to think the entire mirror was borked, taking the entire array offline. Once I saw this, I shut down and re-inserted the old degraded drive in the hope that things would just go back to the way they were. This did not work. TrueNAS is keeping the new naming order. I know for a fact that the original da12 is still there, but I don't know what its new name is.

So now that the pool is in the UNAVAIL state, is there any way to fix this? The original da12 is there, somewhere. Is there a way to find it and convince TrueNAS to use it in the correct spot? It seems to me that this should be possible. While I believe I have all the critical data backed up, I'd rather recover the array than rebuild it. So the questions I have are as follows:

How do I query a drive to see what cluster it thinks it should belong to?
How do I convince an UNAVAIL zpool to change one of the disk members once I find it?
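For the first question, one possible approach is to read the ZFS label stored on each disk. The sketch below is only an illustration and rests on assumptions not stated in the thread: that `zdb -l` is run against each candidate provider (for encrypted disks, the attached `.eli` device), and that its output has the usual `name:`, `pool_guid:` and `guid:` fields. The sample text and its values are made up; only the pool name comes from the post above.

```python
import re

# Illustrative sample only: the general shape of `zdb -l <device>` output.
# The numeric values are invented placeholders, not data from this thread.
SAMPLE_LABEL = """\
------------------------------------
LABEL 0
------------------------------------
    version: 5000
    name: 'MAGVM01'
    state: 0
    txg: 1234567
    pool_guid: 1111111111111111111
    top_guid: 2222222222222222222
    guid: 3333333333333333333
"""


def parse_label(text):
    """Return the pool name, pool_guid and device guid from the first label."""
    fields = {}
    for line in text.splitlines():
        # Match only the three keys of interest; top_guid is deliberately skipped.
        m = re.match(r"\s*(name|pool_guid|guid):\s*'?([^']+?)'?\s*$", line)
        if m and m.group(1) not in fields:
            # Keep the first occurrence, which describes this device itself.
            fields[m.group(1)] = m.group(2)
    return fields


if __name__ == "__main__":
    print(parse_label(SAMPLE_LABEL))
```

On a live system the text would come from the real command instead of the sample, and comparing the reported pool name across all disks should show which renamed device carries the label for the missing mirror member.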

Any help is appreciated.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Is this the verbatim output of zpool status commands on your system or did you copy & paste from the UI?
Because if this is from the command line, your pools should not look like that at all. The disks must be referenced by gptid in TrueNAS ...
 
Joined
Apr 3, 2015
Messages
8
This is verbatim from the command line. If the array was created in 11, the non-gptid naming carries over. As you replace disks in 12, the names get replaced with gptids. For example, this array:

Code:
        NAME                                                STATE     READ WRITE CKSUM
        media_share                                         ONLINE       0     0     0
          raidz2-0                                          ONLINE       0     0     0
            da8p2.eli                                       ONLINE       0     0     0
            gptid/76e6a470-3e9e-11ec-8458-000f5321ea98.eli  ONLINE       0     0     0
            da9p2.eli                                       ONLINE       0     0     0
            da10p2.eli                                      ONLINE       0     0     0
            da4p2.eli                                       ONLINE       0     0     0
        logs
          gptid/e87c8085-4f7d-11eb-b334-000f5321ea98.eli    ONLINE       0     0     0
        cache
          gptid/e852e7bb-4f7d-11eb-b334-000f5321ea98.eli    ONLINE       0     0     0
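A quick way to see which members are still referenced by device name is to scan the status text. This is a minimal sketch, assuming every leaf member ends in `.eli` as it does in the output above; the function name `device_name_members` is made up for illustration.

```python
# The media_share status output from this post, used as sample input.
STATUS = """\
        NAME                                                STATE     READ WRITE CKSUM
        media_share                                         ONLINE       0     0     0
          raidz2-0                                          ONLINE       0     0     0
            da8p2.eli                                       ONLINE       0     0     0
            gptid/76e6a470-3e9e-11ec-8458-000f5321ea98.eli  ONLINE       0     0     0
            da9p2.eli                                       ONLINE       0     0     0
            da10p2.eli                                      ONLINE       0     0     0
            da4p2.eli                                       ONLINE       0     0     0
        logs
          gptid/e87c8085-4f7d-11eb-b334-000f5321ea98.eli    ONLINE       0     0     0
        cache
          gptid/e852e7bb-4f7d-11eb-b334-000f5321ea98.eli    ONLINE       0     0     0
"""


def device_name_members(status_text):
    """Return leaf members referenced by device name rather than by gptid."""
    members = []
    for line in status_text.splitlines():
        parts = line.split()
        if not parts:
            continue
        name = parts[0]
        # Assumption: leaf members are geli providers, so they end in ".eli".
        if name.endswith(".eli") and not name.startswith("gptid/"):
            members.append(name)
    return members


if __name__ == "__main__":
    for member in device_name_members(STATUS):
        print(member)
```

Any member this prints is one whose name can shift when the disks are renumbered at boot.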
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
camcontrol devlist and diskinfo should help you to identify the devices. As for the device identifiers I am pretty sure that FreeNAS 11 used GPTID, too. When constructing a pool on the command line one must make sure to use these and never device names - precisely because they reorder.
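As a rough illustration of working with that output, the sketch below turns `camcontrol devlist` text into a map from device name to model string. The sample lines are invented placeholders in the usual `<vendor model revision> at scbusN target N lun N (passN,daN)` shape, and `parse_devlist` is a hypothetical helper, not part of any tool.

```python
import re

# Invented sample in the usual `camcontrol devlist` layout; not from this system.
SAMPLE_DEVLIST = """\
<EXAMPLE DISK-A 0001>              at scbus0 target 0 lun 0 (pass0,da0)
<EXAMPLE DISK-B 0001>              at scbus0 target 1 lun 0 (da1,pass1)
"""

LINE_RE = re.compile(r"<(.*?)>\s+at\s+\S+\s+target\s+\d+\s+lun\s+\S+\s+\((.*?)\)")


def parse_devlist(text):
    """Map each disk device name (e.g. da0) to the model string reported for it."""
    devices = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        model = m.group(1).strip()
        # The parenthesised list holds the passthrough device and the disk device,
        # in either order; keep only the disk device.
        for dev in m.group(2).split(","):
            if not dev.startswith("pass"):
                devices[dev] = model
    return devices


if __name__ == "__main__":
    for dev, model in sorted(parse_devlist(SAMPLE_DEVLIST).items()):
        print(dev, model)
```

Model strings alone will not tell identical drives apart, so a serial number (for example the ident reported by `diskinfo -v`) would still be needed to pin down one specific disk.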
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Joined
Apr 3, 2015
Messages
8
The array was created in the UI,
/me looks at user creation date,
Maybe this happened in 10? Anyway, once I figure out what the disk was renamed to, how do I modify the array since it is UNAVAIL?

-Glenn
 
Joined
Apr 3, 2015
Messages
8
OK, so now this has happened with the working array above. The drives are all there, but the non-gptid named drives have shifted and now the array will not start. How do I tell zpool to use the correct drives? This array was created through the UI. I do not know what version it was created under. But this has now caused two arrays to go offline. I wrote the first array off because it was used for backups. Now that the backups are gone, I need to restore this array.
 
Joined
Apr 3, 2015
Messages
8
To be clear, I solved the issue by rebooting. The drives magically renamed themselves again. Since it was raidz2, I have just enough to run the array degraded, and I am replacing each disk so that they are identified and joined in the cluster by gptid. That is a less than ideal solution :/ This is recovery by luck.
 