Need help fixing pool - it doesn't make sense!

katit

Contributor
Joined
Jun 16, 2015
Messages
162
Hello, I am trying to fix this pool without losing data. Currently the data is accessible, but I can't figure out how to get rid of one disk!
[Screenshot: pool layout from the TrueNAS UI]


Here is the chronology of events:
1. I had a happy pool - a mirror of 2x4TB devices that's been working for years.
2. After upgrading to TrueNAS SCALE (FreeNAS => TrueNAS CORE => SCALE) I wanted to expand this pool.
3. I added 2x1TB drives as another vdev (mirror) to this pool, but it didn't go well: one disk had issues (sorry, I didn't save the exact errors) and it didn't show up, although the vdev did look like a mirror. I think I pressed the "Remove"(?) button to get rid of the bad disk.
4. Now I am stuck with the pool shown in the picture. It doesn't make sense, because it can't be mirror + single.
5. I tried to "Extend" this sda drive with another new HDD (20TB) and got the error: [EZFS_BADTARGET] cannot attach /dev/disk/by-partuuid/2c9641b2-5a69-4b5e-8f31-c385f9edfda9 to /dev/disk/by-partuuid/c0803f47-f5b1-46c0-b384-21047b653990: can only attach to mirrors and top-level disks
What does it mean??
6. I tried to "Replace" with this 20TB drive (not sure what the logic is here): Error: [EZFS_BADTARGET] cannot replace 11950173496626399005 with /dev/disk/by-partuuid/d5a545d8-aeca-4a2e-911a-bd52d35960e2: already in replacing/spare config; wait for completion or use 'zpool detach'
7. I tried to simply "Remove" sda:
Error: [EZFS_INVALCONFIG] cannot remove /dev/disk/by-partuuid/c0803f47-f5b1-46c0-b384-21047b653990: invalid config; all top-level vdevs must have the same sector size and not be raidz.

So, I am not sure what needs to be done.
I guess the main questions are:
1. WHAT is this config I have? Can you even create a config like this if you wanted to? Or is it a glitch?
2. HOW do I fix it, either by replacing it with another mirror or by adding a disk to make it a mirror?

Also, when I look at this - how can I tell what exactly those 4 warnings are? They're not clickable, and there's no popup.
[Screenshot: 4 warnings shown in the TrueNAS UI]
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
4. Now I am stuck with the pool shown in the picture. It doesn't make sense, because it can't be mirror + single.
It absolutely can. One can stripe any kind of vdevs in a pool. Mirror, single (= 1-way mirror), raidz#… everything goes.
Of course, striping heterogeneous vdevs is advised against, and single-drive vdevs are strongly advised against (lose the single drive = lose the whole pool), but this is possible, and this is where you landed.
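For what it's worth, such a layout can even be built on purpose from the command line. A minimal sketch, with made-up pool and device names (ZFS only asks for -f to confirm the mismatched replication level):
Code:
# Create a 2-way mirror pool (pool and device names are hypothetical).
zpool create tank mirror /dev/sdb /dev/sdc

# Add a single disk as another top-level vdev; ZFS warns about the
# mismatched replication level and requires -f to proceed.
zpool add -f tank /dev/sdd

# The result is mirror-0 striped with a lone disk, the same shape as your pool.
zpool status tank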
zpool status -v in an SSH session should confirm the layout and condition of the pool.
Had the disks been properly burnt-in before attempting to add them to the pool?

2. HOW do I fix it, either by replacing it with another mirror or by adding a disk to make it a mirror?
Exactly how you've outlined it: Either add another drive to make it a safer 2-way mirror, or remove the single drive.
Removing should be allowed because the pool is all mirrors, but the middleware appears to have a glitch when it comes to single drives.
And maybe it has a glitch when it comes to extending single drives as well.

Try zpool remove -n main-4TB-mirror <ID_OF_SINGLE_DRIVE>
And if it comes out fine, run the command without -n to do it for real and go back to just the 2*4 TB mirror.
But how are you going to evolve from that? One 1 TB drive and one 20 TB is very lopsided.
 

katit

Contributor
Joined
Jun 16, 2015
Messages
162
Config:
Code:
  pool: main-4TB-mirror
 state: ONLINE
  scan: scrub repaired 0B in 04:56:42 with 0 errors on Sun Oct 23 04:56:43 2022
config:

        NAME                                      STATE     READ WRITE CKSUM
        main-4TB-mirror                           ONLINE       0     0     0
          mirror-0                                ONLINE       0     0     0
            4c74ca49-3e52-11e5-8f43-0cc47a696c86  ONLINE       0     0     0
            4cdad3ef-3e52-11e5-8f43-0cc47a696c86  ONLINE       0     0     0
          c0803f47-f5b1-46c0-b384-21047b653990    ONLINE       0     0     0

errors: No known data errors


No luck removing it though.. Any other way?
root@HOME-NAS[~]# zpool remove -n main-4TB-mirror c0803f47-f5b1-46c0-b384-21047b653990
Memory that will be used after removing c0803f47-f5b1-46c0-b384-21047b653990: 4.01K
root@HOME-NAS[~]# zpool remove main-4TB-mirror c0803f47-f5b1-46c0-b384-21047b653990
cannot remove c0803f47-f5b1-46c0-b384-21047b653990: invalid config; all top-level vdevs must have the same sector size and not be raidz.
root@HOME-NAS[~]#

PS:
I didn't do any "burn in".
It's not a problem with the 20TB; I have a second drive coming today - they just shipped them separately. They will be in a mirror.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
Then it must be the "same sector" part. Though I'm unsure how the "new" 1 TB drive ended up with a different ashift than the old 4 TB drives.
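If you want to check what the drives themselves report, a quick sketch (device names will differ on your system):
Code:
# LOG-SEC / PHY-SEC are the logical and physical sector sizes each disk reports;
# a mix of 512-byte and 4K values here would line up with the differing ashift.
lsblk -o NAME,SIZE,LOG-SEC,PHY-SEC,MODEL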

With 20 TB drives incoming, you may just build another pool (mirror), replicate from the old pool to the new mirror and destroy the old pool to solve the geometry issue.
 

katit

Contributor
Joined
Jun 16, 2015
Messages
162
Yep. Looks like I have some time now :) I started burn-in on the 22TB drive; it will be a while.
How do I replicate from the old pool? That's my biggest problem - I don't want to reconfigure all the shares, etc.

Worst case, I will copy the data to the new 22TB mirror and then reconfigure everything. But this is not ideal.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
Replication is done through a one-off task from the GUI, or zfs send | zfs receive from the CLI if you prefer it this way.
To avoid reconfiguring your shares, you may rename the new pool. But the change is an opportunity to have a simple and "neutral" pool name, which does not needlessly include technical details such as space and geometry. :wink:
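If you go the CLI route, a minimal sketch (the snapshot name and the new pool name "tank" are placeholders):
Code:
# Take a recursive snapshot of the old pool...
zfs snapshot -r main-4TB-mirror@migrate

# ...and send the whole dataset tree, with properties, into the new pool.
zfs send -R main-4TB-mirror@migrate | zfs receive -F tank

# Once everything is verified and the old pool is destroyed, the new pool
# could even take over the old name so share paths stay the same:
# zpool export tank && zpool import tank main-4TB-mirror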
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
1. I had a happy pool - a mirror of 2x4TB devices that's been working for years.
This is probably what's got you tripped up; if this pool is sufficiently well-aged (like a few of mine) then the sector size value won't have been forced to a minimum of 4K, and is probably the native 512b of the 2x4T drives. The new 2x1T drives were set to 4K, and breaking the mirror worked, but then the single disk can't be removed because of the mismatched sector size.

Code:
root@freenas:~ # zdb -C -U /data/zfs/zpool.cache | grep 'name\|ashift'
    name: 'ThisOldPool'
    hostname: ''
            ashift: 9
            ashift: 12


You can re-run that zdb command on your own system to see what the results are but I expect it to look like the above.

5. I tried to "Extend" this sda drive with another new HDD (20TB) and got the error: [EZFS_BADTARGET] cannot attach /dev/disk/by-partuuid/2c9641b2-5a69-4b5e-8f31-c385f9edfda9 to /dev/disk/by-partuuid/c0803f47-f5b1-46c0-b384-21047b653990: can only attach to mirrors and top-level disks
What does it mean??
6. I tried to "Replace" with this 20TB drive (not sure what the logic is here): Error: [EZFS_BADTARGET] cannot replace 11950173496626399005 with /dev/disk/by-partuuid/d5a545d8-aeca-4a2e-911a-bd52d35960e2: already in replacing/spare config; wait for completion or use 'zpool detach'

This part puzzles me though, as the first error in #5 is a red herring (it is a top-level disk, so why isn't it letting you expand?) and there's no spare, so the error in #6 doesn't make sense to me either. Can you collect a debug file through the SCALE UI at System Settings -> Advanced -> Save Debug (top right) and submit this as a bug using the "Report a Bug" link at the top of the forums?
 

katit

Contributor
Joined
Jun 16, 2015
Messages
162
This is probably what's got you tripped up; if this pool is sufficiently well-aged (like a few of mine) then the sector size value won't have been forced to a minimum of 4K, and is probably the native 512b of the 2x4T drives. The new 2x1T drives were set to 4K, and breaking the mirror worked, but then the single disk can't be removed because of the mismatched sector size.

Code:
root@freenas:~ # zdb -C -U /data/zfs/zpool.cache | grep 'name\|ashift'
    name: 'ThisOldPool'
    hostname: ''
            ashift: 9
            ashift: 12


You can re-run that zdb command on your own system to see what the results are but I expect it to look like the above.



This part puzzles me though, as the first error in #5 is a red herring (it is a top-level disk, so why isn't it letting you expand?) and there's no spare, so the error in #6 doesn't make sense to me either. Can you collect a debug file through the SCALE UI at System Settings -> Advanced -> Save Debug (top right) and submit this as a bug using the "Report a Bug" link at the top of the forums?

Yes, when running the command I get this:
Code:
name: 'main-4TB-mirror'
    hostname: 'HOME-NAS'
            ashift: 12
            ashift: 9


Which bug do you want me to submit? With #5 or #6? Because #6 seems to be a pointless operation - why would I replace one with another if it won't let me do #5 later?

Will do it in a day or so; currently SMART tests are running on those drives (burn-in).
 

katit

Contributor
Joined
Jun 16, 2015
Messages
162
Replication is done through a one-off task from the GUI, or zfs send | zfs receive from the CLI if you prefer it this way.
To avoid reconfiguring your shares, you may rename the new pool. But the change is an opportunity to have a simple and "neutral" pool name, which does not needlessly include technical details such as space and geometry. :wink:

SAMBA is my worst fear. In 2015 I had a heck of a time getting it all configured. I guess I will just do it again with the new UI; maybe it's going to be easier.

What would be a "simple and neutral" pool name?
one
pool-one
pool-1
main
? :)
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Yes, when running the command I get this:
Code:
name: 'main-4TB-mirror'
    hostname: 'HOME-NAS'
            ashift: 12
            ashift: 9


Which bug do you want me to submit? With #5 or #6? Because #6 seems to be a pointless operation - why would I replace one with another if it won't let me do #5 later?

Will do it in a day or so; currently SMART tests are running on those drives (burn-in).
Submit with the combined error message and detail from both #5 and #6 please, under a heading of "Unexpected errors attempting to extend single device vdev" - if we need to split the ticket inside Jira I'll take care of it.

For the debug, as long as all the disks in question are in the system then go ahead and generate a debug file now - it doesn't interfere with normal operation, and ideally we want to capture any logs that might rotate over time. There's an option after submission to attach the debug as a private file so that only iX staff can see it as well.

Thanks, and hope that we can get the storage situation sorted swiftly.
 