SOLVED purposefully degrade raidz2 to extract drives somewhat gracefully?

Dice
EDIT:
This turned into somewhat of a "documenting the journey" kind of post.

________

Hello,
(Version: TrueNAS-13.0-U4)

I'd like to rebuild my pool, with the goal of shrinking my two vdevs from 7 drives wide to 6, hopefully "somewhat gracefully".
Most of the data is backed up, but I still need to use one or two drives from the current pool during the "rebuild window".

I expected this to be quite simple and straightforward: first offline a drive, then wipe it in the GUI (crucially: a quick wipe), and once the drive became available in the list of drives for pool creation, it would accept a new pool. It gets halfway through the creation before throwing this error:

Error: ('one or more vdevs refer to the same device, or one of\nthe devices is part of an active md or lvm device',)
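
(Side note for anyone following along, hedged since I only pieced this together later: that error generally means ZFS believes one of the selected devices is still in use, either because it is still a member of an active pool (offlining does not remove it) or because it still carries old ZFS labels that a quick wipe did not touch. The state can be checked from the CLI roughly like this, using the pool and device names from my system:)

Code:
zpool status -v wd60efrx     # an offlined disk still shows up as a pool member
zdb -l /dev/da12p2           # prints any ZFS labels still present on the old data partition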


For the sake of documenting my ill-advised hackery:
Next I onlined the drives again (one from each vdev).
What happened then surprised me a little.

The GUI gladly accepted the freshly wiped drives, without any resilvering or scan. All boxes green?!
Since the drives were wiped, the gptid label was lost as expected, a sign that something did change:

Code:
        wd60efrx                                        ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/14ef1fa6-e0a4-11e5-b134-0cc47ab3208c  ONLINE       0     0     0
            gptid/15c495ba-e0a4-11e5-b134-0cc47ab3208c  ONLINE       0     0     0
            gptid/16990bee-e0a4-11e5-b134-0cc47ab3208c  ONLINE       0     0     0
            gptid/1769399b-e0a4-11e5-b134-0cc47ab3208c  ONLINE       0     0     0
            gptid/18479def-e0a4-11e5-b134-0cc47ab3208c  ONLINE       0     0     0
            gptid/1911207e-e0a4-11e5-b134-0cc47ab3208c  ONLINE       0     0     0
            da12p2                                      ONLINE       0     0     0
          raidz2-1                                      ONLINE       0     0     0
            gptid/7fe401b9-d3a0-11ec-9bc5-00259051e3f2  ONLINE       0     0     0
            da11p2                                      ONLINE       0     0     0
            gptid/81b84a85-d3a0-11ec-9bc5-00259051e3f2  ONLINE       0     0     0
            gptid/81c61691-d3a0-11ec-9bc5-00259051e3f2  ONLINE       0     0     0
            gptid/82601510-d3a0-11ec-9bc5-00259051e3f2  ONLINE       0     0     0
            gptid/a333f8c5-6a61-11ed-b21b-ac1f6bb3a54c  ONLINE       0     0     0
            gptid/8364ed1f-d3a0-11ec-9bc5-00259051e3f2  ONLINE       0     0     0


I proceeded to attempt to offline the drives again, in order to continue with further ZFS work.
At this point the drives would no longer accept being offlined; nothing happens in the GUI.
It was obvious the situation was unfolding in an unstable way.


I moved to the CLI, attempting to offline a drive:
zpool offline wd60efrx da12p2, which did not work either (no error, no change).
zpool list -v wd60efrx would still show the drive as ONLINE, despite it being the most recently wiped one.

This is when I started a scrub, to at least return to a better starting point while waiting for the next step and reaching out for advice with this post.
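
(For reference, the scrub can be started and watched from the CLI as well; a minimal sketch using the pool name above:)

Code:
zpool scrub wd60efrx      # start the scrub
zpool status wd60efrx     # shows scrub progress and estimated time remaining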

A sketchy path that comes to mind:
Maybe it would work if I offline and wipe the drives, then export the pool, create a "tanktemp" pool with the buffer-zone drives for the migration, and finally import the larger wd60efrx pool back? The part I don't like is that this seemingly carries significantly more risk, with the issue of importing a "confused" (or whatever else to call it after the experiences above) and simultaneously degraded pool.

Any advice going forward is appreciated.



Cheers,
 
Dice
Some further investigation and tinkering on what the GUI quick wipe caused:
Seemingly the second partition's gptid has been lost.
Here, da11 and da12 are the ones I've wiped; they no longer register a gptid for their second partition. I added da13 for reference:
Code:
glabel status | grep -iE 'da1[1,2,3]'
gptid/abf7ca1b-d77b-11ed-8aef-ac1f6bb3a54c     N/A  da11p1
gptid/a333f8c5-6a61-11ed-b21b-ac1f6bb3a54c     N/A  da13p2
gptid/a2c40615-6a61-11ed-b21b-ac1f6bb3a54c     N/A  da13p1
gptid/68a8a3a1-d77b-11ed-8aef-ac1f6bb3a54c     N/A  da12p1


However, gpart shows the second (data) partition is still there, so the "wipe" seems to be particularly weak.
Code:
gpart show da11
=>        40  7814037088  da11  GPT  (3.6T)
          40          88        - free -  (44K)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  7809842696     2  freebsd-zfs  (3.6T)


I'm debating whether a gpart destroy -F da11 would finally "release" my drive from the middleware.
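
(If I go that route, the sketch would be something like the following; destructive, and assuming da11 really is the sacrificial drive and has been released from the pool first:)

Code:
gpart destroy -F da11     # force-destroy the GPT even though it still holds partitions
gpart show da11           # verify: should no longer report a partition table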

A snippet from gpart list:
Code:
Geom name: da11
modified: false
state: OK
fwheads: 255
fwsectors: 63
last: 7814037127
first: 40
entries: 128
scheme: GPT
Providers:
1. Name: da11p1
   Mediasize: 2147483648 (2.0G)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r0w0e0
   efimedia: HD(1,GPT,abf7ca1b-d77b-11ed-8aef-ac1f6bb3a54c,0x80,0x400000)
   rawuuid: abf7ca1b-d77b-11ed-8aef-ac1f6bb3a54c
   rawtype: 516e7cb5-6ecf-11d6-8ff8-00022d09712b
   label: (null)
   length: 2147483648
   offset: 65536
   type: freebsd-swap
   index: 1
   end: 4194431
   start: 128
2. Name: da11p2
   Mediasize: 3998639460352 (3.6T)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r1w1e1
   efimedia: HD(2,GPT,ac0d9c97-d77b-11ed-8aef-ac1f6bb3a54c,0x400080,0x1d180be08)
   rawuuid: ac0d9c97-d77b-11ed-8aef-ac1f6bb3a54c
   rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b
   label: (null)
   length: 3998639460352
   offset: 2147549184
   type: freebsd-zfs
   index: 2
   end: 7814037127
   start: 4194432
Consumers:
1. Name: da11
   Mediasize: 4000787030016 (3.6T)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r1w1e2


Seemingly the rawuuid is still there, and at least broadly consistent with the rest.

At this point, I'm on two missions. One is tinkering and learning while recovering from the first errors, trying to restore my situation:
- relabel the second partition with a proper gptid that TrueNAS finds
(which has some ....similarities, ROFL, to my endeavor in 2019: https://www.truenas.com/community/threads/gptid-lost.77861/ )

The second mission is to return to the OP, i.e. how to <extract> a drive from my Z2 vdevs.

The question still remains:
- why is it no longer possible to offline these drives?
 
Dice
The question still remains:
- why is it no longer possible to offline these drives?
Here's a theory:
The GUI seems to still map drives by their rawuuid, even when ZFS no longer recognizes that particular ID.
Thus, even though pool health registers OK and the data is there, the drive can no longer be manipulated via the GUI as expected.
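
(One way to test this theory would be to ask the middleware directly what it has on record for the disk; a hedged sketch, since I haven't dug into the output format:)

Code:
midclt call disk.query | python3 -m json.tool | less    # then search for da12 / the old identifiers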

I could not offline the drive again from the GUI, nor from the CLI with the command above.
I resorted to <nukes> in the form of glabel...
The label was missing on da12p2 (and da11p2).

First I extracted the rawuuid for the second partition:
Code:
gpart list | grep -A 20 da12p2 | grep rawuuid
   rawuuid: 68cc9008-d77b-11ed-8aef-ac1f6bb3a54c


And recreated the glabel:
Code:
glabel create -v 68cc9008-d77b-11ed-8aef-ac1f6bb3a54c /dev/da12p2


Verified:
Code:
glabel status | grep da12
...which was partially successful. Rather than displaying a gptid (which happened to have the same value as my rawuuid), I now get label/68cc9008-d77b-11ed-8aef-ac1f6bb3a54c.

Somewhat happy, I could now offline the drive from the GUI and proceed to nuking it with fire.
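
(For later cleanup, and hedged since I haven't needed it yet: the manually created glabel can be removed again with glabel destroy, using the label name without the label/ prefix:)

Code:
glabel destroy -f 68cc9008-d77b-11ed-8aef-ac1f6bb3a54c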

gpart destroy -F /dev/da12 seemingly nuked the partition data ^^

....thus I returned to trying to create a new pool out of this drive. Still the same error, here in its entirety:

Code:
Error: concurrent.futures.process._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/concurrent/futures/process.py", line 246, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/worker.py", line 111, in main_worker
    res = MIDDLEWARE._run(*call_args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/worker.py", line 45, in _run
    return self._call(name, serviceobj, methodobj, args, job=job)
  File "/usr/local/lib/python3.9/site-packages/middlewared/worker.py", line 39, in _call
    return methodobj(*params)
  File "/usr/local/lib/python3.9/site-packages/middlewared/worker.py", line 39, in _call
    return methodobj(*params)
  File "/usr/local/lib/python3.9/site-packages/middlewared/schema.py", line 979, in nf
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zfs.py", line 117, in do_create
    zfs.create(data['name'], topology, data['options'], data['fsoptions'])
  File "libzfs.pyx", line 402, in libzfs.ZFS.__exit__
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zfs.py", line 117, in do_create
    zfs.create(data['name'], topology, data['options'], data['fsoptions'])
  File "libzfs.pyx", line 1376, in libzfs.ZFS.create
libzfs.ZFSException: one or more vdevs refer to the same device, or one of
the devices is part of an active md or lvm device
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/middlewared/job.py", line 355, in run
    await self.future
  File "/usr/local/lib/python3.9/site-packages/middlewared/job.py", line 391, in __run_body
    rv = await self.method(*([self] + args))
  File "/usr/local/lib/python3.9/site-packages/middlewared/schema.py", line 975, in nf
    return await f(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/pool.py", line 734, in do_create
    raise e
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/pool.py", line 687, in do_create
    z_pool = await self.middleware.call('zfs.pool.create', {
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1279, in call
    return await self._call(
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1236, in _call
    return await methodobj(*prepared_call.args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/service.py", line 496, in create
    rv = await self.middleware._call(
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1244, in _call
    return await self._call_worker(name, *prepared_call.args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1250, in _call_worker
    return await self.run_in_proc(main_worker, name, args, job)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1169, in run_in_proc
    return await self.run_in_executor(self.__procpool, method, *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1152, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
libzfs.ZFSException: ('one or more vdevs refer to the same device, or one of\nthe devices is part of an active md or lvm device',)



I do notice (albeit unwillingly) that I'm using a single NVMe drive split across two pools, which could be what that message is complaining about.
However, I did try to create another pool out of the ditched L2ARC SSD, and that proceeded without problem.
At this point, I still believe the drive is not <nuked enough> and needs further #attention...
 

Dice
At this point I've exhausted my patience and willingness to "resolve the situation beautifully", and reached a mental state better characterized by "the weekend is soon over, let's get to a decent state"...

First, get a full zero-fill wipe going.
Meanwhile, I'm trying to wipe only the last few megabytes of the drive to remove anything stored there.
I have a vague memory of ZFS storing a few megabytes of information at the end of each drive. This might be wrong.
Instead of zero-filling the entire drive, I'm looking for a shortcut.
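
(Worth noting, as a hedged aside: that memory is at least partly right. ZFS keeps four small labels per vdev, two near the start and two at the end of the data partition, and there is a dedicated command to clear them without zero-filling anything. Something like the following, pointed at the whole disk here since the partition table is already gone:)

Code:
zpool labelclear -f /dev/da12     # destructive: wipes ZFS labels so the device no longer looks like a pool member
                                  # use /dev/da12p2 instead if the data partition still exists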

First, finding out some information about the drive:
Code:
geom disk list da12
Geom name: da12
Providers:
1. Name: da12
   Mediasize: 8001563222016 (7.3T)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r0w1e0
   descr: ATA TOSHIBA HDWG180
   lunid: 5000039ab8cacd31
   ident: 2180A0BXFAUG
   rotationrate: 7200
   fwsectors: 63
   fwheads: 255

Here I take note of the media size. I <guess> this is the size in bytes. It corresponded well enough for my current level of patience and f*cks given.
The idea is to use dd to skip most of the drive and then, towards the end, start blasting zeros.
I did a rough estimate, cut some digits and ended up with something to try:
Code:
dd if=/dev/zero of=/dev/da12 bs=1M count=100 skip=8001563 status=progress
100+0 records in
100+0 records out
104857600 bytes transferred in 0.811647 secs (129191089 bytes/sec)


Writing this sparsely induces a chance of "missing" the spot. A safer, shotgun-esque approach would be to let dd write, say, count=40000 until the drive basically just reaches its end.
I came to think about this while rebooting (to cancel the GUI-induced zero fill, since I could not figure out which process was responsible for it?!).
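
(A hedged observation on the dd run above: skip= skips blocks of the input, while seek= skips blocks of the output, so that command most likely wrote its 100 MB of zeros at the start of da12 rather than at the end. That would also explain why it helped, since the GPT and the front pair of ZFS labels live there. To genuinely target the tail of the disk, something along these lines, with the offset computed in MiB because bs=1M means 1048576 bytes:)

Code:
# Mediasize 8001563222016 bytes / 1048576 ≈ 7630885 MiB total
# start ~100 MiB before the end and let dd run until the device ends
dd if=/dev/zero of=/dev/da12 bs=1M seek=7630785 status=progress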

Testing pool creation on da12 FINALLY worked.

I've now created a new pool out of the rather molested da12!
Beat, but a happy camper!
 
Joined Jun 15, 2022
I'm not really sure what you're trying to do, as your description (to me, anyway) appears incomplete. (I saw the 7-to-6-disk part, but it's a bit foggy as to how much data you have, what you're trying to do with 2 drives, what RAID-Zx level is active, etc...)

It also seems your process is similarly incomplete, which is causing you problems; but maybe that too is me not understanding, as I've not tried to do what you're doing.

As far as your drive containing data goes, maybe this would help:

In my estimation, I *think* what might need to be done for your "quite simple and straightforward" process is to re-create the Z-pool with 6 drives and restore from backup.

But what it sounds like you're trying to do is remove 2 drives from a 7-disk RAID-Z2 pool and use them in a 6-disk pool, then copy the data from Pool1 to Pool2, then remove Pool2 (without unconfiguring the drives). That's kind of an endeavor fraught with peril, but so is Alaskan king crab fishing, yet people do it.
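
(If it helps, the copy step in either variant is typically just ZFS replication; a rough sketch with placeholder pool and snapshot names, assuming a recursive snapshot and that the target pool has room for whatever is sent:)

Code:
zfs snapshot -r wd60efrx@migrate1
zfs send -R wd60efrx@migrate1 | zfs receive -F tanktemp/backup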
 

Dice
(I saw the 7 to 6 disk part, but it's a bit foggy as to how much data you have and what you're trying to do with 2 drives, what RAID-Zx level is active, etc...)
Sorry about the fog.

I have a pool consisting of 2 RAIDZ2 vdevs, each 7 drives wide.
The goal is a pool of 1 RAIDZ2 vdev, 6 drives wide.
I've mostly been able to back up the files elsewhere, but not entirely; I'd need one more drive's worth of "migration buffer storage space".
This buffer storage I wanted to grab from the redundancy in the Z2s, hence trying to "remove the drive from the system's awareness" in order to create a temporary single-drive pool to hold some of the data.

Hope this clears some of the confusion :)
 
Joined Jun 15, 2022
Yup, that clears up a lot of questions. How big are your data drives? And how much total data do you have?

(Hedge was unable to tell from your System Specs, which are only now showing up, perhaps the fault of:
  1. his poor Internet connection
  2. poor vision
  3. hangover
  4. all of the above
Please circle the correct answer with a #2 pencil.)
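
(For what it's worth, both answers are quick to pull from the CLI; a minimal sketch:)

Code:
zpool list                                  # pool sizes and allocation
zfs list -o name,used,avail -r wd60efrx     # per-dataset usage on the big pool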
 