21.08-beta-1 install, cannot make a pool from HDDs, works fine on SSDs

mervincm

Contributor
Joined
Mar 21, 2014
Messages
157
I am quite new to TrueNAS, moving over from Synology-based systems. I installed the original TrueNAS SCALE beta, but then got busy and didn't get a chance to pick it up again until today. I updated to the current beta via the GUI without incident. I created a mirror pool from a pair of Optane disks, and another from a pair of SATA SSDs; both completed without incident. But when I try to make a RAID-Z2 or RAID-Z pool from all six HDDs, it errors out with
('one or more vdevs refer to the same device, or one of\nthe devices is part of an active md or lvm device',)

Searching around, I found references suggesting this can be solved by using the quick-wipe option on each individual HDD, but that hasn't helped me here.


Code:
Error: concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3.9/concurrent/futures/process.py", line 243, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/worker.py", line 97, in main_worker
    res = MIDDLEWARE._run(*call_args)
  File "/usr/lib/python3/dist-packages/middlewared/worker.py", line 45, in _run
    return self._call(name, serviceobj, methodobj, args, job=job)
  File "/usr/lib/python3/dist-packages/middlewared/worker.py", line 39, in _call
    return methodobj(*params)
  File "/usr/lib/python3/dist-packages/middlewared/worker.py", line 39, in _call
    return methodobj(*params)
  File "/usr/lib/python3/dist-packages/middlewared/schema.py", line 1098, in nf
    res = f(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/schema.py", line 1186, in nf
    return func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/plugins/zfs.py", line 112, in do_create
    zfs.create(data['name'], topology, data['options'], data['fsoptions'])
  File "libzfs.pyx", line 404, in libzfs.ZFS.__exit__
  File "/usr/lib/python3/dist-packages/middlewared/plugins/zfs.py", line 112, in do_create
    zfs.create(data['name'], topology, data['options'], data['fsoptions'])
  File "libzfs.pyx", line 1304, in libzfs.ZFS.create
libzfs.ZFSException: one or more vdevs refer to the same device, or one of
the devices is part of an active md or lvm device
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/middlewared/job.py", line 382, in run
    await self.future
  File "/usr/lib/python3/dist-packages/middlewared/job.py", line 418, in __run_body
    rv = await self.method(*([self] + args))
  File "/usr/lib/python3/dist-packages/middlewared/schema.py", line 1092, in nf
    res = await f(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/schema.py", line 1182, in nf
    return await func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/plugins/pool.py", line 827, in do_create
    raise e
  File "/usr/lib/python3/dist-packages/middlewared/plugins/pool.py", line 764, in do_create
    z_pool = await self.middleware.call('zfs.pool.create', {
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1305, in call
    return await self._call(
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1262, in _call
    return await methodobj(*prepared_call.args)
  File "/usr/lib/python3/dist-packages/middlewared/service.py", line 827, in create
    rv = await self.middleware._call(
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1270, in _call
    return await self._call_worker(name, *prepared_call.args)
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1276, in _call_worker
    return await self.run_in_proc(main_worker, name, args, job)
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1203, in run_in_proc
    return await self.run_in_executor(self.__procpool, method, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1177, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
libzfs.ZFSException: ('one or more vdevs refer to the same device, or one of\nthe devices is part of an active md or lvm device',)
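
In case it helps anyone, this is roughly how I've been checking whether anything is still lingering on the drives before retrying. It's just a sketch: it assumes my six HDDs show up as /dev/sda through /dev/sdf (adjust to your layout), and it only lists leftover signatures, it doesn't change anything.

Code:
#!/usr/bin/env python3
# Read-only check: list any filesystem/raid signatures still present on the HDDs.
# Assumes the six HDDs are sda..sdf -- adjust the device names to your system.
import subprocess

DISKS = [f"/dev/sd{c}" for c in "abcdef"]  # assumption: adjust as needed

for disk in DISKS:
    # lsblk shows whether the kernel still sees partitions or md/LVM members on the disk
    subprocess.run(["lsblk", "-o", "NAME,FSTYPE,TYPE,SIZE", disk], check=False)
    # wipefs --no-act prints any remaining signatures (gpt, linux_raid_member,
    # LVM2_member, zfs_member) without erasing anything
    subprocess.run(["wipefs", "--no-act", disk], check=False)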
 

mervincm

Contributor
Joined
Mar 21, 2014
Messages
157
Seven years ago, I had a similar(ish) issue with FreeNAS, and the solution was to use a Linux machine to wipe the disks. Since SCALE is Linux, I didn't think that would make sense here. Perhaps the wipe in the SCALE GUI is not yet functional? Wiping six 14 TB drives is a pretty massive investment of time, so if I am missing anything obvious I would appreciate a point in the right direction.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
libzfs.ZFSException: one or more vdevs refer to the same device, or one of the devices is part of an active md or lvm device
This needs better handling, because a traceback is never user-friendly, but the root cause of your problem seems likely to be old cruft on your disks, as you have determined.
I'm not familiar with the specifics of lvm or md, but whichever one you were using, read up on where on disk they store metadata and overwrite those specific areas, with nice wide buffers.
Also, wiping n drives should take about as long as a single one, as long as you can connect them all to the server.
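From a quick look at the mdadm and LVM docs (please verify before running anything destructive): LVM keeps its label in the first few sectors, md superblocks sit either near the start (metadata 1.1/1.2) or near the end (0.90/1.0) of the member device, and GPT keeps a backup header at the very end of the disk. So zeroing the first and last several MiB of each drive usually covers all of them. A rough, destructive sketch along those lines, assuming the drive really is /dev/sdX and holds nothing you care about:

Code:
#!/usr/bin/env python3
# DESTRUCTIVE sketch: zero the first and last 16 MiB of a disk to clear
# GPT headers, the LVM label and md superblocks. Device name is a placeholder.
import os

DEV = "/dev/sdX"          # <-- replace with the disk to clear
SPAN = 16 * 1024 * 1024   # 16 MiB at each end
CHUNK = b"\0" * (1024 * 1024)

fd = os.open(DEV, os.O_WRONLY)
try:
    size = os.lseek(fd, 0, os.SEEK_END)   # total disk size in bytes
    # zero the start of the disk (primary GPT, LVM label, md 1.1/1.2 superblocks)
    os.lseek(fd, 0, os.SEEK_SET)
    for _ in range(SPAN // len(CHUNK)):
        os.write(fd, CHUNK)
    # zero the end of the disk (backup GPT, md 0.90/1.0 superblocks)
    os.lseek(fd, size - SPAN, os.SEEK_SET)
    for _ in range(SPAN // len(CHUNK)):
        os.write(fd, CHUNK)
    os.fsync(fd)
finally:
    os.close(fd)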
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
A quick wipe to clear out the partition tables of the old drives should be sufficient.
 

mervincm

Contributor
Joined
Mar 21, 2014
Messages
157
A quick wipe to clear out the partition tables of the old drives should be sufficient.
And yet it didn't. I tried it at least twice, but perhaps I fat-fingered it and missed a disk. I will pick two disks, quick-wipe them once more, and then try to create a striped pool. I will report back! Is there a way to see whether any partition data remains, so I can proactively check after the wipe?
Thank you for confirming that it is supposed to work the way I expected it to. I suppose the next step would be a bug report (I can search for how to file one here).
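In the meantime, this is the sort of check I had in mind, if it's sound: just reading the first couple of sectors of each drive and looking for the MBR/GPT signatures. The device names are placeholders for my six HDDs, and it assumes 512-byte logical sectors (4Kn drives put the GPT header at offset 4096 instead).

Code:
#!/usr/bin/env python3
# Read-only check: look for an MBR boot signature (0x55AA at bytes 510-511)
# and a GPT header ("EFI PART" at the start of LBA 1) on each drive.
import os

DISKS = [f"/dev/sd{c}" for c in "abcdef"]  # assumption: adjust to your drives

for dev in DISKS:
    with open(dev, "rb") as f:
        lba0 = f.read(512)
        lba1 = f.read(512)
    has_mbr = lba0[510:512] == b"\x55\xaa"
    has_gpt = lba1[0:8] == b"EFI PART"
    print(f"{dev}: MBR signature={'yes' if has_mbr else 'no'}, "
          f"GPT header={'yes' if has_gpt else 'no'}")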
 

mervincm

Contributor
Joined
Mar 21, 2014
Messages
157
This needs better handling, because a traceback is never user-friendly, but the root cause of your problem seems likely to be old cruft on your disks, as you have determined.
I'm not familiar with the specifics of lvm or md, but whichever one you were using, read up on where on disk they store metadata and overwrite those specific areas, with nice wide buffers.
Also, wiping n drives should take about as long as a single one, as long as you can connect them all to the server.
Thank you for confirming that this is a simultaneous activity, but I am not sure I would use TrueNAS to wipe them. If it is too beta to trust a quick wipe, is it too beta to trust with a couple of days of full wipes? I am not sure, to be honest. I might try a full wipe on all disks and then cancel it five minutes later. Maybe I'll get lucky and the problematic crud is at the front of the disk? We will see!
 

mervincm

Contributor
Joined
Mar 21, 2014
Messages
157
This is a frustrating post to have to make, but ... I can't explain what I see. I turned the box off for a few days, picked it up again today, quick-wiped two HDDs, created a two-disk stripe pool, and it worked immediately. I thought that was strange, but OK, let's try another pair without another round of quick wipe, and sure enough, it created a two-disk stripe pool without issue. I tried the last two disks and, again, had no issue making another two-disk stripe pool. I am going to break up all these stripe pools and try once more for a RAID-Z pool!
 

mervincm

Contributor
Joined
Mar 21, 2014
Messages
157
Today I tried RAID-Z on the four 1 TB SSDs and on the six 14 TB HDDs, and I had no issues at all. I have no explanation other than PEBKAC.
 

sazrocks

Cadet
Joined
Oct 12, 2021
Messages
5
This is a frustrating post to have to make, but ... I can't explain what I see. I turned the box off for a few days, picked it up again today, quick-wiped two HDDs, created a two-disk stripe pool, and it worked immediately. I thought that was strange, but OK, let's try another pair without another round of quick wipe, and sure enough, it created a two-disk stripe pool without issue. I tried the last two disks and, again, had no issue making another two-disk stripe pool. I am going to break up all these stripe pools and try once more for a RAID-Z pool!
I want to reply and mention that I had the exact same issue on SCALE-22.02-RC.1-2. No matter how many times I ran the quick wipe, pool creation would fail. What solved it was wiping the disks, rebooting the TrueNAS box once, and then attempting to create the pool, whereupon it succeeded on the first try. This seems consistent with what you experienced. I'm not sure whether this is intended behaviour, but if it isn't, it might be worth looking into.
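My guess (unconfirmed) is that the kernel or udev was still holding stale partition/md state after the wipe, and the reboot is what cleared it. If anyone wants to test a reboot-free route, something like the following might refresh that state; these are standard Linux tools rather than anything SCALE-specific, the device list is an assumption, and I can't promise the middleware picks the change up.

Code:
#!/usr/bin/env python3
# Sketch of a reboot-free refresh after wiping: ask the kernel to re-read
# partition tables and check whether any md arrays are still assembled.
# Device list is an assumption -- adjust to the wiped drives.
import subprocess

DISKS = [f"/dev/sd{c}" for c in "abcdef"]

for dev in DISKS:
    # blockdev --rereadpt tells the kernel to drop and reload the partition table
    subprocess.run(["blockdev", "--rereadpt", dev], check=False)

# /proc/mdstat lists any md arrays the kernel still considers active
with open("/proc/mdstat") as f:
    print(f.read())

# settle outstanding udev events so the current state is visible
subprocess.run(["udevadm", "settle"], check=False)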
 

mervincm

Contributor
Joined
Mar 21, 2014
Messages
157
I appreciate the confirmation of behaviour.
 