Strange error when trying to make a pool (KeyError: '{serial_lunid}2PH9NRKT')

paulinventome

Explorer
Joined
May 18, 2015
Messages
62
Hi All,

This is an odd one. I have 8x16GB Red Pro drives. 4 are connected to the motherboard and 4 are connected to a PCI LSI SATA card.

I created a 4-disc pool on the motherboard ports and that worked.
Now I try an 8-disc pool and it errors with KeyError: '{serial_lunid}13400950DDC4_500a07510950ddc4'

Okay, so I worked my way through the pool, creating different combinations to try to track down which disc it is (it has to be said that the serial number in the error isn't one of the discs in the pool). I eventually found that one of the ports, da0, seems to be the problem. So I tried to work out physically which one is causing the problem: I pulled each connector out, rebooted and tried again. But it failed for all connectors, and the message started referring to a serial number that was always the one NOT connected.

It has to be said that when I first tried the 8-disc pool, it did create the pool. I have run a SMART test on each of the drives, and those work. They are new drives.

So what does that error mean? The full stack trace is here
Error: Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/middlewared/job.py", line 355, in run
    await self.future
  File "/usr/local/lib/python3.9/site-packages/middlewared/job.py", line 391, in __run_body
    rv = await self.method(*([self] + args))
  File "/usr/local/lib/python3.9/site-packages/middlewared/schema.py", line 975, in nf
    return await f(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/pool.py", line 764, in do_create
    await self.finalize_zpool_create_or_import(job, pool)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/pool.py", line 802, in finalize_zpool_create_or_import
    await self.middleware.call('disk.sync_zfs_guid', pool)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1305, in call
    return await self._call(
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1262, in _call
    return await methodobj(*prepared_call.args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/disk_/zfs_guid.py", line 59, in sync_zfs_guid
    self.middleware.send_event("disk.query", "CHANGED", id=event, fields=disks[event])
KeyError: '{serial_lunid}13400950DDC4_500a07510950ddc4'

Is this a real error or just some software issue? I'd read on reddit that people are having issues with certain versions of TrueNAS. I am on the latest, TrueNAS-13.0-RELEASE.

Any ideas or clues on what I should be looking at?

Kindest
Paul
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703

paulinventome

sretalla said:
Maybe looking into the LSI SATA card you're talking about might be the first thing... is it using port multiplication?

https://www.truenas.com/community/r...t-multipliers-and-cheap-sata-controllers.177/

Wasn't that.

A system reset sorted it all out. From the error it looks like a bug: it's referencing a drive that isn't there any more, and in fact the first time I saw it I had removed an SSD I was using to check the card was working. So TrueNAS noticed that drive had gone, and it's that serial number it is referencing.
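
For anyone wondering how a missing drive turns into a KeyError: the last frame of the traceback does a plain dictionary lookup, disks[event], and Python raises KeyError when the key is not in the dictionary. The sketch below is only an illustration of that mechanism and of a defensive alternative. It is not the middlewared source; sync_events, the disks dictionary and the "AAA_1" identifier are made-up names for the example.

```python
# Hypothetical illustration only; this is NOT the TrueNAS middlewared code.

def sync_events(disks, changed_ids):
    """Split identifiers into (sent, skipped).

    An identifier is skipped when it is not present in `disks`,
    instead of raising KeyError as a plain disks[ident] lookup would.
    """
    sent, skipped = [], []
    for ident in changed_ids:
        fields = disks.get(ident)  # returns None for an unknown key
        if fields is None:
            skipped.append(ident)
            continue
        sent.append((ident, fields))
    return sent, skipped


# Made-up table of currently known disks.
disks = {"{serial_lunid}AAA_1": {"name": "da1"}}
# Identifier taken from the error message in this thread.
stale = "{serial_lunid}13400950DDC4_500a07510950ddc4"

try:
    disks[stale]  # plain lookup, same shape as disks[event] in the traceback
except KeyError as exc:
    print("KeyError:", exc)

print(sync_events(disks, ["{serial_lunid}AAA_1", stale]))
```

Whether skipping the stale entry is the right fix inside TrueNAS is a question for the developers; the point here is only that the error is a lookup for an identifier that is no longer in the table.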

So I think all is good now.

As I said, maybe a bug - not sure. I'd seen it mentioned a few times on reddit as well. I will report it anyway, just in case it is one

thanks
Paul
 

runevn

Explorer
Joined
Apr 4, 2019
Messages
63
I'm experiencing the same thing with TrueNAS 13.0. Did you restore the configuration from a backup, or did you have to manually set everything up again?
 

paulinventome

Sorry for late response.

I was really still setting up for the first time so I reset the system in the web ui and started set up again.

The key for me was that the drive whose serial it was complaining about was not connected. So I'd guessed it was a stale config issue.
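
If it helps anyone doing the same check, here is a rough sketch of pulling the serial out of the identifier in the error and comparing it with the serials of the drives that are actually attached. It assumes the identifier has the form {serial_lunid}SERIAL_LUNID, which is how the one in this thread looks. parse_identifier, is_attached and the attached serials in the usage below are hypothetical names and values for the example, not TrueNAS functions or real drives.

```python
import re

# Assumed shape: "{kind}value", e.g. "{serial_lunid}13400950DDC4_500a07510950ddc4".
IDENT_RE = re.compile(r"^\{(?P<kind>[a-z_]+)\}(?P<value>.+)$")


def parse_identifier(ident):
    """Return (kind, serial, lunid); lunid is None when there is none."""
    match = IDENT_RE.match(ident)
    if match is None:
        raise ValueError(f"unrecognised identifier: {ident!r}")
    kind, value = match.group("kind"), match.group("value")
    serial, lunid = value, None
    if kind == "serial_lunid" and "_" in value:
        # Assumption: the part after the last underscore is the LUN ID.
        serial, lunid = value.rsplit("_", 1)
    return kind, serial, lunid


def is_attached(ident, attached_serials):
    """True when the identifier's serial is among the attached drives."""
    _, serial, _ = parse_identifier(ident)
    return serial in attached_serials


ident = "{serial_lunid}13400950DDC4_500a07510950ddc4"
print(parse_identifier(ident))
# Placeholder serials; substitute the ones your drives report.
print(is_attached(ident, {"SERIAL0001", "SERIAL0002"}))
```

If the serial from the error is not among the attached drives, that points at a leftover entry rather than a fault in one of the pool's discs, which matches what I saw.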

Cheers
Paul
 