Pool/Host Identification changes after reboot

LandMineHare

Cadet
Joined
Oct 4, 2015
Messages
6
So a week or so ago my area had a rigorous windstorm that flicked the power for our municipality on and off for a bit. I was at work at the time and my partner was asleep, so I wasn't able to shut down the server, and apparently my UPS battery had died at some point over the last 3 years (new one is on the way, huzzah). But now whenever I reboot the server, neither of my pools is online. After much googling I learned how to reimport them through ssh, which was also a pain because Terminal throws up an error about
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
so after each reboot I have to clear the entry out of the known_hosts file, then export/disconnect the pools through the GUI and reimport them. Nothing in the pools seems to be off, and no data appears corrupted.
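
For the known_hosts cleanup mentioned here, OpenSSH can remove the stale entry itself, which avoids hand-editing the file. A minimal sketch, assuming an OpenSSH client and a POSIX shell; the helper name `forget_host_key` and the host name `truenas.local` are placeholders for illustration, not names from this setup. Note that this only silences the warning and says nothing about why the key changed.

```shell
# Remove every recorded host key for one server from a known_hosts file.
# Usage: forget_host_key HOST [KNOWN_HOSTS_FILE]
forget_host_key() {
    host="$1"
    file="${2:-$HOME/.ssh/known_hosts}"
    # -R deletes all keys stored for HOST; -f selects the file to edit.
    # ssh-keygen keeps the previous contents next to it with a .old suffix.
    ssh-keygen -R "$host" -f "$file"
}

# Example call (placeholder host name):
# forget_host_key truenas.local
```

The next connection then prompts to accept the server's current key, so it is worth comparing the fingerprint shown with the one printed on the server console before answering yes.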

The disks all show up fine on reboot. TrueNAS lists them all properly, and none of them have any SMART errors that I can see, but the issue is annoying nonetheless. I can't seem to find anything specific in the console or logs that says why it can't import or find the pools on launch. Any thoughts on how I can fix it?
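
One way to see what ZFS itself reports after a reboot is to compare the pools that are currently imported with the ones it can find on disk but has not imported. A rough sketch using read-only commands only; `pool_overview` is a made-up helper name, and on TrueNAS the import itself is probably still best done through the GUI so the middleware knows about the pool.

```shell
# Show imported pools and pools that ZFS can see but has not imported.
# Both commands are read-only; nothing is imported or changed here.
pool_overview() {
    echo "== imported pools =="
    # -H drops the header line, -o picks the columns to print.
    zpool list -H -o name,health
    echo "== pools available for import =="
    # With no pool name, 'zpool import' only lists candidates.
    zpool import
}

# Example call, run as root on the server:
# pool_overview
```

If the data pools appear only in the second list right after boot, the disks and labels are visible and the problem is more likely in whatever is supposed to import them at startup.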
 

Heracles

Wizard
Joined
Feb 2, 2018
Messages
1,401
Hi,

This looks like a problem with the boot pool and not the data pool.

For the SSH service to create a new host key at every boot, it must be unable to find the one it created on the previous boot.

Can you post the output of zpool status?
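
To confirm that the host key really is regenerated at each boot, the fingerprints can be recorded on the server console before and after a reboot and compared. A small sketch, assuming the host keys live in /usr/local/etc/ssh (my assumption for FreeNAS / TrueNAS CORE; stock FreeBSD uses /etc/ssh); `host_key_fingerprints` is a hypothetical helper name.

```shell
# Print the fingerprint of every SSH host public key in a directory.
# Usage: host_key_fingerprints [DIR]
host_key_fingerprints() {
    # Assumed default location; pass another directory if yours differs.
    dir="${1:-/usr/local/etc/ssh}"
    for pub in "$dir"/ssh_host_*_key.pub; do
        # Skip the unexpanded pattern when no key files match.
        [ -e "$pub" ] || continue
        # -l prints the fingerprint of the key read from the -f file.
        ssh-keygen -l -f "$pub"
    done
}

# Example: save the output, reboot, run again, and diff the two files.
# host_key_fingerprints > /root/hostkeys-before.txt
```

Identical output across reboots would point away from the server regenerating its keys.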
 

LandMineHare

I left out the portions for the non-boot pools because I figured they were inconsequential. They read the same, save for more gptids, one for each disk.

Code:
  pool: freenas-boot
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(5) for details.
  scan: scrub repaired 0B in 00:07:52 with 0 errors on Wed Jan 13 03:52:52 2021
config:

        NAME                                          STATE     READ WRITE CKSUM
        freenas-boot                                  ONLINE       0     0     0
          gptid/d8789568-55e0-11e6-b55c-0cc47a097e04  ONLINE       0     0     0

errors: No known data errors
 

Heracles

OK, so the boot pool seems to be clean.

Is it possible that you have a real security incident here and that your SSH sessions are being hijacked? Let's see...

SSH hijacking is possible only with user ID / password authentication, not with public key authentication. So go to your FreeNAS physical console and create a new SSH key pair:
ssh-keygen

Add the public key as an authorized key for your root account:
cat id_rsa.pub >> /root/.ssh/authorized_keys

Then, insert a USB key in your FreeNAS and copy the private key (by default named id_rsa) to that USB key.

Unmount the key and take it to your client station. Load the private key into your SSH client and connect to FreeNAS using it.

If it succeeds, your SSH session is clean. For that test to be relevant, you must propagate the key using an offline method like the one I described here with a USB key. Should you rely on SSH / SFTP to propagate the key, the test will not be relevant.
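
The final step, connecting with the new key and nothing else, might look like this on a client with OpenSSH. This is only a sketch: the helper name `ssh_with_key_only`, the key path, and `root@truenas.local` are placeholders, and the options simply stop ssh from falling back to a password or to another key it happens to have loaded.

```shell
# Connect using one specific private key and no other authentication method.
# Usage: ssh_with_key_only KEY_FILE USER@HOST
ssh_with_key_only() {
    key="$1"
    target="$2"
    # IdentitiesOnly: offer only the key given with -i.
    # PreferredAuthentications / PasswordAuthentication: never try a password.
    ssh -i "$key" \
        -o IdentitiesOnly=yes \
        -o PreferredAuthentications=publickey \
        -o PasswordAuthentication=no \
        "$target"
}

# Example call (placeholder path and host):
# ssh_with_key_only /path/to/id_rsa root@truenas.local
```

With these options a failed key login ends with a permission error instead of a password prompt, which keeps the result of the test unambiguous.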
 

LandMineHare

Sorry for the (very) delayed reply. I've been without a power outage since initially posting and got distracted with long hours at work, and after following your steps nothing seemed amiss.

However, we just had a minor power flicker, forcing the server to reboot, and now we're back to square one, with the ECDSA key fingerprint on the server being reset, and none of the pools being attached at launch, as well as most of the jails not being started once they were reimported, again.

So looks like I'm back to square one.
 

LandMineHare

So update. Had an unscheduled reboot while out of town, got back and when logging into the server had this error:

Failed to check for alert VolumeStatus:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/middlewared/plugins/alert.py", line 706, in __run_source
    alerts = (await alert_source.check()) or []
  File "/usr/local/lib/python3.8/site-packages/middlewared/alert/source/volume_status.py", line 31, in check
    for vdev in await self.middleware.call("pool.flatten_topology", pool["topology"]):
  File "/usr/local/lib/python3.8/site-packages/middlewared/main.py", line 1238, in call
    return await self._call(
  File "/usr/local/lib/python3.8/site-packages/middlewared/main.py", line 1206, in _call
    return await self.run_in_executor(prepared_call.executor, methodobj, *prepared_call.args)
  File "/usr/local/lib/python3.8/site-packages/middlewared/main.py", line 1110, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
  File "/usr/local/lib/python3.8/site-packages/middlewared/utils/io_thread_pool_executor.py", line 25, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.8/site-packages/middlewared/plugins/pool.py", line 438, in flatten_topology
    d = deque(sum(topology.values(), []))
AttributeError: 'NoneType' object has no attribute 'values'

Maybe this is the source?

Update: I also get an error saying it's unable to validate.

Code:
Error: [Errno 13] Permission denied: './ValidateUpdate'
 

Heracles

Hi,

I noticed this in the latest release's notes:
  • [NAS-109020] - SSH service failed to start - After Upgrade from Freenas - Hostkey missing
Maybe what you experienced here is related to this bug?
 

LandMineHare

I updated and that seems to have fixed all the issues I was having, including my pools showing as mounted but offline and not actually existing on launch/reboot. Whether or not it lasts through the next reboot, we shall see.
 