Doorbell handshake failed

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
In my experience, these tend to happen in about a four to eight hour timeframe, though I think I've seen it happen after a day or two. I would be pretty comfortable with a week as "proof" if you are also throwing some workload at it to exercise it. It isn't clear that this happens due to workload so it would be best to try both exercise and quiescent periods.
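A minimal sketch of how the soak test above could be checked automatically. It assumes you have saved the VM's console or system log as plain text. The function names, the 7-day default, and the idea of matching on the phrase "doorbell handshake failed" are illustrative assumptions; the exact driver message may differ on your system.

```python
import re
from datetime import datetime, timedelta

# Assumption: the error text contains this phrase in some capitalization.
PATTERN = re.compile(r"doorbell handshake failed", re.IGNORECASE)


def find_doorbell_errors(lines):
    """Return (line_number, text) for every log line mentioning the error."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if PATTERN.search(line)
    ]


def soak_passed(boot_time, now, error_hits, required=timedelta(days=7)):
    """True only if no errors were seen and the uptime covers the soak window."""
    return not error_hits and (now - boot_time) >= required


if __name__ == "__main__":
    # Hypothetical log path; point this at wherever you saved the log.
    with open("messages.txt", encoding="utf-8", errors="replace") as handle:
        hits = find_doorbell_errors(handle)
    for number, text in hits:
        print(f"line {number}: {text}")
```

A clean result only means something if the window included both busy and idle periods, as noted above.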
 

hernanbozzano

Dabbler
Joined
Aug 3, 2018
Messages
15
Well... in the evening nobody uses my server, so I think that might qualify as a quiescent period. During the day I have a Plex server that some family members use, and I have also been using it over the network for cloud sync tasks and so on. So far so good.
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
As a test, I reconfigured a FreeNAS 11.2-U8 virtual machine running under ESXi v6.7 (see BANDIT in 'my systems' below), increasing its memory allocation from 128GB to 192GB, and restarted it at 05:48PM yesterday. I made sure that all of its RAM was reserved (locked).

So far, so good. I'll report back after a week, or sooner if it fails with the 'Doorbell Handshake' error.
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
My BANDIT FreeNAS 11.2-U8 VM has been running for 8 days now, with no 'Doorbell Handshake' error. The only change since this error happened over a year ago is that I've replaced the 3 x LSI SAS9210-8i HBAs with SAS9217-8i cards.

The ESXi version is 6.7.0 Update 3 (Build 17700523)

My tentative hypothesis is that SAS2008-based controllers are subject to this error, while the newer SAS2308 and SAS3008 controllers are not.
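To make the hypothesis concrete, here is a small lookup sketch. The card-to-chip table covers only the cards named in this post, the "suspect" set simply encodes the tentative hypothesis above, and the names `CHIP_BY_CARD`, `SUSPECT_CHIPS` and `is_suspect` are made up for illustration. Treat it as a way to record observations, not as a confirmed compatibility list.

```python
# Controller chip used by each card mentioned in this post.
CHIP_BY_CARD = {
    "SAS9210-8i": "SAS2008",
    "SAS9217-8i": "SAS2308",
}

# Tentative hypothesis only: chips suspected of triggering the error.
SUSPECT_CHIPS = {"SAS2008"}


def is_suspect(card):
    """Return True/False per the hypothesis, or None if the card is unknown."""
    chip = CHIP_BY_CARD.get(card)
    if chip is None:
        return None
    return chip in SUSPECT_CHIPS


if __name__ == "__main__":
    for card in ("SAS9210-8i", "SAS9217-8i", "some-other-card"):
        print(card, is_suspect(card))
```

Returning `None` for unknown cards keeps "not in the table" separate from "believed unaffected".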
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Well, this really wangs chung... (obscure Futurama reference). That's not inconsistent with what I've observed, since I tend to use PERC H200's in 2U's for the connectors on the far end of the card, and those tend to be virtualized FreeNAS's.
 

ViciousXUSMC

Dabbler
Joined
May 12, 2014
Messages
49
I had this issue when I upgraded from 11.x to 12U6; I think I resolved it simply by rebooting my entire host. It's been up and working for the last several months. I decided to update from U6 to U8.1 yesterday and had no issues until overnight; today it had crashed with the same error.

Just had to reboot the entire host to get it up and working again, I guess I will keep an eye on it. If it crashes again I'll have to see about starting a thread for support.

Also, the last jail that I left in TrueNAS broke: Transmission with a built-in VPN. I used a guide to build that a long time ago; it has broken before from updates, and now it broke again. I'm working on replacing it with a standard VM and using my firewall to force VPN traffic.

So while this thread is a bit older, the issue just happened to me, and I wanted to chime in with my experience and my remedy for it.
I never had this issue on 11.x or earlier. It was new as of 12.x, and for me in particular I think 12U6.
 

Frollo

Cadet
Joined
Feb 7, 2022
Messages
8
I'm actually experiencing this as well after upgrading from 12U8 to 13. I run in a VM and hadn't seen these issues at all until I upgraded. Now 13 starts and lets me log in, and then it starts to error out with what appear to be disk read errors. If I restart the VM, I get the doorbell errors until I reboot the host. Unfortunately the boot menu doesn't give me an option to revert back to 12 :(
 

anaxagorasbc

Dabbler
Joined
Dec 3, 2021
Messages
15
I had the same issue. I went to upgrade from 12U8 to 13U2, and the system locked up within 10 minutes of being on 13U2; after a reboot I was getting the doorbell handshake error. I had to reboot the entire host, and then it happened again. I downgraded back to 12U8 and the issue seems to be gone.
 

ViciousXUSMC

Dabbler
Joined
May 12, 2014
Messages
49
Recovering from Hurricane Ian, I had an extended 5-day power outage. I'm getting everything up again, and this just happened again after a fresh boot. Shame, the old version was so damn stable for me; I don't know what happened with the newer versions to cause this.

I had held off on some of the newest updates for fear of breaking things, but perhaps now I will give it a shot and see if this was resolved.
 

De_Gutch

Cadet
Joined
Sep 19, 2021
Messages
6
This just happened to me after upgrading from 12 to 13.

Any solutions?

Well I am back on 12 and thankfully everything works fine
 

Helipil0t

Cadet
Joined
Oct 27, 2022
Messages
7
Yeah, I just went through the same thing. I was running TrueNAS/FreeNAS from version 10 all the way to 12U8 for years without issues. Then this week I upgraded to 13U2 and it locked up with the same "doorbell" issue, associated with my SAS2008 being passed through on ESXi 6.7.
Reverted back to 12U8 and am back to being stable. I'd be curious to know if anyone is having similar issues using other virtualization environments like Proxmox or XCP-NG. I know TrueNAS "should" be run on bare metal, but most of us homelabbers can't afford to do that. We're not exactly storing mission-critical data. If I lost everything, it would suck for sure, but it wouldn't be the end of the world. The important stuff is backed up. Anyway, I've been thinking about switching to Proxmox or XCP-NG. Anyone out there running TrueNAS as a VM on those platforms?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Please see the virtualization stickies. XCP-NG in particular is not known to be stable; Proxmox is stable for some people, but not others. Your best fix is probably to replace the SAS2008 controller with a SAS2308, which doesn't seem to suffer from the issue, and keep on running ESXi, which is known to work well on most hardware.
 

Helipil0t

Cadet
Joined
Oct 27, 2022
Messages
7
Fair enough. I guess I'll stay put then and look into picking up a 9207.

Thanks!
 

Helipil0t

Cadet
Joined
Oct 27, 2022
Messages
7
Just an update. I picked up an LSI 9207-8i hoping it would solve the problem. I don't get the "doorbell handshake failed" error and it boots properly, but randomly every 45 minutes or so I get an unexpected shutdown. Reverted back to 12U8 and it's as stable as ever. When I get some spare time I'll look at what firmware it has loaded. Maybe I can flash it with the latest to solve the instability in 13U3. Anyone else successfully running a 9207 on 13U? Should I consider a different card?
 

Helipil0t

Cadet
Joined
Oct 27, 2022
Messages
7
Updated to the latest firmware and got the same issue. Very unstable; it keeps shutting down. I'm going to stick to 12U8 for now. :(
 

Helipil0t

Cadet
Joined
Oct 27, 2022
Messages
7
OK! I updated ESXi from version 6.7 to version 7.0U3 and it seems to have resolved the issue. I've been stable for about 48 hours now. Will report back if anything changes.
 

Helipil0t

Cadet
Joined
Oct 27, 2022
Messages
7
Thought I had it solved, but TrueNAS is still crashing. No idea what to do. I've reverted to 12U8 again and am sticking with it. Maybe a fresh install of 13U is the answer, as opposed to an upgrade. I'm going to leave it be for the time being. If it ain't broke, don't fix it!
 

indepth

Cadet
Joined
Oct 24, 2022
Messages
1
I commend the dedication to fixing the issue. I downgraded and have been stable since. I thought about trying a different card as well, but I figured I'd just stick with this version until another brave soul is able to get it to work.
 

De_Gutch

Cadet
Joined
Sep 19, 2021
Messages
6
Sorry to hear that the 9207-8i didn't solve the problem. You saved me from buying one though, so thanks for your testing efforts.

Maybe ESXi 8 is the solution? Anyone running it already? I would like to upgrade but feel intimidated by the task; I've got the feeling that something could go wrong and I will lose all my VMs. I am not a computer expert and feel lucky to have managed to run my server without issues for the last year. :grin:
 