Dell R630 FreeNAS v9-10- Crashing on many fronts

Status
Not open for further replies.

Sakuru

Guru
Joined
Nov 20, 2015
Messages
527
If I had to guess I'd say these QLogic NetXtreme II BCM57800 NICs could be the culprit. The 2 brands we know work well with FreeBSD are Intel and Chelsio. You may have stumbled into a bug with the bxe driver. This is just a guess though.
 

Bryon Brinkmann

Explorer
Joined
Oct 7, 2016
Messages
50
If I read this right, it's crashing right when it starts creating the bridge/ifconfig on the lagg interface. Right, wrong, or complete air ball?

So even after I break the bond on the NICs, it still crashes... I'm super bummed because I wanted to use this as a replacement for some older 1950s I'm using as ESXi hosts.
 

Sakuru

Guru
Joined
Nov 20, 2015
Messages
527
Do you have any other NICs you can test with, even if they're just 1 GbE?
 

Bryon Brinkmann

Explorer
Joined
Oct 7, 2016
Messages
50
I'll poke around the office and see if I can find a card that I could test with. Some of the developers here have a lot of experience with FreeNAS and these QLogics. I hope to post some info this evening.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
You are running this on ESXi, correct?

First, let's simplify things a bit: you don't need 16 CPUs, so just change that to 2 or 4. Is the RAM locked? How do you have your NICs connected to the VM?

Lastly, what are the results when running on a bare metal machine? If people are going down the path of NICs or any other hardware, you should take ESXi out of the equation. Just my opinion. Once you can get it to work on bare metal, then try to make it work on ESXi. And what version of ESXi are you running?
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Also, when you say that you are creating a jail, you really mean that you are installing a plugin, in this case, Plex?

What happens if you just try to create a standard jail?

Additionally, you have a ton of RAM and a lot of CPU power; if you haven't already done so, you really need to test them out. I'd run MemTest86 for a week on all that RAM and a CPU stress test for maybe 20 minutes. You should also run a stress test on your hard drives/controller card by doing some lengthy "dd" data transfers (or similar), just to see how it holds up to constant use.
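
For anyone without a preferred tool, the kind of sustained transfer described above could be sketched roughly like this in Python. This is only an illustration and not a substitute for dd: the function name `write_read_verify`, the block size, and the file path are made up for the example, and the read-back may well be served from cache rather than from the disks, so treat a pass as a sanity check only.

```python
import hashlib
import os
import tempfile


def write_read_verify(path, total_bytes, block_size=1024 * 1024):
    """Write random data to `path`, read it back, and compare SHA-256 digests.

    Hypothetical helper for illustration; returns True when the data matches.
    """
    written = hashlib.sha256()
    remaining = total_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            block = os.urandom(min(block_size, remaining))
            f.write(block)
            written.update(block)
            remaining -= len(block)
        # Ask the OS to push the data to stable storage before reading back.
        f.flush()
        os.fsync(f.fileno())

    read_back = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            read_back.update(block)

    return written.hexdigest() == read_back.hexdigest()


if __name__ == "__main__":
    # Small size so the sketch runs quickly; a real test would use far more
    # data and a path on the pool being tested (both are assumptions here).
    with tempfile.TemporaryDirectory() as d:
        ok = write_read_verify(os.path.join(d, "stress.bin"), 8 * 1024 * 1024)
        print("data verified" if ok else "MISMATCH")
```

To make it a real stress test you would point the path at the pool in question and raise the size well beyond the amount of RAM, which is the same reasoning behind doing lengthy transfers with dd.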

As for the NIC, do you have LAGG configured in the VM? If you do, then you should simplify it: even on bare metal, get rid of the LAGG and see how it works.
 

Bryon Brinkmann

Explorer
Joined
Oct 7, 2016
Messages
50
No, it is not currently on ESXi; it's straight bare metal. I used ESXi 6.5 u2 to test and it all worked without an issue. Plugins, jails, LACP, etc.: no crashing. As for the CPUs, I agree it doesn't need them (WAY overkill; it's like hunting with a rocket launcher and hand grenades), but the server has those CPUs already installed. I could take out some memory, but I don't think that's the issue. Every hardware and memory test has passed without issue.
 

Bryon Brinkmann

Explorer
Joined
Oct 7, 2016
Messages
50
You are correct in stating "installing a plugin", but once the plugin tries to bond to the interface, whether LACP or a single NIC, it crashes in spectacular form. Even if I create a standalone jail, once it gets to a certain point (I believe bonding to the interface) it crashes. Once it reboots, it tries to start the jail, crashes again while starting it up, and never recovers.

I've run many hardware/memory tests (MemTest86 was one of them) without faults. I've also transferred about 2 TB+ as of now, and it's still going. No issues.

This is all bare metal, no VM... The NICs are QLogic NetXtreme II BCM57800: 2x 1 GbE and 2x 10 GbE (the 10s are not connected).
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
"No it is not currently on ESXi, straight bare metal. I used ESXi 6.5 u2 to test and it all worked without an issue. Plugin's, Jails, LACP ETC no crashing."
I assume you meant ESXi 6.0 U2.
Also, just to be clear, there was no crashing when it was used as a VM on ESXi, but there were crashes on bare metal. This is what I read in the above sentence.

Also, on bare metal with a clean install, no LACP, and a simple basic network setup, it fails when you add a jail. If this is the case, you need to submit a bug report and be very descriptive of your hardware; you can include the crash log file as well.

Good Luck Sir.
 

Bryon Brinkmann

Explorer
Joined
Oct 7, 2016
Messages
50
Correct - How would I submit a bug?

  • ESXi 6.0 U2 - No issues at all; works well.
  • Bare metal - LACP or a simple basic network fails catastrophically when adding a jail. I believe the root cause is the QLogic driver. Everything else (copying data, shares, etc.): no issues observed.
 
