Fresh virtualized TrueNAS Core-13.0-U5.3 installation keeps crashing.

Rex Raiden X

Cadet
Joined
Nov 15, 2023
Messages
4
Hello everyone, I decided to try out TrueNAS after a lot of headaches trying to get Xpenology working. I did get Xpenology to work, but it was a PITA, and I ran it until the sacrificial HDD died.

I created a new VM using "FreeBSD 12 or later versions (64-bit)" as the guest OS, allocated 8GB of reserved RAM and 20GB of storage as "Thick provisioned, eagerly zeroed", and set the LSI controller as passthrough. I added 3 NAS HDDs and, after watching a few tutorials on YouTube, moved forward with the installation which, compared to Xpenology, was a walk in the park. I then created a "test" pool with RAIDZ, created 2 test users, set up ACLs accordingly, and created an SMB share.

I stopped there and left the VM running, only to come back the next day and find it shut down. I didn't remember doing this but paid no attention, turned it back on, and forgot about it. When I checked again half a day later it was off once more, and I knew something was going on. I tested this about 6 times, and the VM seems to stay on for about 2 hours each time.

I don't know enough about this to determine what is causing the sudden shutdown. I tried Googling for a solution but came up empty. DuckDuckGo was a little more helpful, but I still don't have a solution, just a vague idea of where the logs are located. Below are the specs of my current setup. Thank you very much to everyone in advance.

Motherboard: Supermicro X9DRD-7LN4F-JBOD
CPU: Intel Xeon E5-2697 v2 2.7GHz (x2)
CPU Cooler: Noctua NH-U9DX i4 (x2)
RAM: 64GB (8x8GB) Non-ECC DDR3/DDR3L 1600MHz PC3L-12800 CL11 DIMM
Host HDD: Seagate Barracuda ST3750528AS 7200RPM 750GB SATA-300 32MB Cache
TrueNAS HDD: 4TB WD Red Plus WD40EFPX (x3)
PSU: Seasonic Focus PX-750, 750W 80+ Platinum
Case: NZXT Phantom 530
 

Scharbag

Guru
Joined
Feb 1, 2012
Messages
620
What LSI card are you using? Is the VM showing as OFF or does the console show anything? What version of ESXi are you running?
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Yea, I have a few questions as well... What VM software are you running? 8GB RAM is the absolute minimum, but I'd recommend 16GB; that is what I use for my VM. What is OFF: the computer, the VM stopped, or just non-responsive? You need to be a bit clearer in your explanation of the problem.

Have you done any burn-in testing to prove hardware stability?
 

Rex Raiden X

Scharbag said:
> What LSI card are you using? Is the VM showing as OFF or does the console show anything? What version of ESXi are you running?

1 - Integrated LSI SAS2 2308 controller
2 - The VM is off; it seems to turn off by itself.
3 - ESXi 6.7.0 Update 2 (Build 13006603)
 

Rex Raiden X

joeschmuck said:
> Yea, I have a few questions as well... What VM software are you running? 8GB RAM is the absolute minimum, but I'd recommend 16GB; that is what I use for my VM. What is OFF: the computer, the VM stopped, or just non-responsive? You need to be a bit clearer in your explanation of the problem.
>
> Have you done any burn-in testing to prove hardware stability?

I can always add more RAM, but since TrueNAS is just idling when it turns off, I don't see how that could be the problem. Also, I have not done any "burn-in testing"; I don't know what that is. To clarify, it is the VM that stopped.
 

joeschmuck

Rex Raiden X said:
> I can always add more RAM, but since TrueNAS is just idling when it turns off, I don't see how that could be the problem.

Stranger things have happened. I won't tell you for certain that it is your RAM, but for testing purposes it is easy enough to allocate another 8GB; if the same problem happens, then you know it is probably not a RAM issue.

What version of TrueNAS are you running?

While I doubt ESXi 6.7 is the issue, if it only supports FreeBSD 12, maybe that is the issue. Again, I don't think that is the case but the VM should not just stop.

Have you looked in the ESXi logs? The reason the VM stopped should be listed there. Also, are you passing through a UPS port?
 

Rex Raiden X

joeschmuck said:
> Stranger things have happened. I won't tell you for certain that it is your RAM, but for testing purposes it is easy enough to allocate another 8GB; if the same problem happens, then you know it is probably not a RAM issue.
>
> What version of TrueNAS are you running?
>
> While I doubt ESXi 6.7 is the issue, if it only supports FreeBSD 12, maybe that is the issue. Again, I don't think that is the case but the VM should not just stop.
>
> Have you looked in the ESXi logs? The reason the VM stopped should be listed there. Also, are you passing through a UPS port?

I believe I owe you an apology; you were right about the RAM. I was a little skeptical and ended up following your lead on the ESXi logs. Looking through the logs, I saw the following entry:

Event 17705 : An application (/bin/vmx) running on ESXi host has crashed (3 time(s) so far). A core file might have been created at /vmfs/volumes/6539bdec-73863b75-fe24-80615f05fb38/TrueNAS-Core/vmx-zdump.002.

Using WinSCP, I connected to the ESXi host over SSH and was about to download the dump when I saw a file named vmware-6.log; there were others numbered 1-5, which matched the number of times I had tested this VM. Looking at the end of the log, I came across the following entry:

E105: PANIC: PCI passthru device 0000:03:00.0 caused an IOMMU fault type 6 at address 0xc0000000. Powering off the virtual machine. If the problem persists please contact the device's vendor. A core file is available in "/vmfs/volumes/6539bdec-73863b75-fe24-80615f05fb38/TrueNAS-Core/vmx-zdump.001"

After Googling this, I came across a thread on the TrueNAS forums that narrows the issue down to the HBA/LSI card overheating. I purchased an infrared thermometer and measured the temperature of the integrated HBA heatsink. It was around 124°F (52°C), so, still not believing that RAM could be the issue, I purchased a 40x40x10 12V Noctua fan and fitted it to the heatsink. That dropped the temperature to 94°F, but when I started the VM it crashed just the same within a few hours (T-T). Then I added more RAM, and this darn thing has now been running for 2 days. I do not know why more RAM would change the behavior, since it was not using the full amount to begin with; right now it has 12.3GB free. If you know the reason behind this, please do tell. I am baffled and, try as I might, can't come up with an explanation.
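
For anyone hunting for the same kind of entry, here is a minimal Python sketch of the log check described above. It assumes the vmware*.log files have already been copied from the VM folder to a local directory, and its pattern is modeled only on the single PANIC line quoted in this post, so other ESXi builds may word the message differently. The function names (find_iommu_panics, scan_vm_directory) are made up for illustration.

```python
import re
from pathlib import Path

# Pattern modeled on the one PANIC line quoted in this thread (an assumption:
# other ESXi versions may phrase the message differently).
PANIC_RE = re.compile(
    r"PANIC: PCI passthru device (?P<device>[0-9a-fA-F:.]+) "
    r"caused an IOMMU fault type (?P<fault_type>\d+) "
    r"at address (?P<address>0x[0-9a-fA-F]+)"
)


def find_iommu_panics(text):
    """Return one dict per IOMMU-fault PANIC line found in the log text."""
    return [match.groupdict() for match in PANIC_RE.finditer(text)]


def scan_vm_directory(vm_dir):
    """Scan every vmware*.log in a locally copied VM folder.

    Returns {log file name: list of panic dicts} for logs with at least one hit.
    """
    results = {}
    for log_path in sorted(Path(vm_dir).glob("vmware*.log")):
        hits = find_iommu_panics(log_path.read_text(errors="replace"))
        if hits:
            results[log_path.name] = hits
    return results
```

Calling scan_vm_directory on the copied folder shows at a glance which runs ended in a passthrough fault and whether it was always the same PCI device.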
 

joeschmuck

First of all, I'm glad the problem is solved.

Second, there is a lot going on in TrueNAS: it is an OS with a lot of applications running. Check the swap usage; if the system is using anything more than a few kilobytes of swap space, you are low on RAM. If you use up all your swap space, the system crashes. And the system can crash even if you use only a little swap space. You could look into that if you desire.
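
If you want to script that swap check, here is a rough Python sketch. It assumes the usual column layout of FreeBSD's `swapinfo -k` output (Device, 1K-blocks, Used, Avail, Capacity), and the function names are hypothetical. Treat it as a starting point, not a supported TrueNAS tool.

```python
import subprocess


def parse_swapinfo(output):
    """Parse `swapinfo -k` text into {device: used KiB}.

    Assumes the columns are: Device, 1K-blocks, Used, Avail, Capacity.
    The header row and the optional "Total" row are skipped.
    """
    used = {}
    for line in output.strip().splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] in ("Device", "Total"):
            continue
        used[fields[0]] = int(fields[2])
    return used


def swap_in_use_kib(output):
    """Total swap in use, in KiB, summed over all swap devices."""
    return sum(parse_swapinfo(output).values())


def read_swapinfo():
    """Run swapinfo from the TrueNAS Core (FreeBSD) shell and return its text."""
    result = subprocess.run(
        ["swapinfo", "-k"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On the TrueNAS shell, swap_in_use_kib(read_swapinfo()) returning more than a few KiB would be the warning sign described above.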
 