Unscheduled System Reboot

MrMeier

Cadet
Joined
Dec 8, 2022
Messages
6
First off, I've looked at every post related to this issue here and on Google, but no luck.

I recently built my first NAS out of an old PC I had lying around, running TrueNAS-13.0-U3.1.

My problem is that 5-8 times a day, mostly every 3 hours now, I get an unscheduled system reboot.
I've looked through the logs and haven't seen anything weird, but then again, I'm new to all this.
The first time I hit the problem I was running an HDD as the boot drive, so I tried installing TrueNAS on an SSD instead, but still the same problem.

truenas problem.PNG


Specs:
500W PSU
Asus Sabertooth z77 motherboard
Intel i7 3770k CPU
8GB non-ECC RAM
3x 4TB Barracuda Pro from Aug 2018 as storage
1x Samsung 830 240GB SSD as boot drive

Temps are more than fine on everything: the CPU is around 30-40°C and the HDDs and SSD are around 30-35°C max.
I've got two jails, one for Transmission and one for Plex, but the problem was there, even before I made those.

Any ideas?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
8GB non-ECC RAM
I've got two jails, one for Transmission and one for Plex,
Any ideas?

Double the RAM and jettison the jails. The minimum memory required for TrueNAS is 16GB. Adding jails increases the amount of memory required even more, so if you want to run Transmission and Plex, see if you can shoot for 32GB.

Once you've done this, also return to the burn-in and testing phase to verify that you don't have some other problem.

 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
You may want to look into the BIOS settings for power management and turn all that stuff off... it can be problematic (admittedly better known as an issue with AMD CPUs/BIOSes) and is one of the things that I can imagine kicking in after a few hours.
 

MrMeier

Cadet
Joined
Dec 8, 2022
Messages
6
Double the RAM and jettison the jails. The minimum memory required for TrueNAS is 16GB. Adding jails increases the amount of memory required even more, so if you want to run Transmission and Plex, see if you can shoot for 32GB.

Once you've done this, also return to the burn-in and testing phase to verify that you don't have some other problem.

It has been doing it since the beginning, even before I installed any jails.
Closing them down doesn't change anything.
I'll still go buy some more RAM, thanks :)

You may want to look into the BIOS settings for power management and turn all that stuff off... it can be problematic (admittedly better known as an issue with AMD CPUs/BIOSes) and is one of the things that I can imagine kicking in after a few hours.

I've already looked in there, but I guess it won't hurt to just turn everything power saving related etc. off, thanks :)
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
It has been doing it since the beginning, even before I installed any jails.
Closing them down doesn't change anything.

Well, quite frankly, it isn't going to get more stable by making it work harder; it was a mistake to add jails if it already was unstable. You have to get it to a point where it is stable and reliable, and this is difficult to do while it is running jails or even the NASware itself. The problem with this kind of stuff is that you'll eventually discover that it's something kinda dumb like a BIOS setting or a failing part, but it is frustrating to debug in the meantime.
 

MrMeier

Cadet
Joined
Dec 8, 2022
Messages
6
Well, quite frankly, it isn't going to get more stable by making it work harder; it was a mistake to add jails if it already was unstable. You have to get it to a point where it is stable and reliable, and this is difficult to do while it is running jails or even the NASware itself. The problem with this kind of stuff is that you'll eventually discover that it's something kinda dumb like a BIOS setting or a failing part, but it is frustrating to debug in the meantime.

Yeah, I get what you're saying. I'll start fresh and see what I can find out.
The PC did run fine before making it a NAS, but I know that doesn't mean anything really.
 

legisilver

Dabbler
Joined
Dec 5, 2022
Messages
14
Physical:
Manufacturer: Supermicro
Model: H8SGL
CPU: 8 CPUs x AMD Opteron(tm) Processor 6320
Mem: 64GB ECC (correction enabled in BIOS)
GPU: NVidia GT240 (no, it can't play Crysis)
1 SSD 240GB (AHCI) - boot pool
1 SSD 120GB (AHCI) - boot pool
4 SATA HDD 8TB (AHCI) - Main Pool {Z2}
SB700 ONBOARD SATA Controller
OS
TrueNAS-13.0-U3.1 Core



FYI, I've been running into the SAME problem and I'm on completely different hardware. I've tested the HELL out of the hardware, too. I can't find a single thing wrong. I've disconnected ALL of the disks, ran memtestx86 for a day with ZERO errors, power supply testing is great, etc. There are zero event logs in the BIOS. I'm convinced it's 100% a software issue at this point. Anyways, last night I threw in the towel. I just don't know what to do.

Firstly, I started with TrueNAS-13.0-U3.1 on a VMware ESXi 7.0u3 host, as a VM, using the same machine listed above as the VM host. Things were running great but it was just mostly a test to learn the software. I then decided to go baremetal with the specs above and that's when the problems started. I'm seeing reboots about every 3 hours, JUST LIKE YOU.

Obviously, I can't use the box and migrate my data to it if this is going to happen. At first I thought it was happening because of a very large data load via SMB to my pool, so I stopped running my data load test set. Nope, still rebooting. I then disconnected my 8TB drives (all of my main pool) and re-installed fresh, this time using just a single SSD for the boot pool. Still rebooting.

I went through every single BIOS option I could, testing just about everything. I spent an entire day doing this! The only thing that truly came out of that was disabling the watchdog function in the BIOS and removing the jumper (per my mobo's documentation) to ensure watchdog couldn't run. Still rebooting.

I then decided to install the latest LEGACY version, TrueNAS CORE 12.0-U8.1 and then watched as the reboots seemed to happen even MORE frequently.

Like I said, I threw up my arms last night and re-installed ESXi 7.0U3 on the same box and it's been running all night while I slept (including a TrueNAS-13.0-U3.1 Core VM which hasn't rebooted once)! Why?!?!??! Why does this OS run great in a VM but on my hardware (and yours apparently), it crashes every 3 hours?

There is some kind of issue with this software because other operating systems do NOT reboot like this on my box and I'd be EXTREMELY happy to deliver any log files or core dumps to ANYONE for assistance. I've just spent a decent amount of money putting this together as a home NAS server, but with this level of rebooting I'm afraid it's just not possible to go forward with this operating system. I'm tempted to just abandon it at this point and go check out UNRAID.

HELP?
 

legisilver

Dabbler
Joined
Dec 5, 2022
Messages
14
The PC did run fine before making it a NAS, but I know that doesn't mean anything really.

Actually, that means a lot. If you simply install a new OS on a machine and it's failing constantly, and then switch back to another OS and it stops, it means it's not hardware. It's a software issue 100%.

To be clear, on my specs I've run Windows Server 2016, 2019, ESXi 6.5U5, ESXi 7U3, Linux Mint, Ubuntu Server and now TrueNAS CORE 12.0-U8.1 and TrueNAS-13.0-U3.1 Core. ONLY the TrueNAS operating systems are crashing every 3 hours (or more).

The worst part is that when I check dmesg or /var/log/messages (where else should I be looking?), there is nothing in the logs that points to a failure at all.

Yes, I see that there is an unscheduled reboot for the 100th time, but WHY is it happening? What logs can I check?
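One rough way to get more out of /var/log/messages is to pull out just the boot markers, look at the spacing between them, and read the last few lines written before each one. Below is a minimal Python sketch of that idea. It assumes FreeBSD-style syslog lines and assumes the kernel writes a `---<<BOOT>>---` line at startup; if your log shows something different on boot, pass your own marker. The function names and sample lines are made up for illustration, and since syslog timestamps carry no year, the year is a parameter as well.

```python
from datetime import datetime

# Assumption: lines look like "Dec  8 03:14:07 truenas kernel: ---<<BOOT>>---".
# The marker is an assumption too; swap in whatever your log prints first on boot.
BOOT_MARKER = "---<<BOOT>>---"


def boot_times(lines, marker=BOOT_MARKER, year=2022):
    """Return the timestamp of every line that contains the boot marker."""
    times = []
    for line in lines:
        if marker in line:
            # The first three whitespace-separated fields are month, day, time.
            stamp = " ".join(line.split()[:3])
            times.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
    return times


def reboot_intervals(times):
    """Return the gaps between consecutive boots, oldest first."""
    ordered = sorted(times)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def lines_before_each_boot(lines, marker=BOOT_MARKER, context=5):
    """Return, for each boot marker, up to `context` lines logged just before it."""
    return [
        lines[max(0, index - context):index]
        for index, line in enumerate(lines)
        if marker in line
    ]


# Hypothetical sample data, not taken from a real system.
sample = [
    "Dec  8 03:14:07 truenas kernel: ---<<BOOT>>---",
    "Dec  8 04:00:00 truenas kernel: some ordinary message",
    "Dec  8 06:15:30 truenas kernel: ---<<BOOT>>---",
]
for gap in reboot_intervals(boot_times(sample)):
    print(gap)
```

If the gaps come out nearly identical, that hints at something timer-driven rather than load-driven, and if the lines before each marker show no shutdown messages, the box most likely went down hard. Neither is proof, just a way to narrow things down.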
 

legisilver

Dabbler
Joined
Dec 5, 2022
Messages
14
To add some more, I've literally installed TrueNAS baremetal with ZERO configuration and it reboots. That's literally an installation and one single reboot after the installation with NO OTHER configuration option and I'm seeing it reboot. The most "interaction" I'm doing at this point is simply logging in to the web interface to check uptime. Result? UnScHeDuLeD RebOotz!
 

MrMeier

Cadet
Joined
Dec 8, 2022
Messages
6
Okay I seem to have "fixed" the problem, but not really.

TL;DR: I installed TrueNAS SCALE and it's been running strong for 4 days straight now.

Okay, I started by upgrading to 32GB RAM, no dice.
Reinstalled Core, no dice.
Ran every test on the hardware I could possibly imagine, no dice.

I gave up, but thought why not try SCALE, what could go wrong?
Now everything is running flawlessly, haven't had ANY reboots or anything!

So what the problem was, I guess we will never really know..
 

legisilver

Dabbler
Joined
Dec 5, 2022
Messages
14
Okay I seem to have "fixed" the problem, but not really.

TL;DR: I installed TrueNAS SCALE and it's been running strong for 4 days straight now.

Okay, I started by upgrading to 32GB RAM, no dice.
Reinstalled Core, no dice.
Ran every test on the hardware I could possibly imagine, no dice.

I gave up, but thought why not try SCALE, what could go wrong?
Now everything is running flawlessly, haven't had ANY reboots or anything!

So what the problem was, I guess we will never really know..
WHAT?!?!?!

Is there a difference in driver support or ACTUAL operating system function between these OSes, other than licensed features?
 

MrMeier

Cadet
Joined
Dec 8, 2022
Messages
6
WHAT?!?!?!

Is there a difference in driver support or ACTUAL operating system function between these OSes, other than licensed features?
No clue tbh, but it's way easier to configure apps and stuff, so I'm kinda glad I made the switch.
Try it out.
 

legisilver

Dabbler
Joined
Dec 5, 2022
Messages
14
1671209700072.png

Here's the biggest difference, it doesn't use FreeBSD! It uses DEBIAN! MAJOR DIFFERENCE! I'm downloading it now, will reply soon.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
SB700 ONBOARD SATA Controller

Ah, you know this isn't really compatible with FreeBSD, right? Search the forums for SB700.

Also, the platform itself is not good for virtualization. In order to virtualize TrueNAS, you need to be able to do reliable PCIe passthru of the disk controller. I'm pretty sure this is a nonstarter on this platform. Please see

 

legisilver

Dabbler
Joined
Dec 5, 2022
Messages
14
Ah, you know this isn't really compatible with FreeBSD, right? Search the forums for SB700.

Also, the platform itself is not good for virtualization. In order to virtualize TrueNAS, you need to be able to do reliable PCIe passthru of the disk controller. I'm pretty sure this is a nonstarter on this platform. Please see

Hey man, I hope your day is going well.

Thanks for replying. So, not sure if you saw, but my experience was that it worked in virtual just fine (although I was just testing some vmdks, not doing PCI(e) passthrough). My problem was when it was baremetal.

Now, you telling me that the SB700 isn't compatible with FreeBSD sounds like something that makes sense. Since it's the only component I can't remove and test separately (without running Core from a USB drive, which I know is frowned upon). Do you have a link to the SB700 issue? I can't find it.

With that said, I did just order a Dell H310 last night off eBay which is SUPPOSED to already be flashed to IT MODE. Will see when it gets here.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Hey man, I hope your day is going well.

Sorry I didn't see this sooner. I do try to at least scan for interesting subject lines but sometimes I miss 'em.

Thanks for replying. So, not sure if you saw, but my experience was that it worked in virtual just fine (although I was just testing some vmdks, not doing PCI(e) passthrough). My problem was when it was baremetal.

Correct. A hypervisor environment offers your virtual machine what I like to refer to as an "idealized" environment; emulated Intel ethernet cards, abstracted virtual disks fed from vmdk files. Stuff that is rock solid and hammered on by virtually every virtual machine in the world. A hypervisor guest that cannot operate in an "idealized" environment is like an automobile that can't drive on an asphalt road -- even smoother than concrete (bare metal).

Now, you telling me that the SB700 isn't compatible with FreeBSD sounds like something that makes sense. Since it's the only component I can't remove and test separately (without running Core from a USB drive, which I know is frowned upon).

It would be fine to run Core from USB for testing. The problem is the long term: it will eat through the endurance of the flash very quickly.

Do you have a link to the SB700 issue? I can't find it.

I do not have a link to a specific issue. However, if you use the search box in the upper right hand corner, and put in SB700, you will see numerous people have had problems with it. The way my sysadmin memory works is that when I see something jarring and unacceptable, somewhere in my head a vague memory is made of some of the offender's details, and over time if I see certain things - such as "SB700", or "ZFS on 4GB RAM", showing up over and over again, one day a little light goes off and associates these things, and I take a closer look to figure out what SPECIFICALLY seems to be the commonality. I am sorry if this is not as pleasant as a clearly spelled out bug report that includes maybe a hardware design defect or something solid. It's just what I have.

This is the same thing that led me to identify Supermicro/Intel platforms prior to Sandy Bridge as problematic for virtualization, because their PCIe passthru is dodgy, and to discourage use of older AMD/Opteron platforms as well, for a variety of problems such as use of bge based ethernets. But yours is a Supermicro, and they sometimes decked those out with Intel ethernets. If so, that at least is double plus good. However, PCIe passthru may still be a nonstarter, so quite likely no virtualization. Also, how the heck are you running ESXi 7 on that? Weren't the pre-2010 Opterons all deprecated in ESXi?

With that said, I did just order a Dell H310 last night off eBay which is SUPPOSED to already be flashed to IT MODE. Will see when it gets here.

That, on the other hand, has a pretty good chance of running bare metal. Make sure you look to see if there is a mainboard jumper or BIOS option to disable the SB700. Also feel free to report back on success or failure.
 

legisilver

Dabbler
Joined
Dec 5, 2022
Messages
14
Yup, will try again when the H310 card gets here, thanks.

I'm not a virtualization noob by any standards, but the intro was appreciated nonetheless. Honestly, that shows good empathy.

So, I just tried installing TrueNAS SCALE (both current and legacy) from a USB drive, and after GRUB loads and I select either option to install TrueNAS SCALE from the GRUB menu, it literally crashes and reboots my system. It never even tries to install. I've checked in the BIOS to disable Secure Boot, but I don't see that option. I did disable Secure Virtual Machine Mode, but that didn't have a positive effect. I tried installing with 3 different USB sticks, 3 different versions of Rufus, and both FAT32 and NTFS on the USB drives, nothing...
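Since every stick and every Rufus version fails the same way, one cheap thing to rule out is a bad download. Here is a minimal Python sketch for that, assuming the download page publishes a SHA-256 checksum for the ISO. The helper names are made up for illustration, and the file name and checksum in the commented usage line are placeholders, not real values.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so a multi-gigabyte ISO never has to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare the file's SHA-256 against a published value, ignoring case and whitespace."""
    return sha256_of(path) == published_hex.strip().lower()


# Hypothetical usage; both arguments are placeholders:
# matches_published("TrueNAS-SCALE.iso", "<checksum from the download page>")
```

A mismatch would point at the download itself. A match only says the image is intact, so the problem would then be in how the stick was written or in the hardware.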

Getting pretty discouraged...

[mod note: slightly edited for privacy -JG]
 
Last edited by a moderator:

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I'm not a virtualization noob by any standards, but the intro was appreciated nonetheless. Honestly, that shows good empathy.

Strongly advise you change your username to something less look-up-able unless that's really what you want to happen. Let me know if you need help and I'm happy to approve the change. It's in your user profile settings.

I'm enough of a troublemaker that I run around with "J Greco" on my namebadge at conferences but many people call me by my first name anyways. Too late to remain anonymous. :smile:

In any case, I usually don't have a lot of time to try to guess the virtualization expertise level of users. I just assume they need help, which is usually true.

So, I just tried installing TrueNAS SCALE

I'm going to leave that for someone who knows about Linux boot. I'm a FreeBSD'er so what I could tell you is probably not applicable.
 