Random reboots

Status
Not open for further replies.

Luc

Cadet
Joined
Aug 31, 2014
Messages
5
I'm having a strange problem that might be specific to my hardware, as I haven't been able to find similar issues that weren't just "bad hardware". I've got 6 Dell DRSX-13PE40 servers (dual Intel Xeon quad-core L5410 2.33GHz, 8GB DDR2). I ran FreeNAS 9.1.1 for about a year on one of them without issue. After upgrading to 9.2.1, the system reboots randomly. I've now tried 4 of the servers, 2 of which I purchased specifically to replace the old one, thinking a hardware problem must be responsible, but the behavior is identical on all 4 systems. 2 of them are new (well, refurbished), and 1 of them runs Xen server without issue. I've run memtests as well, with no issues reported.

syslog shows nothing of consequence, and I've never seen the screen while this happens. Worse, I can't downgrade to 9.1.1 because the ZFS pool won't import in 9.1.1 anymore. If I can't resolve the issue, I'll be forced to install a fresh 9.1.1, re-create my pool and transfer the data, but I'd rather keep the latest version of FreeNAS. Starting over on 9.2.1 doesn't help, because even with a new ZFS pool, 9.2.1 always randomly reboots.

I'm now running 9.2.1.7 (64 bit) and the problem persists. I haven't tried 9.2.0 yet; I may try that in lieu of migrating to a new pool.

Any help would be appreciated.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
Are you saying that you upgraded your ZFS pools (something you have to do manually)? If you did then yes, you might not be able to go back to 9.1.1. As a word of advice, never upgrade your pools unless you know what benefit you will gain. In general there is nothing to gain over a V28 pool unless you have specific needs for advanced functions and since you have only 8GB of RAM, I don't see you falling into that category.
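If you want to confirm what version the pool is at, these read-only commands should show it. This is only a sketch: `tank` is a placeholder pool name, and the `run` wrapper is a made-up helper that just prints each command unless you set `DRY_RUN=0`.

```shell
#!/bin/sh
# Sketch only: "tank" is a placeholder pool name, not the poster's real pool.
POOL="${POOL:-tank}"

# Hypothetical helper: prints the command instead of running it
# unless DRY_RUN=0 is set in the environment.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# On-disk pool version: a number such as 28 for a legacy pool,
# or "-" once the pool uses feature flags.
run zpool get version "$POOL"

# With no arguments this only reports pools that are not at the
# newest format; it does not upgrade anything.
run zpool upgrade

# Lists the versions and feature flags the running release supports.
run zpool upgrade -v
```

None of these modify the pool; the one to stay away from is `zpool upgrade` followed by a pool name or `-a`.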

One possible solution is to back up your data, destroy the pool, restore FreeNAS 9.1.1, recreate the pool, and restore your data. I just don't know how many drives you have or the size of your data.

It is odd for the same random reboots to show up on multiple systems, so I would suspect a driver issue, which is not something you can easily fix.

How about giving us a full rundown of your hardware setup, and maybe the output of dmesg?
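If it helps, something like this would gather the usual suspects into text files you can attach. Treat it as a rough sketch: the output directory `/tmp/nas-diag`, the pool name `tank`, and the `capture` helper are placeholders made up for illustration, not anything FreeNAS ships.

```shell
#!/bin/sh
# Sketch: collect diagnostic output into text files for attaching to a post.
# OUT and POOL are placeholders; adjust them to your system.
OUT="${OUT:-/tmp/nas-diag}"
POOL="${POOL:-tank}"
mkdir -p "$OUT"

# Hypothetical helper: save a command's stdout and stderr to a file.
# If the command is not installed, note that in the file instead of aborting.
capture() {
    file="$1"; shift
    if command -v "$1" >/dev/null 2>&1; then
        "$@" > "$OUT/$file" 2>&1
    else
        echo "$1: not available on this system" > "$OUT/$file"
    fi
}

capture dmesg.txt dmesg                        # kernel boot and driver messages
capture pciconf.txt pciconf -lv                # PCI devices with vendor/device names
capture zpool.status.txt zpool status "$POOL"  # pool layout and error counters

ls -l "$OUT"
```

Run it as root from the console or an SSH session, then attach the three files.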
 

Luc

Cadet
Joined
Aug 31, 2014
Messages
5
I didn't upgrade the pool; the ZFS pool issue is where my problems started. After a power failure, the FreeNAS box just hung on the "mounting local disks" message (or whatever it was, I forget the exact message, but it's where it imports the pool). I assumed my pool was corrupted, and I couldn't issue the proper commands to repair the pool with that version of ZFS, so I booted into Ubuntu, I think it was, and the pool imported fine. Diagnostics were all clear. That's when I tried 9.2.1, and the pool works fine there too, but I still can't go back to 9.1.1. The pool is still running the older version; I never upgraded it.

The reboot issue, however, started before this happened, on another server I was testing with. I had enabled encryption on that system, so I thought the reboots were related to that, but I've since ruled that out because the problem persists across installations and physical servers. If I have to, I'll do as you suggest, but I'd like to use encryption on another server too, so it would be nice to get 9.2.1.x working.

Attached are the outputs of dmesg, zpool status <poolname> and pciconf -lv.

The server exports NFS for a shared mail server and iSCSI for Xen VMs. I don't think it's related to those services though, because the encrypted test system I was running didn't have any services enabled at all; I was only using SCP/SFTP to transfer files.

Thanks
 

Attachments

  • pciconf.txt
    6.9 KB · Views: 338
  • zpool.status.txt
    928 bytes · Views: 312
  • dmesg.txt
    10.9 KB · Views: 358

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
So your problems are more than just a single issue. Let's break them down first...

1. Your computer(s) reboot when you run 9.2.x so I'd submit a Bug Report (located at the top of the screen) and list as much information as possible. If you can keep one system available for troubleshooting that would be good for your problem.

2. Your pool is likely somehow corrupt, which is beyond my expertise. My advice here, which will likely be a time saver overall, is to do the following:
a. You have three 500GB drives in a RAIDZ1 configuration which means you have a small amount of data compared to most folks you see on this forum. Since you said you can access the data, back it up.
b. Destroy your pool.
c. Reload FreeNAS 9.1.1
d. Create your pool again using the same pool name. Whether you use encryption is up to you.
e. Load your FreeNAS 9.1.1 configuration file you saved when you created your 9.1.1 system last year, assuming you have it, otherwise you will need to manually setup your system again.
f. Copy your files back. And you should be where you started before the problem started.

So you unexpectedly lost power... can I assume you do not have a UPS connected to your machine?

Good Luck.
 

Luc

Cadet
Joined
Aug 31, 2014
Messages
5
I do have a UPS, but I didn't switch the generator on in time...

I will submit a bug report. What is the best way to transfer the data? For the files hosted on NFS it's pretty straightforward, but for the iSCSI containers with VM disk images I'm not sure... dd over ssh?

There must be a better way... maybe using zfs replication?
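Something along these lines is what I have in mind. It's an untested sketch: `tank`, `backup`, `root@newnas` and the snapshot name `migrate` are placeholders, and it assumes the iSCSI extents live on the pool as zvols or as files inside datasets, so a recursive snapshot picks them up along with the NFS data. It only prints the commands for review.

```shell
#!/bin/sh
# Untested sketch: every name below is a placeholder, not a real host or pool.
SRC_POOL="${SRC_POOL:-tank}"
DST_POOL="${DST_POOL:-backup}"
DST_HOST="${DST_HOST:-root@newnas}"
SNAP="${SNAP:-migrate}"

# A recursive snapshot covers every dataset and zvol in the pool.
snapshot_cmd() {
    echo "zfs snapshot -r ${SRC_POOL}@${SNAP}"
}

# send -R: replication stream with all descendants and their properties.
# receive -F: allow overwriting the (assumed empty) destination pool root.
# receive -d: drop the source pool name from received dataset paths.
# receive -u: do not mount the received file systems.
replicate_cmd() {
    echo "zfs send -R ${SRC_POOL}@${SNAP} | ssh ${DST_HOST} zfs receive -F -d -u ${DST_POOL}"
}

# Printed for review rather than executed.
snapshot_cmd
replicate_cmd
```

The VMs would need to be shut down before taking the snapshot so the disk images are consistent.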
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
If you have a UPS then you didn't set it up properly in FreeNAS. Ideally the UPS should only need to provide enough power to allow the FreeNAS box to shut down on its own.
 

Luc

Cadet
Joined
Aug 31, 2014
Messages
5
Ideally, yes. I've since purchased new ones so the NAS is the only thing attached, and I set it up in FreeNAS, but when this happened that wasn't the case.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
I've never used iSCSI but whatever method you have would be fine.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
I see you are now getting some attention on this problem. Hope it's something easy to figure out.
 