System crashing/hanging anytime I try to do a scrub.

Status
Not open for further replies.

Aflac_Attack

Cadet
Joined
Apr 21, 2018
Messages
8
So I have a recently installed, pretty vanilla build of FreeNAS 11.1-U4 that I'm trying to use as a file/media server in a Windows environment. It's a 3-drive RAID-Z (encrypted) that I set up with a couple of user accounts/groups and SMB shares, and that's it. I set up the volume/datasets and permissions and got it working as a networked drive. I transferred a couple of terabytes of stuff over to it while I was still in the testing phase, and it seemed to be working perfectly for a couple of days.
However, I discovered that whenever I try to do a scrub on the array, it will eventually crash/hang/hard-lock. When it happens seems completely random, too: it could be running for 5 minutes or 3 hours, but eventually the system becomes completely unresponsive. The connected display will either freeze on the FreeNAS main settings window (the 1-12 thing, don't know what it's called) or just go blank. I can't access the GUI, and when I ping the box it's no longer on the network. I keep a CMD prompt on my PC's second monitor pinging it constantly so I can monitor it, and the pings will be fine for a while until I start getting "Destination host unreachable" or "Request timed out". The system seems to crash at the same moment it loses the network connection. A power off/reset is the only way to get it back.
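For anyone who wants to pin down the exact moment the box drops off the network, here is a minimal sketch in Python of the ping watch described above. The function names are made up for illustration, and the failure strings are simply the two Windows ping messages quoted above. It is a rough helper, not a proper monitoring tool.

```python
import datetime

# Failure messages as printed by Windows ping (the two quoted in this post).
FAILURE_MARKERS = ("Destination host unreachable", "Request timed out")


def classify_ping_line(line):
    """Return 'fail', 'ok', or 'other' for one line of ping output."""
    # Check failures first: Windows can print
    # "Reply from <gateway>: Destination host unreachable."
    if any(marker in line for marker in FAILURE_MARKERS):
        return "fail"
    if "Reply from" in line:
        return "ok"
    return "other"


def first_failure(lines, clock=datetime.datetime.now):
    """Return (timestamp, line) for the first failed ping, or None.

    `lines` can be any iterable of text lines, including a live stream,
    so the timestamp is taken at the moment the failing line is read.
    """
    for line in lines:
        if classify_ping_line(line) == "fail":
            return clock(), line.strip()
    return None
```

Fed the output of a continuous ping (for example `ping -t` on Windows), `first_failure` returns a timestamp that can be compared against whatever the FreeNAS console or logs last recorded before the hang.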

Tried all manner of BIOS tweaks: turning all the power-saving stuff off, turning down RAM speeds, turning off ECC, disabling virtualization stuff, etc. There is no overclock running.
CPU temperatures and voltages are not a problem. SMART status is fine.


Don't know if it's related but the SMART process seems to be failing to initialize on the boot USB
  • "freenas smartd [2614]: Unable to register device /dev/da0 (no Directive -d removable). Exiting."
Also, probably not relevant but the New GUI is failing to monitor my hardware correctly so some of the graphs will be blank. Upon opening the New GUI I get:
  • "Error 22:rrdtool failed: ERROR: opening '/var/db/collectd/rrd/localhost//aggregation-cpu-sum/cpu-user.rrd': No such file or directory"
  • "Error 201:[ENOMETHOD] Method "summary" not found in "network.general""
I can't figure it out. It can be fine for days with no problems, but as soon as I attempt a scrub it shits itself. Now I'm new to FreeNAS (and FreeBSD) and know next to nothing about the "Scrub" process, but the only thing I can think of is that during the HDD/RAM-intensive operations something is messing up and locking the system.

Components:
  • Ryzen R5 1600
  • ASRock X370 Taichi (Latest BIOS)
  • 16 GB RAM (ECC) (2x 8 GB sticks) (KVR24E17S8/8)
  • 3x WD 10 TB Red Pro
  • 1x 500 GB Samsung 850 Evo for VMs eventually (Currently not used)
  • 1x 8 GB USB 3.0 drive for boot
  • GTX 9800 GT (for display purposes)
  • Seasonic Focus Plus 550W PSU (SSR-550FX)
  • Cyberpower 1350 VA UPS (Never seen the load go over 120W)
All the components are more or less brand new (except for the GPU ofc), so a hardware fault is unlikely, though possible. The only hardware thing I think it could be is the RAM. It's not actually on ASRock's QVL, but it has been working fine in both ECC and non-ECC mode, and I ran MemTest86 for a long time with no problems. Other than some dumb shit like maybe a bad PSU HDD power cable or some random physical fault, I don't think it's a hardware issue. I took it all apart yesterday and rebuilt it from scratch just to make sure it's not getting a random short, a bad CPU seating, or whatever. (I've tried it without the GPU installed and get the same problem, so it's not that either.)

Anyone have any ideas or experience with this problem? Any tips or suggestions on how to fix this? Thanks.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Anyone have any ideas
It is a shame that you built a gaming system and then tried to run FreeNAS on it. You should have come to find out what hardware to use first:

FreeNAS® Quick Hardware Guide
https://forums.freenas.org/index.php?resources/freenas®-quick-hardware-guide.7/

Hardware Recommendations Guide (Rev 1e) 2017-05-06
https://forums.freenas.org/index.php?resources/hardware-recommendations-guide.12/

Ryzen is too new and still has too many incompatibilities, on top of which you have a gaming board with built-in WiFi and sound. It is difficult to tell which piece of hardware might be causing the fault.
Don't know if it's related but the SMART process seems to be failing to initialize on the boot USB
  • "freenas smartd [2614]: Unable to register device /dev/da0 (no Directive -d removable). Exiting."
The SMART service won't run with this error. You have to disable SMART for that drive.
 

garm

Wizard
Joined
Aug 19, 2017
Messages
1,556
Well, looking at Ryzen boards, with so many builds with issues showing up, I can't say I've found any gear I would trust.
 

MrToddsFriends

Documentation Browser
Joined
Jan 12, 2015
Messages
1,338
Now I'm new to FreeNAS (and FreeBSD) and know next to nothing about the "Scrub" process, but the only thing I can think of is that during the HDD/RAM-intensive operations something is messing up and locking the system.

You could try running other I/O-intensive applications and watch what happens. Some examples would be solnet-array-test and badblocks, as used in disk burn-in testing.

https://forums.freenas.org/index.php?resources/hard-drive-burn-in-testing.92/
https://forums.freenas.org/index.ph...for-freenas-scripts-including-disk-burnin.28/
https://forums.freenas.org/index.php?resources/solnet-array-test.1/

Remember that badblocks, when used with standard parameters, is destructive, so you would need one or more disks without valuable data to run it. solnet-array-test, OTOH, can be run safely on live pools.

And don't forget that spontaneous halts/reboots might put your data at risk even if you do run only non-destructive tests. So always keep your backup up to date.
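If you just want some sustained, non-destructive read load before setting up the tools above, a minimal sketch in Python could look like the following. This is not solnet-array-test and it is not equivalent to a scrub (it only reads file data through the filesystem, and cached data may never touch the disks); the function name and the example path are hypothetical.

```python
import os
import zlib


def read_everything(root, chunk_size=1 << 20):
    """Read every regular file under root once, read-only.

    Returns (file_count, bytes_read, crc32) so that two runs over
    unchanged data can be compared.
    """
    files = total = 0
    crc = 0
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # fixed traversal order keeps the CRC repeatable
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue
            with open(path, "rb") as handle:  # opened read-only, never written
                while True:
                    chunk = handle.read(chunk_size)
                    if not chunk:
                        break
                    crc = zlib.crc32(chunk, crc)
                    total += len(chunk)
            files += 1
    return files, total, crc
```

Something like `read_everything("/mnt/tank/media")` (path made up here) would then keep the disks busy reading for as long as it takes to walk the dataset. If the box hangs under this load as well, the scrub itself is probably not the trigger.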
 

Aflac_Attack

Cadet
Joined
Apr 21, 2018
Messages
8
Chris Moore:
First of all, it's not a gaming system; it's a file/media server that will eventually do some light VM stuff. FreeNAS has no law stating you have to use "server grade" equipment. And the reason I went with AMD over Intel is that for the same price I got 6 cores/12 threads vs. 4 cores/4 threads and more modern motherboard functionality. And I was still able to utilize ECC memory. It being Ryzen and having "built in WiFi and sound" has nothing to do with the issue I'm having. And if I didn't clarify enough before, all non-essential functions have been turned off to reduce the variables: audio, WiFi, power saving, virtualization, etc.

MrToddsFriends:
I've already run I/O-intensive burn-ins/stress tests in a separate Windows build using the same hardware with no problems. Voltages, temps, and stability were all acceptable.
I've already transferred 9+ TB over to the SMB share/dataset, so I really don't want to copy all that data back to my other PC (over only 1 Gb) and delete it just to run these destructive tests, which aren't going to tell me anything I don't already know. I appreciate the suggestions, though.
 

MrToddsFriends

Documentation Browser
Joined
Jan 12, 2015
Messages
1,338
I've already run I/O intensive burn-ins/stress tests in a separate Windows build using the same hardware with no problems.

I would be very surprised if Windows and FreeNAS relied on the same drivers for the SATA stuff built into your board.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
First of all it's not a gaming system
It is using a Gaming System Board. Look at this, this is a server board:
https://www.ebay.com/itm/NEW-Unopened-Super-Micro-Motherboard-LGA-2011-MBD-X9SRL-F-O/253300204890
This is a Server CPU:
https://www.ebay.com/itm/SR1A6-Inte...-Core-CM8063501374901-Processor-/192451846800
If you used good components like these, you would not have any problems. Voice of experience.

It also has integrated video (not in the CPU, in the system board) so no video card required.
 