Hey guys,
I've got a several-year-old FreeNAS server (an ASRock C2550D4I with 8 GB of ECC memory and 2x5 TB plus 2x3 TB WD Red NAS hard drives) that has been causing me trouble. It's fine at boot, but if I leave it alone for a while (a day, sometimes only a few hours) it tends to freeze and need a reboot. When that happens, I can't reach the web UI or even ssh in.
It wasn't a priority until recently, when I finally had some time (the California fires closed my work for a week, so side projects moved to the top of the pile). When I connect a monitor to it while it's frozen, I get this. From what I've gleaned online from others with similar errors, it seems like FreeNAS is trying to page in from a drive that is throwing errors?
Here's zpool status (note that I just upgraded from FreeNAS 9 to 11 and haven't upgraded my pools yet). A scrub is still running, but I haven't seen anything amiss so far.
    zpool status
      pool: freenas-boot
     state: ONLINE
      scan: scrub repaired 0 in 0 days 00:10:43 with 0 errors on Fri Nov 16 03:55:43 2018
    config:

        NAME                                          STATE     READ WRITE CKSUM
        freenas-boot                                  ONLINE       0     0     0
          gptid/46a3905f-8166-11e4-b1b4-d050992dfacc  ONLINE       0     0     0

    errors: No known data errors

      pool: plex
     state: ONLINE
    status: Some supported features are not enabled on the pool. The pool can
            still be used, but some features are unavailable.
    action: Enable all features using 'zpool upgrade'. Once this is done,
            the pool may no longer be accessible by software that does not support
            the features. See zpool-features(7) for details.
      scan: scrub in progress since Fri Nov 16 14:49:42 2018
            2.52T scanned at 309M/s, 2.18T issued at 79.1M/s, 2.91T total
            0 repaired, 75.02% done, 0 days 02:40:35 to go
    config:

        NAME                                            STATE     READ WRITE CKSUM
        plex                                            ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/29ba2f2f-82f2-11e4-8cac-d050992dfacc  ONLINE       0     0     0
            gptid/2a3ba454-82f2-11e4-8cac-d050992dfacc  ONLINE       0     0     0
          mirror-1                                      ONLINE       0     0     0
            gptid/1492339f-ae78-11e4-984d-d050992dfacc  ONLINE       0     0     0
            gptid/154e3a7b-ae78-11e4-984d-d050992dfacc  ONLINE       0     0     0

    errors: No known data errors
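In case it helps anyone reading the table above, here is a rough Python sketch of how the READ/WRITE/CKSUM columns could be checked automatically instead of by eye. The function name `find_problem_vdevs` is something I made up for illustration, and the parsing assumes the config table looks like the output pasted above (a `NAME STATE READ WRITE CKSUM` header followed by one device per line); it is not an official FreeNAS tool.

```python
# Sketch: flag any pool/vdev/disk line in `zpool status` output whose state
# is not ONLINE or whose READ/WRITE/CKSUM counters are not all "0".
# Assumption: the table format matches the output pasted in this post.

SAMPLE = """\
config:
NAME STATE READ WRITE CKSUM
plex ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
gptid/29ba2f2f-82f2-11e4-8cac-d050992dfacc ONLINE 0 0 0
gptid/2a3ba454-82f2-11e4-8cac-d050992dfacc ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
gptid/1492339f-ae78-11e4-984d-d050992dfacc ONLINE 0 0 0
gptid/154e3a7b-ae78-11e4-984d-d050992dfacc ONLINE 0 0 0
errors: No known data errors
"""


def find_problem_vdevs(status_text):
    """Return (name, state, read, write, cksum) tuples for suspicious rows."""
    problems = []
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_table = True          # header row: device rows follow
            continue
        if not in_table:
            continue
        # A blank line or a "label:" line (e.g. "errors:") ends the table.
        if not fields or fields[0].endswith(":"):
            in_table = False
            continue
        if len(fields) < 5:
            continue
        name, state = fields[0], fields[1]
        counters = fields[2:5]
        # Counters are compared as strings, so abbreviated values count too.
        if state != "ONLINE" or any(c != "0" for c in counters):
            problems.append((name, state, *counters))
    return problems


if __name__ == "__main__":
    print(find_problem_vdevs(SAMPLE))
```

For the plex pool as pasted, nothing gets flagged, which matches what I see by eye: ZFS itself has not recorded any read, write, or checksum errors so far.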
The output of smartctl -x on the four drives is here. One thing that might point to the problem: when I ran the command on /dev/ada2, the machine froze for a while before eventually reporting:
> Read SCT Status failed: Input/output error
> SCT (Get) Error Recovery Control command failed
While it was hanging, error messages saying "swap_pager: indefinite wait buffer: bufobj: 0, blkno: 5021, size: 20480" came up on the screen. Once the wait was over, though, everything seemed to recover fine.
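As far as I understand it, FreeBSD prints that message when a read from swap has been outstanding for a long time, so the numbers in it describe the stalled request rather than a fault by themselves. Here is a small Python sketch that pulls the fields out of the message; the helper name `parse_swap_pager` is my own, and the 4096-byte page size is an assumption (the usual value on amd64).

```python
import re

# Assumption: 4 KiB pages, the usual page size on amd64 FreeBSD.
PAGE_SIZE = 4096

# Matches the console message quoted in this post. The block-number field
# name is matched loosely (\w+) in case it was transcribed differently.
PATTERN = re.compile(
    r"swap_pager: indefinite wait buffer: "
    r"bufobj: (\d+), (\w+): (\d+), size: (\d+)"
)


def parse_swap_pager(line):
    """Return the fields of a swap_pager wait message, or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    size = int(match.group(4))
    return {
        "bufobj": int(match.group(1)),
        "block": int(match.group(3)),
        "size": size,
        "pages": size // PAGE_SIZE,  # how many pages the stalled read covers
    }


if __name__ == "__main__":
    msg = "swap_pager: indefinite wait buffer: bufobj: 0, blkno: 5021, size: 20480"
    print(parse_swap_pager(msg))
```

Under that page-size assumption, the 20480-byte request is a five-page read from swap, which fits the theory that the system was waiting on whichever disk holds that swap partition.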
So! What should I do? Is /dev/ada2 dead or dying? Is there a way I can diagnose the problem further? Is 8 GB of memory too little? Should I try using this script to page in periodically?
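One diagnostic idea I've seen suggested is to look at a handful of SMART attributes that tend to go nonzero on a failing disk. Below is a rough Python sketch of that check against the attribute table printed by `smartctl -A`. To be clear, the sample table in the code is made up for illustration and is NOT from my drives, the list of watched attributes is my own guess at the commonly cited ones, and `nonzero_watched` is a hypothetical helper name.

```python
# Attributes commonly cited as early warnings of a failing disk.
# Assumption: this particular selection is illustrative, not authoritative.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
}

# Made-up sample in the layout of `smartctl -A`; values are invented.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   055   055   000    Old_age   Always       -       33000
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
"""


def nonzero_watched(attr_table):
    """Return {attribute_name: raw_value} for watched attributes above zero."""
    found = {}
    for line in attr_table.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in WATCHED and raw.isdigit() and int(raw) > 0:
            found[name] = int(raw)
    return found


if __name__ == "__main__":
    print(nonzero_watched(SAMPLE))
```

If something like this flagged pending or reallocated sectors on /dev/ada2 but not on the other three drives, I'd take that as a strong hint about which disk to replace, though I'd welcome corrections on which attributes actually matter for WD Reds.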
The other possibility I was wondering about is whether the USB stick I have FreeNAS installed on could be dying. Is there any way to check that?
Thanks for any advice!