System freezes occasionally. I/O error on SATA DOM?

Status
Not open for further replies.
Joined
Sep 5, 2017
Messages
8
System specs:
Build: FreeNAS-11.1-U4
Platform: Intel Xeon CPU D-1541 @ 2.1GHz
RAM: 32627 MB

Disks:
1 x SuperMicro 128 GB SATA DOM (boot and system)
2 x Seagate 10GB SATA HDDs (RAID1)
1 x 5100 PRO M.2 480GB SATA (for read cache)

Problems started last month. I found the web GUI inaccessible, although I could still ping the server and SSH into it. However, my shell would freeze whenever I tried to sudo or su to become root. I was able to run zpool status, which gave the following output for the pool on my SATA DOM:

Code:
$ zpool status -v freenas-boot
pool: freenas-boot
state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'
see: http://illumos.org/msg/ZFS-800-HC
scan: scrub repaired 0 in 0 days 00:00:04 with 0 errors on Wed Mar 14 03:45:04 2018
config: 

NAME                   STATE     READ WRITE CKSUM
freenas-boot           UNAVAIL      0     0     0
  1504633393467200437  REMOVED      0     0     0

errors: List of errors unavailable (insufficient privileges)


After this happened, I rebooted the server because I couldn't get anything done without root privileges. The SATA DOM then showed up fine in zpool status. While it was behaving, I upgraded to FreeNAS 11.1 from 11.0U4. However, there still seem to be issues. On some days I notice that the web GUI disconnects and that the server responds to pings but cannot be reached over SSH. When I plug into the console, I see the system constantly streaming this line:

(noperiph:ahcich1:0:0:0): xpt_scan_lun: can't allocate CCB, can't continue


Occasionally, it'll have a line saying:

Interrupt storm detected on "irq301": throttling interrupt source


irq301 corresponds to:

$ vmstat -i | grep irq301
irq301: ahci0 129025 18


And in dmesg, ahci0 corresponds to:

ahci0: <Intel Lynx Point AHCI SATA controller> port 0xf070-0xf077,0xf060-0xf063,0xf050-0xf057,0xf040-0xf043,0xf020-0xf03f mem 0xfb412000-0xfb4127ff irq 16 at device 31.2 on pci1
ahci0: AHCI v1.30 with 6 6Gbps ports, Port Multiplier not supported
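To check whether ahci0 is still the noisiest interrupt source the next time this happens, the vmstat -i output can be ranked by its rate column. This is only a minimal sketch: the helper name top_irqs is made up for illustration, and it assumes the usual FreeBSD layout where the rate is the last field on each line.

```shell
# Rank interrupt sources by rate (last column of `vmstat -i`), highest first.
# Reads `vmstat -i` output on stdin; the header and "Total" lines are skipped.
# Optional argument: how many lines to show (default 5).
top_irqs() {
    awk '$1 != "interrupt" && $1 != "Total" && $NF ~ /^[0-9]+$/ { print $NF "\t" $0 }' \
        | sort -rn -k1,1 \
        | cut -f2- \
        | head -n "${1:-5}"
}

# Usage on the affected host (not executed here):
#   vmstat -i | top_irqs 3
```

If irq301 keeps coming out on top while the console is streaming errors, that at least ties the storm to the AHCI controller and whatever is attached to it, though it does not by itself distinguish a bad controller from a bad device.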


With that, I'm not sure if there's an issue with my SATA controller or my SATA DOM. Any hints on how to dig deeper and get a confirmed hardware error?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Most likely the DOM. Those things are hit and miss.

If the thing doesn't support SMART, you'll just have to try some other SATA device and see if it works.
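A quick way to find out whether the DOM reports SMART at all is to look at the identity section printed by smartctl -i. The sketch below is hedged: smart_available is a hypothetical helper name, and /dev/ada0 is only an assumed device node, so check camcontrol devlist for the real one first.

```shell
# Succeeds (exit 0) if `smartctl -i` output, read on stdin, reports that the
# device has SMART capability; fails otherwise.
smart_available() {
    grep -q '^SMART support is:[[:space:]]*Available'
}

# Usage on the affected host (not executed here; /dev/ada0 is an assumption):
#   camcontrol devlist                      # find which adaN is the SATA DOM
#   if smartctl -i /dev/ada0 | smart_available; then
#       smartctl -a /dev/ada0               # dump attributes and error log
#   else
#       echo "No SMART support reported; try a different SATA device"
#   fi
```

If SMART is available, the error log and attribute dump from smartctl -a may point at the DOM directly; if not, swapping in another SATA boot device remains the practical test.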
 