System Crashes Weekly - No GUI or SSH, but ping will reply. Server is headless w/ no video out.

arankaspar1

Dabbler
Joined
Apr 7, 2020
Messages
25
I was on 11.3-U4.1 (updating now) and this started about 2 weeks ago. When I reboot, the server comes up and works fine... like nothing happened. :rolleyes: (See the crash-dump check sketched after the bullets.)
  • The latest change was adding an Intel 10Gb CNA a month ago, but the logs don't reference any issue with it.
  • I already swapped out all the RAM; it crashed again. And no, it's not ECC RAM, for those of you who need to rant today.
  • Two pools host SMB/NFS, plus one iSCSI volume.
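Since the box still answers ping but GUI/SSH disappear, it's worth checking whether the kernel actually panicked or the machine just hung. A minimal check, assuming the stock FreeBSD savecore setup (FreeNAS usually points the dump directory at /data/crash; both paths below are assumptions, adjust to your config):

# Is a dump device configured at all?
grep -i dump /etc/rc.conf /etc/rc.conf.local 2>/dev/null
# Did savecore capture anything after the last crash?
ls -l /data/crash /var/crash 2>/dev/null
# If a vmcore.N is present, summarize the panic (crashinfo ships in the FreeBSD base system)
crashinfo /data/crash/vmcore.0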

The logs reference da6 GONE!!! - which is a Samsung 850 EVO - and the same line also references one of my WD drive serials... anyone know why?
Apr 7 16:25:17 NAS arcmsr_dr_handle: Target=0, lun=6, GONE!!!
Apr 7 16:25:17 NAS da6 at arcmsr0 bus 0 scbus0 target 0 lun 6
Apr 7 16:25:17 NAS da6: <SAMSUNG SSD PM851 m R001> s/n WD-WCAVY7367596 detached

  • I have (3) SSDs in a "RAID5" - yes, they are all different capacities... - and (4) WD 2TBs in mirrors.
  • I have never been able to turn on the S.M.A.R.T. monitoring service. It fails to start in the GUI, saying there's no config file. I tried making a config file once, to no avail. (A manual smartctl check is sketched below.)
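Even with the SMART service refusing to start, smartctl can usually talk to disks behind an Areca controller directly. A sketch only - it assumes smartmontools' Areca support and that the controller shows up as /dev/arcmsr0; the slot numbers are guesses, so walk through them:

# Address each disk through the controller device; "areca,N" is the slot number on the card
smartctl -i -d areca,1 /dev/arcmsr0    # identity/model/serial for the disk in slot 1
smartctl -a -d areca,7 /dev/arcmsr0    # full SMART report for slot 7 (whichever slot da6 maps to)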

The system and logs...

RAID Controller, all devices in JBOD
07:00.0 RAID bus controller: Areca Technology Corp. ARC-188x series PCIe 2.0/3.0 to SAS/SATA 6/12Gb RAID Controller (rev 05)

I had an LSI 9220-8i that worked beautifully, until I plugged it in while the PSU had just turned itself on. Now it doesn't show any drives.

POOLZ
root@NAS:~ # zpool status
  pool: SSD
 state: ONLINE
  scan: none requested
config:

        NAME                                            STATE     READ WRITE CKSUM
        SSD                                             ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/3d926b98-8775-11eb-a1e9-1402ec8cfd54  ONLINE       0     0     0
            gptid/3d96022b-8775-11eb-a1e9-1402ec8cfd54  ONLINE       0     0     0
            gptid/3d99a600-8775-11eb-a1e9-1402ec8cfd54  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:01:09 with 0 errors on Tue Apr 6 03:46:09 2021
config:

        NAME          STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          da8p2       ONLINE       0     0     0

errors: No known data errors

  pool: x
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: resilvered 64K in 0 days 00:00:00 with 0 errors on Thu Apr 8 08:40:00 2021
config:

        NAME                                            STATE     READ WRITE CKSUM
        x                                               ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/19bd5a1d-f760-11e8-a516-30469a7fee6a  ONLINE       0     0     0
            gptid/1c5b1adb-f760-11e8-a516-30469a7fee6a  ONLINE       0     0     0
          mirror-1                                      ONLINE       0     0     0
            gptid/7bbc6859-1153-11e9-85bd-30469a7fee6a  ONLINE       0     0     0
            gptid/7ecc97f1-1153-11e9-85bd-30469a7fee6a  ONLINE       0     0     0

errors: No known data errors

LOGS BEFORE/DURING/AFTER THE CRASH
Apr 4 13:31:24 NAS WARNING: 10.9.8.6 (iqn.1991-05.com.microsoft:pc): no ping reply (NOP-Out) after 5 seconds; dropping connection
Apr 4 13:31:47 NAS kernel: ix1: link state changed to DOWN
Apr 4 13:31:47 NAS kernel: ix1: link state changed to DOWN
Apr 5 00:00:00 NAS syslog-ng[1010]: Configuration reload request received, reloading configuration;
Apr 5 00:00:00 NAS syslog-ng[1010]: Configuration reload finished;
Apr 5 08:17:51 NAS kernel: ix1: link state changed to UP
Apr 5 08:17:51 NAS kernel: ix1: link state changed to UP
Apr 5 18:46:39 NAS WARNING: 10.9.8.6 (iqn.1991-05.com.microsoft:pc): no ping reply (NOP-Out) after 5 seconds; dropping connection
Apr 5 18:47:16 NAS kernel: ix1: link state changed to DOWN
Apr 5 18:47:16 NAS kernel: ix1: link state changed to DOWN   <-- my desktop going to sleep
Apr 6 00:00:00 NAS syslog-ng[1010]: Configuration reload request received, reloading configuration;
Apr 6 00:00:00 NAS syslog-ng[1010]: Configuration reload finished;
Apr 6 03:45:00 NAS ZFS: vdev state changed, pool_guid=4886913228015753151 vdev_guid=14395134580434529567
Apr 6 12:07:18 NAS kernel: ix1: link state changed to UP
Apr 6 12:07:18 NAS kernel: ix1: link state changed to UP   <-- my desktop waking up
Apr 6 18:20:28 NAS WARNING: 10.9.8.6 (iqn.1991-05.com.microsoft:pc): no ping reply (NOP-Out) after 5 seconds; dropping connection
Apr 6 18:47:37 NAS WARNING: 10.9.8.6 (iqn.1991-05.com.microsoft:pc): no ping reply (NOP-Out) after 5 seconds; dropping connection
Apr 6 19:59:23 NAS WARNING: 10.9.8.6 (iqn.1991-05.com.microsoft:pc): no ping reply (NOP-Out) after 5 seconds; dropping connection
Apr 6 19:59:24 NAS kernel: ix1: link state changed to DOWN
Apr 6 19:59:24 NAS kernel: ix1: link state changed to DOWN   <-- my desktop going to sleep
Apr 6 22:23:24 NAS kernel: ix1: link state changed to UP   <-- my desktop waking up
Apr 6 22:23:24 NAS kernel: ix1: link state changed to UP
Apr 7 00:00:00 NAS syslog-ng[1010]: Configuration reload request received, reloading configuration;
Apr 7 00:00:00 NAS syslog-ng[1010]: Configuration reload finished;
Apr 7 04:23:43 NAS WARNING: 10.9.8.6 (iqn.1991-05.com.microsoft:pc): no ping reply (NOP-Out) after 5 seconds; dropping connection
Apr 7 04:25:03 NAS WARNING: 10.9.8.6 (iqn.1991-05.com.microsoft:pc): no ping reply (NOP-Out) after 5 seconds; dropping connection
Apr 7 16:25:17 NAS ZFS: vdev state changed, pool_guid=14400636327039552051 vdev_guid=14461087956095130297
Apr 7 16:25:17 NAS ZFS: vdev is removed, pool_guid=14400636327039552051 vdev_guid=14461087956095130297
Apr 7 16:25:17 NAS arcmsr_dr_handle: Target=0, lun=6, GONE!!!
Apr 7 16:25:17 NAS da6 at arcmsr0 bus 0 scbus0 target 0 lun 6
Apr 7 16:25:17 NAS da6: <SAMSUNG SSD PM851 m R001> s/n WD-WCAVY7367596 detached
Apr 7 16:25:17 NAS GEOM_MIRROR: Device swap0: provider da6p1 disconnected.
Apr 7 16:25:17 NAS (da6:arcmsr0:0:0:6): Periph destroyed
Apr 7 16:25:19 NAS GEOM_ELI: Device mirror/swap0.eli destroyed.
Apr 7 16:25:19 NAS GEOM_MIRROR: Device swap0: provider destroyed.
Apr 7 16:25:19 NAS GEOM_MIRROR: Device swap0 destroyed.
Apr 7 16:25:19 NAS GEOM_MIRROR: Cancelling unmapped because of da1p1.
Apr 7 16:25:19 NAS GEOM_MIRROR: Cancelling unmapped because of da7p1.
Apr 7 16:25:19 NAS GEOM_MIRROR: Device mirror/swap0 launched (2/2).
Apr 7 16:25:19 NAS GEOM_ELI: Device mirror/swap0.eli created.
Apr 7 16:25:19 NAS GEOM_ELI: Encryption: AES-XTS 128
Apr 7 16:25:19 NAS GEOM_ELI: Crypto: software
Apr 7 16:25:22 NAS arcmsr_dr_handle: Target=0, lun=6, Plug-IN!!!
Apr 7 16:25:22 NAS da6 at arcmsr0 bus 0 scbus0 target 0 lun 6
Apr 7 16:25:22 NAS da6: <WDC WD2002FYPS-02W3B R001> Fixed Direct Access SPC-3 SCSI device
Apr 7 16:25:22 NAS da6: Serial Number WD-WCAVY7367596
Apr 7 16:25:22 NAS da6: 600.000MB/s transfers
Apr 7 16:25:22 NAS da6: Command Queueing enabled
Apr 7 16:25:22 NAS da6: 1907729MB (3907029168 512 byte sectors)
Apr 7 16:25:22 NAS ZFS: vdev state changed, pool_guid=14400636327039552051 vdev_guid=14461087956095130297
Apr 7 16:25:22 NAS ZFS: vdev state changed, pool_guid=14400636327039552051 vdev_guid=8560731287678835544
Apr 7 16:28:47 NAS ZFS: vdev state changed, pool_guid=14400636327039552051 vdev_guid=14461087956095130297
Apr 7 16:28:47 NAS ZFS: vdev is removed, pool_guid=14400636327039552051 vdev_guid=14461087956095130297
Apr 7 16:28:47 NAS arcmsr_dr_handle: Target=0, lun=6, GONE!!!
Apr 7 16:28:47 NAS da6 at arcmsr0 bus 0 scbus0 target 0 lun 6
Apr 7 16:28:47 NAS da6: <SAMSUNG SSD PM851 m R001> s/n WD-WCAVY7367596 detached
Apr 7 16:28:47 NAS (da6:arcmsr0:0:0:6): Periph destroyed
Apr 7 16:28:52 NAS arcmsr_dr_handle: Target=0, lun=6, Plug-IN!!!
Apr 7 16:28:52 NAS da6 at arcmsr0 bus 0 scbus0 target 0 lun 6
Apr 7 16:28:52 NAS da6: <WDC WD2002FYPS-02W3B R001> Fixed Direct Access SPC-3 SCSI device
Apr 7 16:28:52 NAS da6: Serial Number WD-WCAVY7367596
Apr 7 16:28:52 NAS da6: 600.000MB/s transfers
Apr 7 16:28:52 NAS da6: Command Queueing enabled
Apr 7 16:28:52 NAS da6: 1907729MB (3907029168 512 byte sectors)
Apr 7 16:28:52 NAS ZFS: vdev state changed, pool_guid=14400636327039552051 vdev_guid=14461087956095130297
Apr 7 16:28:52 NAS ZFS: vdev state changed, pool_guid=14400636327039552051 vdev_guid=8560731287678835544
Apr 7 16:29:04 NAS arcmsr0: Target=0, Lun=6, selection timeout, raid volume was lost
Apr 7 16:29:04 NAS (da6:arcmsr0:0:0:6): Invalidating pack
Apr 7 16:29:04 NAS da6 at arcmsr0 bus 0 scbus0 target 0 lun 6
Apr 7 16:29:04 NAS da6: <WDC WD2002FYPS-02W3B R001> s/n WD-WCAVY7367596 detached
Apr 7 16:29:04 NAS ZFS: vdev state changed, pool_guid=14400636327039552051 vdev_guid=14461087956095130297
Apr 7 16:29:04 NAS ZFS: vdev is removed, pool_guid=14400636327039552051 vdev_guid=14461087956095130297
Apr 7 16:29:05 NAS (da6:arcmsr0:0:0:6): Periph destroyed
Apr 7 16:46:42 NAS arcmsr_dr_handle: Target=0, lun=6, GONE!!!
Apr 7 16:46:43 NAS arcmsr0: Target=0, Lun=4, selection timeout, raid volume was lost
Apr 7 16:46:43 NAS (da4:arcmsr0:0:0:4): Invalidating pack
Apr 7 16:46:43 NAS da4 at arcmsr0 bus 0 scbus0 target 0 lun 4
Apr 7 16:46:43 NAS da4: <WDC WD2002FYPS-02W3B R001> s/n WD-WCAVY7368555 detached
Apr 7 16:46:43 NAS GEOM_MIRROR: Request failed (error=6). da4p1[WRITE(offset=222892032, length=8192)]
Apr 7 16:46:43 NAS GEOM_MIRROR: Device swap1: provider da4p1 disconnected.
Apr 7 16:46:43 NAS arcmsr0: Target=0, Lun=5, selection timeout, raid volume was lost
Apr 7 16:46:43 NAS (da5:arcmsr0:0:0:5): Invalidating pack
Apr 7 16:46:43 NAS GEOM_MIRROR: Request failed (error=6). da5p1[WRITE(offset=222900224, length=4096)]
Apr 7 16:46:43 NAS da5 at arcmsr0 bus 0 scbus0 target 0 lun 5
Apr 7 16:46:43 NAS da5: <WDC WD2002FYPS-02W3B R001> s/n WD-WCAVY7397499 detached
Apr 7 16:46:43 NAS GEOM_ELI: g_eli_write_done() failed (error=6) mirror/swap1.eli[WRITE(offset=222900224, length=4096)]
Apr 7 16:46:43 NAS swap_pager: I/O error - pageout failed; blkno 578708,size 4096, error 6
Apr 7 16:46:43 NAS GEOM_MIRROR: Device swap1: provider da5p1 disconnected.
Apr 7 16:46:43 NAS GEOM_MIRROR: Device swap1: provider destroyed.
Apr 7 16:46:43 NAS GEOM_MIRROR: Device swap1 destroyed.
Apr 7 16:46:43 NAS GEOM_ELI: g_eli_write_done() failed (error=6) mirror/swap1.eli[WRITE(offset=222904320, length=20480)]
Apr 7 16:46:43 NAS swap_pager: I/O error - pageout failed; blkno 578709,size 20480, error 6
Apr 7 16:46:43 NAS swap_pager: I/O error - pageout failed; blkno 578714,size 8192, error 6
Apr 7 16:46:43 NAS swap_pager: I/O error - pageout failed; blkno 578716,size 8192, error 6
Apr 7 16:46:43 NAS swap_pager: I/O error - pageout failed; blkno 578718,size 8192, error 6
Apr 7 16:46:43 NAS swap_pager: I/O error - pageout failed; blkno 578720,size 4096, error 6
Apr 7 16:46:43 NAS swap_pager: I/O error - pageout failed; blkno 578721,size 20480, error 6
Apr 7 16:46:43 NAS swap_pager: I/O error - pageout failed; blkno 578726,size 8192, error 6
Apr 7 16:46:43 NAS swap_pager: I/O error - pageout failed; blkno 578728,size 16384, error 6
Apr 7 16:46:43 NAS swap_pager: I/O error - pageout failed; blkno 578732,size 40960, error 6
Apr 7 16:46:43 NAS swap_pager: I/O error - pageout failed; blkno 578742,size 36864, error 6
Apr 7 16:46:43 NAS swap_pager: I/O error - pageout failed; blkno 578751,size 8192, error 6
Apr 7 16:46:43 NAS swap_pager: I/O error - pageout failed; blkno 578753,size 20480, error 6
Apr 7 16:46:43 NAS swap_pager: I/O error - pageout failed; blkno 578758,size 24576, error 6
Apr 7 16:46:43 NAS swap_pager: I/O error - pageout failed; blkno 578764,size 73728, error 6
Apr 7 16:46:43 NAS swap_pager: I/O error - pageout failed; blkno 578782,size 12288, error 6
Apr 7 16:46:43 NAS swap_pager: I/O error - pageout failed; blkno 578785,size 4096, error 6
Apr 7 16:46:43 NAS swap_pager: I/O error - pageout failed; blkno 578786,size 16384, error 6
Apr 7 16:46:43 NAS GEOM_ELI: Device mirror/swap1.eli destroyed.
Apr 7 16:46:43 NAS GEOM_ELI: Detached mirror/swap1.eli on last close.
Apr 7 16:46:47 NAS arcmsr_dr_handle: Target=0, lun=5, GONE!!!
Apr 7 16:46:47 NAS arcmsr_dr_handle: Target=0, lun=6, Plug-IN!!!
Apr 7 16:46:47 NAS da6 at arcmsr0 bus 0 scbus0 target 0 lun 6
Apr 7 16:46:47 NAS da6: <WDC WD2002FYPS-02W3B R001> Fixed Direct Access SPC-3 SCSI device
Apr 7 16:46:47 NAS da6: Serial Number WD-WCAVY7367596
Apr 7 16:46:47 NAS da6: 600.000MB/s transfers
Apr 7 16:46:47 NAS da6: Command Queueing enabled
Apr 7 16:46:47 NAS da6: 1907729MB (3907029168 512 byte sectors)
Apr 7 16:46:52 NAS arcmsr_dr_handle: Target=0, lun=5, Plug-IN!!!
Apr 7 16:49:06 NAS swap_pager: I/O error - pagein failed; blkno 529501,size 4096, error 6
Apr 7 16:49:06 NAS vm_fault: pager read error, pid 7810 (python3.7)
Apr 7 16:49:06 NAS swap_pager: I/O error - pagein failed; blkno 529048,size 20480, error 6
Apr 7 16:49:06 NAS vm_fault: pager read error, pid 7810 (python3.7)
Apr 7 16:49:06 NAS kernel: Failed to fully fault in a core file segment at VA 0x800621000 with size 0x2e000 to be written at offset 0x5f000 for process python3.7
Apr 7 16:49:06 NAS kernel: Failed to fully fault in a core file segment at VA 0x800621000 with size 0x2e000 to be written at offset 0x5f000 for process python3.7
Apr 7 16:49:06 NAS swap_pager: I/O error - pagein failed; blkno 529084,size 4096, error 6
Apr 7 16:49:06 NAS vm_fault: pager read error, pid 7810 (python3.7)
Apr 7 16:49:06 NAS kernel: Failed to fully fault in a core file segment at VA 0x800665000 with size 0x81000 to be written at offset 0xa3000 for process python3.7
Apr 7 16:49:06 NAS kernel: Failed to fully fault in a core file segment at VA 0x800665000 with size 0x81000 to be written at offset 0xa3000 for process python3.7
Apr 7 16:49:06 NAS swap_pager: I/O error - pagein failed; blkno 529093,size 8192, error 6
Apr 7 16:49:06 NAS vm_fault: pager read error, pid 7810 (python3.7)
Apr 7 16:49:06 NAS kernel: Failed to fully fault in a core file segment at VA 0x8006e7000 with size 0x116000 to be written at offset 0x124000 for process python3.7
Apr 7 16:49:06 NAS kernel: Failed to fully fault in a core file segment at VA 0x8006e7000 with size 0x116000 to be written at offset 0x124000 for process python3.7
Apr 7 16:49:06 NAS swap_pager: I/O error - pagein failed; blkno 529501,size 4096, error 6
Apr 7 16:49:06 NAS vm_fault: pager read error, pid 7810 (python3.7)
Apr 7 16:49:06 NAS kernel: Failed to fully fault in a core file segment at VA 0x8007fd000 with size 0x14000 to be written at offset 0x23a000 for process python3.7
Apr 7 16:49:06 NAS kernel: Failed to fully fault in a core file segment at VA 0x8007fd000 with size 0x14000 to be written at offset 0x23a000 for process python3.7
Apr 7 16:49:06 NAS swap_pager: I/O error - pagein failed; blkno 529511,size 4096, error 6
Apr 7 16:49:06 NAS vm_fault: pager read error, pid 7810 (python3.7)
Apr 7 16:49:06 NAS kernel: Failed to fully fault in a core file segment at VA 0x8009e3000 with size 0xd000 to be written at offset 0x420000 for process python3.7
Apr 7 16:49:06 NAS kernel: Failed to fully fault in a core file segment at VA 0x8009e3000 with size 0xd000 to be written at offset 0x420000 for process python3.7
Apr 7 16:49:06 NAS swap_pager: I/O error - pagein failed; blkno 530092,size 8192, error 6
Apr 7 16:49:06 NAS vm_fault: pager read error, pid 7810 (python3.7)
Apr 7 16:49:06 NAS kernel: Failed to fully fault in a core file segment at VA 0x8009f6000 with size 0x9000 to be written at offset 0x433000 for process python3.7
Apr 7 16:49:06 NAS kernel: Failed to fully fault in a core file segment at VA 0x8009f6000 with size 0x9000 to be written at offset 0x433000 for process python3.7
Apr 7 16:49:06 NAS swap_pager: I/O error - pagein failed; blkno 530003,size 65536, error 6
Apr 7 16:49:06 NAS vm_fault: pager read error, pid 7810 (python3.7)
Apr 7 16:49:06 NAS kernel: Failed to fully fault in a core file segment at VA 0x80111f000 with size 0x11000 to be written at offset 0x4c8000 for process python3.7
Apr 7 16:49:06 NAS kernel: Failed to fully fault in a core file segment at VA 0x80111f000 with size 0x11000 to be written at offset 0x4c8000 for process python3.7
 

Attachments

  • da6.txt (1.6 KB)

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
07:00.0 RAID bus controller: Areca Technology Corp. ARC-188x series PCIe 2.0/3.0 to SAS/SATA 6/12Gb RAID Controller (rev 05)

I'm going to point the blame at the "shifty devices" here, with the Samsung/WD drives swapping identities. I'd pick up a Dell PERC H200/H310 or IBM M1015 - they should be easily available for US$20-$25 - and crossflash it to an LSI 9211-8i.

As for the reboot: I also see swap devices being created/destroyed rapidly, followed by lines about failing to page bits of a core file back in from swap. If there are important pieces of memory out in swap, and the RAID card trips over itself trying to bring them back into main memory, there's a kernel panic waiting to happen right there.
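A quick way to see which disks currently back swap and whether the swap mirrors are healthy - just a sketch using stock FreeBSD tools, the device names will differ on your box:

swapinfo -h        # which swap devices are in use and how full they are
gmirror status     # state of the swap0/swap1 mirrors and their daXp1 members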
 

arankaspar1

Dabbler
Joined
Apr 7, 2020
Messages
25
I was with you on this, actually - I don't like this card.
When a different SSD failed, the buzzer on the card wouldn't STFU until I power-cycled the whole system.

So literally as I'm typing this reply, the system rebooted on its own, and when it came up I didn't get any explanation like before, when I got this...
NAS.local had an unscheduled system reboot. The operating system successfully came back online at Sun Mar 28 10:58:03 2021.

I've been on FreeNAS-11.3-U5 since updating yesterday.
So it came back up at 10:39, no explanation.
The last thing in the log before the reboot was at 4 am - maybe just coincidental...
Apr 9 04:03:42 NAS kernel: pid 3456 (qbittorrent-nox), jid 3, uid 0: exited on signal 10
 

arankaspar1

Dabbler
Joined
Apr 7, 2020
Messages
25
Just wanted to update: I'm now getting "Plex exited with status 10" (found to be RAM-related)...

Now that I know it's not the modules (already swapped out), I remember fking up one of the memory DIMM slots on the board.
My dumba$$ tried to squeeze one of the DIMMs in past the massive Zalman heatsink and the slot receptacle bent over about 15 degrees. I'm sure some pins came undone. I'll be getting a new board ASAP - this one is a decade old anyway.

I was looking at the X10SLL-F for the price, bc I'll have to buy a CPU too. I have plenty of RAM, both ECC and non-ECC, so if there's a better board, let me know.
I just bought that H200 PERC card for $30, so I'll probably just keep it.
Do you have any instruction links for flashing it?
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
The X10SLL-F is an E3 Xeon board so the only "issue" is the 32GB RAM limit, but that's often more than enough for home builds.

For the PERC H200 crossflashing, check the excellent resource here:

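For reference, the usual shape of that procedure looks roughly like the outline below - a sketch only, assuming the standard sas2flsh/sas2flash utility and the firmware file names commonly shipped in the Dell/LSI packages (6GBPSAS.fw, 2118it.bin, mptsas2.rom); follow the linked guide for the exact, safe sequence.

sas2flsh -listall                          # record the card's SAS address before touching anything
sas2flsh -o -e 6                           # erase the existing Dell flash region
sas2flsh -o -f 6GBPSAS.fw                  # intermediate Dell 6Gbps SAS IT firmware
sas2flsh -o -f 2118it.bin -b mptsas2.rom   # LSI 9211-8i IT firmware plus optional boot ROM
sas2flsh -o -sasadd 500605bxxxxxxxxx       # restore the SAS address recorded earlier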
 

arankaspar1

Dabbler
Joined
Apr 7, 2020
Messages
25
I realized the RAM I have is 1600MHz, so I'm taking a look at the X11 boards, but they're up there in the $200s + I'd need a CPU, lol.
This is registered SDRAM. Is the registered aspect more or less the same, from the motherboard's perspective?

Oh yeah, Honeybadger don't give a shit!
 

arankaspar1

Dabbler
Joined
Apr 7, 2020
Messages
25
I realized the RAM I have is 1600MHz, so I'm taking a look at the X11 boards, but they're up there in the $200s + I'd need a CPU, lol.
This is registered SDRAM. Is the registered aspect more or less the same, from the motherboard's perspective?

Oh yeah, Honeybadger don't give a ****!

Crap, actually those boards are all DDR4. I feel like I'm stuck in the middle bc the old X10 boards only support 8GB modules max.
Or at least that's what it says.
I have two 16GB DDR3 1600 Registered modules.

What do!?
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
I realized the RAM I have is 1600MHz, so I'm taking a look at the X11 boards, but they're up there in the $200s + I'd need a CPU, lol.
This is registered SDRAM. Is the registered aspect more or less the same, from the motherboard's perspective?

Crap, actually those boards are all DDR4. I feel like I'm stuck in the middle bc the old X10 boards only support 8GB modules max.
Or at least that's what it says.
I have two 16GB DDR3 1600 Registered modules.

What do!?

HoneyBadger don't give a **** but your motherboard does unfortunately. :grin:

Xeon E3s can't use "registered" DIMMs; you'll have to get an E5 and a matching LGA2011 board for that. The upside is that they tend to come with six/eight/more DIMM slots; the downside is often a higher price tag.
 

arankaspar1

Dabbler
Joined
Apr 7, 2020
Messages
25
Yeah, socket 2011 seems to be split between DDR3 and DDR4. I found a board, but as you said, good sir, it carries a higher price tag.
I have an old TYAN with two Opteron 6320s(?) and 64GB of RAM. Can't use those procs for ESXi, so it would be perfect, but g0d d@mn it's loud.
I have half a mind to caulk all the seams and fill it with mineral oil.
 

ChrisRJ

Wizard
Joined
Oct 23, 2020
Messages
1,919
Back in October 2020 I got some Supermicro X9SRi-F boards for about 120 euros, and I paid 20 euros for 16 GB of RAM. Those prices are not always available, but if you can wait ...

Just out of curiosity: What boards have you found and for what price?
 

arankaspar1

Dabbler
Joined
Apr 7, 2020
Messages
25
We have a nice article about why not to use RAID cards, or really any non-LSI HBA in general...

https://www.truenas.com/community/t...s-and-why-cant-i-use-a-raid-controller.81931/

Back in the day, Cyberjock was very much a fan of his Areca cards. Please do swap it out for a nice LSI HBA; it'll save you some grief.
Hey Greco, I got a Dell H200 - just opened the box from eBay. Do you think that'll do fine if I flash it as linked above?
It has a connector for a battery but no battery. I'm reading your write-up, and you say to avoid cards with cache; I'm assuming this one is OK.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
The H200 is fine once crossflashed; as a matter of fact, I think doing one is on my list of projects today for an R510 currently in the shop. It does not have a connector for a battery. These kinds of cards typically have several connectors for various types of integration with chassis lights and system management. Please do not try connecting a battery. :smile:
 

arankaspar1

Dabbler
Joined
Apr 7, 2020
Messages
25
I flashed the H200 card to Dell6GbSASIT and then to LSI P7 (LSI7p.bin or something)... it seemed to be working well this morning until I checked pool status.
It had a few hundred errors on one of the SSDs, so I shut down and swapped cables; the errors followed the SSD, but now it's only 8 errors and 0 on the others. (A clear/scrub sketch follows the log below.)

After that reboot/cable swap... I'm now getting errors on one of my four 'Enterprise class' WDs (yellow label). :frown:
These four drives are about 3 years old with moderate use.

Apr 17 12:57:33 NAS (da5:mps0:0:6:0): Error 5, Retries exhausted
Apr 17 12:57:33 NAS (da5:mps0:0:6:0): READ(10). CDB: 28 00 e8 e0 88 87 00 00 01 00
Apr 17 12:57:33 NAS (da5:mps0:0:6:0): CAM status: SCSI Status Error
Apr 17 12:57:33 NAS (da5:mps0:0:6:0): SCSI status: Check Condition
Apr 17 12:57:33 NAS (da5:mps0:0:6:0): SCSI sense: NOT READY asc:4,0 (Logical unit not ready, cause not reportable)
Apr 17 12:57:33 NAS (da5:mps0:0:6:0): Retrying command (per sense data)

Reseated the hot-swap cage in the backplane and Online'd the disk in Pool > Status.
Same thing... like 100 entries within 1 second.

Apr 17 13:10:26 NAS (da5:mps0:0:6:0): READ(10). CDB: 28 00 e8 e0 88 87 00 00 01 00
Apr 17 13:10:26 NAS (da5:mps0:0:6:0): CAM status: SCSI Status Error
Apr 17 13:10:26 NAS (da5:mps0:0:6:0): SCSI status: Check Condition
Apr 17 13:10:26 NAS (da5:mps0:0:6:0): SCSI sense: NOT READY asc:4,0 (Logical unit not ready, cause not reportable)
Apr 17 13:10:26 NAS (da5:mps0:0:6:0): Error 5, Retries exhausted
Apr 17 13:10:26 NAS GEOM_PART: da5 was automatically resized.
Apr 17 13:10:26 NAS Use `gpart commit da5` to save changes or `gpart undo da5` to revert them.   <-- What does that mean? :oops:
Apr 17 13:10:26 NAS GEOM_PART: integrity check failed (da5, GPT)
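For what it's worth, once the cabling/controller cause is sorted out, the usual way to retire stale error counters and re-verify the data is roughly this (a sketch; the pool names 'SSD' and 'x' come from the earlier status output - substitute whichever pool carries the errors):

zpool status -v x    # see which devices carry errors and whether any files are flagged
zpool clear x        # reset the counters once the underlying cause is fixed
zpool scrub x        # re-read everything so any real damage shows up again quickly
zpool status x       # watch scrub progress and results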
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
After the upgrade to LSI P7, you have to go to LSI P20, and make sure you're on the 20.00.07.00 firmware.

The card's firmware runs on a PowerPC CPU on the card, and the card communicates with the driver on the FreeBSD host. They have to completely understand each other; unlike a US "English" speaker talking to an actual British English speaker, where you can kinda get along despite the unusual differences, the LSI card and the host really need to be speaking the exact same language.
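A quick sanity check from the FreeNAS shell that the card is really presenting P20 (20.00.07.00) to the mps(4) driver - a sketch; the sysctl node names are the usual mps(4) ones (an assumption), with dmesg as a fallback:

dmesg | grep -i 'mps0'                                        # the attach lines include the reported firmware version
sysctl dev.mps.0.firmware_version dev.mps.0.driver_version    # assumed mps(4) sysctl names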
 

arankaspar1

Dabbler
Joined
Apr 7, 2020
Messages
25
After the upgrade to LSI P7, you have to go to LSI P20, and make sure you're on the 20.00.07.00 firmware.

The card's firmware runs on a PowerPC CPU on the card, and the card communicates with the driver on the FreeBSD host. They have to completely understand each other; unlike a US "English" speaker talking to an actual British English speaker, where you can kinda get along despite the unusual differences, the LSI card and the host really need to be speaking the exact same language.
Gotcha, I'll do that. Should I worry about committing changes? (see the gpart line I flagged in the log above)
 

arankaspar1

Dabbler
Joined
Apr 7, 2020
Messages
25
Question, I'm thinking of doing something very stupid... should I do it?

I just installed ESXi 7 on a beefy server with an "unsupported CPU". The board has two big Opterons annnd an LSI SAS2008 chip.
I flashed the mobo LSI into IT mode, passed the "PCI device" through to the VM, and installed FreeNAS.
I can see the adapter in lspci, but to my surprise, no drives o_O
As you have written about extensively, these 'RAID controllers' aren't going to present drives natively.
I still have the H200, but I just wanted to check whether I should bother using this onboard LSI at all.

I know the whole virtualizing-FreeNAS thing caused some face-palms about 5 years ago, back when I was using RDM on ESXi 5 I think?
I stopped bc people advised against it, but I'm curious to get your impression of FreeNAS on ESXi 7 these days with PCI passthrough.
 

arankaspar1

Dabbler
Joined
Apr 7, 2020
Messages
25
Question, I'm thinking of doing something very stupid... should I do it?

I just installed ESXi 7 on a beefy server with an "unsupported CPU". The board has two big Opterons annnd an LSI SAS2008 chip.
I flashed the mobo LSI into IT mode, passed the "PCI device" through to the VM, and installed FreeNAS.
I can see the adapter in lspci, but to my surprise, no drives o_O
As you have written about extensively, these 'RAID controllers' aren't going to present drives natively.
I still have the H200, but I just wanted to check whether I should bother using this onboard LSI at all.

I know the whole virtualizing-FreeNAS thing caused some face-palms about 5 years ago, back when I was using RDM on ESXi 5 I think?
I stopped bc people advised against it, but I'm curious to get your impression of FreeNAS on ESXi 7 these days with PCI passthrough.
Whoops, never mind - the onboard actually works.
I had plugged in the wrong pair of Mini-SAS cables on the mobo.
The backplane has 4 cables that go through a small opening in the fan wall.
Of course they were all zip-tied up, and I just found I had traced the wrong pair.

camcontrol devlist shows the 3 SSDs I popped in.
I'm going to play around with this burner SSD pool before I throw my actual ZFS pool in there.
I guess flashing the mobo SAS2008 to IT may or may not have helped? Probably didn't hurt in the end.

Also, I checked the card and it was already on version 20. It was good to verify it, though.

Given that the mobo works, would you use the card anyway?
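For anyone checking a passed-through SAS2008 the same way, a couple of stock FreeBSD commands confirm the HBA actually landed in the guest and which disks hang off it - a sketch, device numbering will vary:

pciconf -lv | grep -B 3 -i sas2008    # the passed-through HBA should show up in the guest's PCI list
camcontrol devlist                    # every disk, with the controller/bus/target it attaches through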
 