Degraded zpool, SCSI sense errors


zeroluck

Dabbler
Joined
Feb 12, 2015
Messages
43
I built a new FreeNAS array this week, and all seemed well until I started a write-heavy operation: a VMware Converter P2V migration from physical hardware into vSphere, with the iSCSI datastore on this new FreeNAS array ("storage3").

System: Dell PowerEdge r310
Intel(R) Xeon(R) CPU X3450 @ 2.67GHz
24GB ECC Registered PC3-8500 DDR3 (Soon to be 56GB)
IBM 46M0907 PCI Express 2.0 x8 SAS Host Bus Adapter for System X - Flashed to IT mode p16
Habey DS-1280 SAS Enclosure 12 bay
8x WD RED 3TB (2 in PowerEdge bays, 6 in Habey Enclosure)
6x Barracuda XT 3T (All in Habey Enclosure)
2x Barracuda 7200.11 3TB (in PowerEdge Bays)
2 Port Intel Nic for MPIO iSCSI on separate subnets
2 Port onboard Nic, server stack LAGG0

FreeNAS-9.3-STABLE-201509282017

Now, down to the problem
This morning (I started the P2V operation last night) I received a series of email alarms from storage3:

Code:
8:09PM The volume Storage3-A (ZFS) state is ONLINE: One or more devices is currently being resilvered. The pool will continue to function, possibly in a degraded state.
8:10PM The volume Storage3-A (ZFS) state is ONLINE: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.
9:29PM The volume Storage3-A (ZFS) state is ONLINE: One or more devices is currently being resilvered. The pool will continue to function, possibly in a degraded state.
9:30PM The volume Storage3-A (ZFS) state is DEGRADED: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.
1:08AM The volume Storage3-A (ZFS) state is DEGRADED: One or more devices is currently being resilvered. The pool will continue to function, possibly in a degraded state.
1:08AM The volume Storage3-A (ZFS) state is DEGRADED: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.


Upon further investigation I found more information.

The first thing I noticed was SCSI sense errors in the logs, which repeat quite often. I have read about 10 other threads on this, but my situation seems to differ slightly, so I am documenting it separately. The errors appear only on the (brand new) WD 3TB disks in the Habey enclosure; two other WD 3TB disks in the Dell r310 chassis are working fine.

Code:
(da7:mps0:0:13:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)


This is the output of zpool status. Somehow the array is still functioning though!

Code:
/var/log# zpool status Storage3-A
  pool: Storage3-A
state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
    attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
    using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: resilvered 4.46M in 0h0m with 0 errors on Wed Oct 14 13:09:57 2015
config:

    NAME                                            STATE     READ WRITE CKSUM
    Storage3-A                                      DEGRADED     0     0     0
     raidz2-0                                      DEGRADED     0    25     0
       gptid/8467b285-71f5-11e5-8731-001517fd0d2a  ONLINE       0     0     0
       gptid/85d6b049-71f5-11e5-8731-001517fd0d2a  ONLINE       0     0     0
       gptid/86a1e71f-71f5-11e5-8731-001517fd0d2a  ONLINE       0     0     0
       gptid/87b75965-71f5-11e5-8731-001517fd0d2a  ONLINE       0     0     0
       gptid/887a8ed8-71f5-11e5-8731-001517fd0d2a  ONLINE       0     0     0
       gptid/894a3304-71f5-11e5-8731-001517fd0d2a  ONLINE       0     0     0
       gptid/8a5b2e59-71f5-11e5-8731-001517fd0d2a  DEGRADED     0   305    15  too many errors
       gptid/8b689e60-71f5-11e5-8731-001517fd0d2a  DEGRADED     0   518     1  too many errors
       gptid/8c7d5e87-71f5-11e5-8731-001517fd0d2a  DEGRADED     0   414    31  too many errors
       gptid/8d8c9b8d-71f5-11e5-8731-001517fd0d2a  DEGRADED     0   396    15  too many errors
       gptid/8e9dfc25-71f5-11e5-8731-001517fd0d2a  DEGRADED     0   378    14  too many errors
       gptid/8fad7082-71f5-11e5-8731-001517fd0d2a  DEGRADED     0   321    15  too many errors
       gptid/90906ac9-71f5-11e5-8731-001517fd0d2a  ONLINE       0     0     0
       gptid/91604b28-71f5-11e5-8731-001517fd0d2a  ONLINE       0     0     0
       gptid/92384e9b-71f5-11e5-8731-001517fd0d2a  ONLINE       0     0     0
       gptid/93069446-71f5-11e5-8731-001517fd0d2a  ONLINE       0     0     0

errors: No known data errors
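For anyone following along, the per-device counters in the output above can be pulled out with a few lines of Python. This is only a minimal sketch: the regular expression and the function name `faulted_members` are illustrative assumptions, and it expects plain integer counters as printed here (not abbreviated forms such as "1.2K").

```python
import re

# A few rows copied from the `zpool status` output above.
SAMPLE = """\
       gptid/894a3304-71f5-11e5-8731-001517fd0d2a  ONLINE       0     0     0
       gptid/8a5b2e59-71f5-11e5-8731-001517fd0d2a  DEGRADED     0   305    15  too many errors
       gptid/8b689e60-71f5-11e5-8731-001517fd0d2a  DEGRADED     0   518     1  too many errors
"""

# name, state, READ, WRITE, CKSUM (plain integers only)
ROW = re.compile(r"^\s*(gptid/\S+)\s+(\w+)\s+(\d+)\s+(\d+)\s+(\d+)")


def faulted_members(status_text):
    """Return (name, state, read, write, cksum) for members that are not clean."""
    rows = []
    for line in status_text.splitlines():
        m = ROW.match(line)
        if not m:
            continue
        name, state = m.group(1), m.group(2)
        read, write, cksum = (int(x) for x in m.group(3, 4, 5))
        if state != "ONLINE" or read or write or cksum:
            rows.append((name, state, read, write, cksum))
    return rows


for row in faulted_members(SAMPLE):
    print(row)
```

In this output every affected member shows write and checksum errors but no read errors, which fits the problem appearing under a write-heavy load.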


I checked these disk IDs, and all 6 disks shown as degraded in the GUI are the (brand new) WD RED 3TB disks in the Habey enclosure. The two other WD RED 3TB disks, which are in the Dell r310 chassis and connected to the Intel 3400 controller, are not showing any errors. After seeing this I began checking the SMART status of all the drives. Two of the drives have SMART errors, but they are not the ones showing degraded in zpool status:

Code:
smartctl -H /dev/da9
smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p26 amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Please note the following marginal Attributes:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
190 Airflow_Temperature_Cel 0x0022   061   045   045    Old_age   Always   In_the_past 39 (Min/Max 19/40)

Code:
smartctl -H /dev/da11
smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p26 amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Please note the following marginal Attributes:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
190 Airflow_Temperature_Cel 0x0022   062   045   045    Old_age   Always   In_the_past 38 (Min/Max 19/39)
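To make the "marginal attribute" rows easier to read, here is a small sketch that splits one row into named columns. The column names come from the smartctl header above; the function name `parse_attribute` is just an illustrative choice, and the sample line is the da9 row as posted.

```python
# Column names as printed in the smartctl attribute header above.
FIELDS = ["id", "name", "flag", "value", "worst", "thresh",
          "type", "updated", "when_failed", "raw"]


def parse_attribute(line):
    """Split one smartctl attribute row into a dict keyed by column name."""
    # Limit the split so the free-form raw value stays in one piece.
    parts = line.split(None, len(FIELDS) - 1)
    row = dict(zip(FIELDS, parts))
    for key in ("id", "value", "worst", "thresh"):
        row[key] = int(row[key])
    return row


line = ("190 Airflow_Temperature_Cel 0x0022   061   045   045    "
        "Old_age   Always   In_the_past 39 (Min/Max 19/40)")
attr = parse_attribute(line)
print(attr["name"], attr["when_failed"],
      "worst", attr["worst"], "thresh", attr["thresh"])
```

Read this way, the row says the normalized WORST value (45) touched THRESH (45) at some point, which is why smartctl reports In_the_past. It is an Old_age temperature attribute, and the current value (61) is back above the threshold.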


I am running IT Mode p16 firmware on the 9211-8i based IBM 46M0907, although I have gotten an alert from FreeNAS on this and other arrays:
Code:
Firmware version 16 does not match driver version 20 for /dev/mps0


Code:
/var/log# sas2flash -listall
LSI Corporation SAS2 Flash Utility
Version 16.00.00.00 (2013.03.01)
Copyright (c) 2008-2013 LSI Corporation. All rights reserved

    Adapter Selected is a LSI SAS: SAS2008(B2)

Num   Ctlr            FW Ver        NVDATA        x86-BIOS         PCI Addr
----------------------------------------------------------------------------

0  SAS2008(B2)     16.00.00.00    10.00.00.06    07.31.00.00     00:05:00:00

    Finished Processing Commands Successfully.
    Exiting SAS2Flash.
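As a quick cross-check, the alert text and the sas2flash row can be compared programmatically. This is a hedged sketch only: the helper names are made up for illustration, and the parsing assumes the exact line formats shown in this post.

```python
import re

ALERT = "Firmware version 16 does not match driver version 20 for /dev/mps0"
ROW = "0  SAS2008(B2)     16.00.00.00    10.00.00.06    07.31.00.00     00:05:00:00"


def alert_versions(alert):
    """Return (firmware_major, driver_major) from the FreeNAS alert text."""
    m = re.search(r"Firmware version (\d+) does not match driver version (\d+)", alert)
    if m is None:
        raise ValueError("unrecognised alert text")
    return int(m.group(1)), int(m.group(2))


def flashed_major(row):
    """Return the major firmware version from a `sas2flash -listall` row."""
    fw_ver = row.split()[2]  # columns: Num, Ctlr, FW Ver, NVDATA, ...
    return int(fw_ver.split(".")[0])


fw, drv = alert_versions(ALERT)
print("alert:", fw, "vs", drv, "| flashed:", flashed_major(ROW))
```

Both sources agree that the card reports firmware 16 while the driver is version 20.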


Other diagnostic information:

Code:
ifconfig
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
    description: connected to ISCSI-SW2 (1/0/16)
    options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
    ether 00:15:17:fd:0d:2a
    inet 10.0.3.29 netmask 0xffffff00 broadcast 10.0.3.255
    nd6 options=9<PERFORMNUD,IFDISABLED>
    media: Ethernet autoselect (1000baseT <full-duplex>)
    status: active
em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
    description: connected to ISCSI-SW1 (1/0/16)
    options=4019b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
    ether 00:15:17:fd:0d:2b
    inet 10.0.4.29 netmask 0xffffff00 broadcast 10.0.4.255
    nd6 options=9<PERFORMNUD,IFDISABLED>
    media: Ethernet autoselect (1000baseT <full-duplex>)
    status: active
bce0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    description: connected to ServerStack (1/0/8)
    options=c01bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE>
    ether 78:2b:cb:0a:4f:9a
    nd6 options=9<PERFORMNUD,IFDISABLED>
    media: Ethernet autoselect (1000baseT <full-duplex>)
    status: active
bce1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    description: connected to ServerStack (2/0/40)
    options=c01bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE>
    ether 78:2b:cb:0a:4f:9a
    nd6 options=9<PERFORMNUD,IFDISABLED>
    media: Ethernet autoselect (1000baseT <full-duplex>)
    status: active
ipfw0: flags=8801<UP,SIMPLEX,MULTICAST> metric 0 mtu 65536
    nd6 options=9<PERFORMNUD,IFDISABLED>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
    options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
    inet6 ::1 prefixlen 128
    inet6 fe80::1%lo0 prefixlen 64 scopeid 0x8
    inet 127.0.0.1 netmask 0xff000000
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=c01bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE>
    ether 78:2b:cb:0a:4f:9a
    inet 10.0.50.45 netmask 0xfffffe00 broadcast 10.0.51.255
    nd6 options=9<PERFORMNUD,IFDISABLED>
    media: Ethernet autoselect
    status: active
    laggproto lacp lagghash l2,l3,l4
    laggport: bce1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
    laggport: bce0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>


Code:
pciconf -lv
hostb0@pci0:0:0:0:    class=0x060000 card=0x02a31028 chip=0xd1308086 rev=0x11 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Core Processor DMI'
    class      = bridge
    subclass   = HOST-PCI
pcib1@pci0:0:3:0:    class=0x060400 card=0x02a31028 chip=0xd1388086 rev=0x11 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = 'Core Processor PCI Express Root Port 1'
    class      = bridge
    subclass   = PCI-PCI
pcib2@pci0:0:5:0:    class=0x060400 card=0x02a31028 chip=0xd13a8086 rev=0x11 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = 'Core Processor PCI Express Root Port 3'
    class      = bridge
    subclass   = PCI-PCI
none0@pci0:0:8:0:    class=0x088000 card=0x00000000 chip=0xd1558086 rev=0x11 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Core Processor System Management Registers'
    class      = base peripheral
none1@pci0:0:8:1:    class=0x088000 card=0x00000000 chip=0xd1568086 rev=0x11 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Core Processor Semaphore and Scratchpad Registers'
    class      = base peripheral
none2@pci0:0:8:2:    class=0x088000 card=0x00000000 chip=0xd1578086 rev=0x11 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Core Processor System Control and Status Registers'
    class      = base peripheral
none3@pci0:0:8:3:    class=0x088000 card=0x00000000 chip=0xd1588086 rev=0x11 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Core Processor Miscellaneous Registers'
    class      = base peripheral
none4@pci0:0:16:0:    class=0x088000 card=0x00000000 chip=0xd1508086 rev=0x11 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Core Processor QPI Link'
    class      = base peripheral
none5@pci0:0:16:1:    class=0x088000 card=0x00000000 chip=0xd1518086 rev=0x11 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Core Processor QPI Routing and Protocol Registers'
    class      = base peripheral
ehci0@pci0:0:26:0:    class=0x0c0320 card=0x02a31028 chip=0x3b3c8086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '5 Series/3400 Series Chipset USB2 Enhanced Host Controller'
    class      = serial bus
    subclass   = USB
pcib3@pci0:0:28:0:    class=0x060400 card=0x02a31028 chip=0x3b428086 rev=0x05 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = '5 Series/3400 Series Chipset PCI Express Root Port 1'
    class      = bridge
    subclass   = PCI-PCI
pcib4@pci0:0:28:4:    class=0x060400 card=0x02a31028 chip=0x3b4a8086 rev=0x05 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = '5 Series/3400 Series Chipset PCI Express Root Port 5'
    class      = bridge
    subclass   = PCI-PCI
ehci1@pci0:0:29:0:    class=0x0c0320 card=0x02a31028 chip=0x3b348086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '5 Series/3400 Series Chipset USB2 Enhanced Host Controller'
    class      = serial bus
    subclass   = USB
pcib5@pci0:0:30:0:    class=0x060401 card=0x02a31028 chip=0x244e8086 rev=0xa5 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = '82801 PCI Bridge'
    class      = bridge
    subclass   = PCI-PCI
isab0@pci0:0:31:0:    class=0x060100 card=0x02a31028 chip=0x3b148086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '3400 Series Chipset LPC Interface Controller'
    class      = bridge
    subclass   = PCI-ISA
atapci0@pci0:0:31:2:    class=0x01018f card=0x02a31028 chip=0x3b208086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '5 Series/3400 Series Chipset 4 port SATA IDE Controller'
    class      = mass storage
    subclass   = ATA
atapci1@pci0:0:31:5:    class=0x010185 card=0x02a31028 chip=0x3b268086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '5 Series/3400 Series Chipset 2 port SATA IDE Controller'
    class      = mass storage
    subclass   = ATA
em0@pci0:4:0:0:    class=0x020000 card=0x125e8086 chip=0x105e8086 rev=0x06 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82571EB Gigabit Ethernet Controller'
    class      = network
    subclass   = ethernet
em1@pci0:4:0:1:    class=0x020000 card=0x125e8086 chip=0x105e8086 rev=0x06 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82571EB Gigabit Ethernet Controller'
    class      = network
    subclass   = ethernet
mps0@pci0:5:0:0:    class=0x010700 card=0x30201000 chip=0x00721000 rev=0x03 hdr=0x00
    vendor     = 'LSI Logic / Symbios Logic'
    device     = 'SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon]'
    class      = mass storage
    subclass   = SAS
bce0@pci0:2:0:0:    class=0x020000 card=0x02a31028 chip=0x163b14e4 rev=0x20 hdr=0x00
    vendor     = 'Broadcom Corporation'
    device     = 'NetXtreme II BCM5716 Gigabit Ethernet'
    class      = network
    subclass   = ethernet
bce1@pci0:2:0:1:    class=0x020000 card=0x02a31028 chip=0x163b14e4 rev=0x20 hdr=0x00
    vendor     = 'Broadcom Corporation'
    device     = 'NetXtreme II BCM5716 Gigabit Ethernet'
    class      = network
    subclass   = ethernet
vgapci0@pci0:1:3:0:    class=0x030000 card=0x02a31028 chip=0x0532102b rev=0x0a hdr=0x00
    vendor     = 'Matrox Graphics, Inc.'
    device     = 'MGA G200eW WPCM450'
    class      = display
    subclass   = VGA


storage3 dmesg: http://pastebin.com/TBcnh9uz

Code:
camcontrol devlist
<ATA ST33000651AS CC44>            at scbus0 target 0 lun 0 (pass0,da0)
<ATA ST33000651AS CC44>            at scbus0 target 1 lun 0 (pass1,da1)
<ATA WDC WD30EFRX-68E 0A82>        at scbus0 target 8 lun 0 (pass2,da2)
<ATA WDC WD30EFRX-68E 0A82>        at scbus0 target 9 lun 0 (pass3,da3)
<ATA WDC WD30EFRX-68E 0A82>        at scbus0 target 10 lun 0 (pass4,da4)
<ATA WDC WD30EFRX-68E 0A82>        at scbus0 target 11 lun 0 (pass5,da5)
<ATA WDC WD30EFRX-68E 0A82>        at scbus0 target 12 lun 0 (pass6,da6)
<ATA WDC WD30EFRX-68E 0A82>        at scbus0 target 13 lun 0 (pass7,da7)
<ATA ST33000651AS CC44>            at scbus0 target 14 lun 0 (pass8,da8)
<ATA ST33000651AS CC44>            at scbus0 target 15 lun 0 (pass9,da9)
<ATA ST33000651AS CC44>            at scbus0 target 16 lun 0 (pass10,da10)
<ATA ST33000651AS CC44>            at scbus0 target 17 lun 0 (pass11,da11)
<ST3000DM001-1CH166 CC29>          at scbus1 target 0 lun 0 (pass12,ada0)
<WDC WD30EFRX-68EUZN0 82.00A82>    at scbus1 target 1 lun 0 (pass13,ada1)
<ST3000DM001-1ER166 CC25>          at scbus2 target 0 lun 0 (pass14,ada2)
<WDC WD30EFRX-68EUZN0 82.00A82>    at scbus2 target 1 lun 0 (pass15,ada3)
<TEAC DVD-ROM DV-28SW R.2A>        at scbus3 target 0 lun 0 (pass16,cd0)
<Kingston DataTraveler 2.0 PMAP>   at scbus6 target 0 lun 0 (pass17,da12)
<Kingston DataTraveler 2.0 PMAP>   at scbus7 target 0 lun 0 (pass18,da13)
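Grouping the HBA-attached devices by model makes the pattern easier to see. The sketch below parses the scbus0 lines from the listing above; the regular expression and the name `group_by_model` are my own assumptions, written for the "(passN,daN)" ordering shown here.

```python
import re
from collections import defaultdict

# The scbus0 (mps0) lines from the `camcontrol devlist` output above.
DEVLIST = """\
<ATA ST33000651AS CC44>            at scbus0 target 0 lun 0 (pass0,da0)
<ATA ST33000651AS CC44>            at scbus0 target 1 lun 0 (pass1,da1)
<ATA WDC WD30EFRX-68E 0A82>        at scbus0 target 8 lun 0 (pass2,da2)
<ATA WDC WD30EFRX-68E 0A82>        at scbus0 target 9 lun 0 (pass3,da3)
<ATA WDC WD30EFRX-68E 0A82>        at scbus0 target 10 lun 0 (pass4,da4)
<ATA WDC WD30EFRX-68E 0A82>        at scbus0 target 11 lun 0 (pass5,da5)
<ATA WDC WD30EFRX-68E 0A82>        at scbus0 target 12 lun 0 (pass6,da6)
<ATA WDC WD30EFRX-68E 0A82>        at scbus0 target 13 lun 0 (pass7,da7)
<ATA ST33000651AS CC44>            at scbus0 target 14 lun 0 (pass8,da8)
<ATA ST33000651AS CC44>            at scbus0 target 15 lun 0 (pass9,da9)
<ATA ST33000651AS CC44>            at scbus0 target 16 lun 0 (pass10,da10)
<ATA ST33000651AS CC44>            at scbus0 target 17 lun 0 (pass11,da11)
"""

LINE = re.compile(r"^<(.+?)>\s+at (scbus\d+) target (\d+) lun \d+ \(pass\d+,(\w+)\)$")


def group_by_model(devlist_text):
    """Map (bus, model) to a list of (device, target) pairs."""
    groups = defaultdict(list)
    for line in devlist_text.splitlines():
        m = LINE.match(line.strip())
        if m:
            model, bus, target, dev = m.groups()
            groups[(bus, model)].append((dev, int(target)))
    return groups


for (bus, model), members in group_by_model(DEVLIST).items():
    print(bus, model, members)
```

The six WD drives on the HBA come out as da2 through da7 at targets 8 through 13, which includes the da7 device (target 13) named in the SCSI sense line earlier in this post.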


Bottom line: any postulations as to what is going on, or suggestions of what to try first? This enclosure was in service for years attached to a RAID HBA on Linux and worked fine, but then again it appears to be working as it is now. The most interesting detail in my mind is that only the WD RED 3TB disks are having the SCSI sense error, even though the Barracuda XT disks are in the same enclosure! I am not against buying a new enclosure since this one was already used, but I need to exhaust other troubleshooting options before spending that kind of money.

These are the steps that I can think of to start with:
  • Replace SFF-8087 cable
  • Flash controller to p20 IT firmware
  • Reseat drives
  • Remove drives that might be problematic and resilver, then see if the issues go away
Finally, related to the last suggestion (which I got from https://forums.freenas.org/index.php?threads/errors-in-kernel-log.14136/#post-67418): does anyone know if the tool he mentions is in the GUI now, or how I can get it or get to it?
 

Arnie

Cadet
Joined
Oct 15, 2015
Messages
1
What size (wattage) power supply do you have in the enclosure? Have you tested it to see if it is outputting the correct voltages? I see in the specs for the enclosure that it comes with a 350W PSU. This should be beefy enough, as long as it is operating to spec. You did mention that the enclosure is a few years old, so maybe the capacitors are on the way out.

I've had issues in the past where drives have dropped off the SATA bus during write operations due to the power supply not supplying enough power. HDDs draw more power during write ops, so if the power supply is nearing its wattage limit, the voltages will fluctuate when the disks write.

Try a higher wattage power supply.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
350 W with 16 drives is asking for problems. And a vdev of 16 drives is bad too, usually the maximum we recommend is 11-12 drives per vdev ;)
 
Joined
Oct 2, 2014
Messages
925
350 W with 16 drives is asking for problems. And a vdev of 16 drives is bad too, usually the maximum we recommend is 11-12 drives per vdev ;)
This ^
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
More power, fewer drives in a vdev and use the correct firmware on your controller.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Also, if you think a 16 disk RAIDZ2 is going to be fit to run virtual machines, you are in for a big surprise.

And doing a -H for smartctl to get disk status is NOT a good way to identify health. Read up on what tells a drive to report PASS or FAIL and you'll understand why. If you look around the forums we never ask for -H. ;)
 

zeroluck

Dabbler
Joined
Feb 12, 2015
Messages
43
350 W with 16 drives is asking for problems. And a vdev of 16 drives is bad too, usually the maximum we recommend is 11-12 drives per vdev ;)

Only 12 drives are in the enclosure, but I am considering upgrading the power supply anyway.

More power, fewer drives in a vdev and use the correct firmware on your controller.

So I should be running p20 on my controller? I saw some people on the forums saying it wasn't ready or something to that effect.

Also, if you think a 16 disk RAIDZ2 is going to be fit to run virtual machines, you are in for a big surprise.

And doing a -H for smartctl to get disk status is NOT a good way to identify health. Read up on what tells a drive to report PASS or FAIL and you'll understand why. If you look around the forums we never ask for -H. ;)

I'm not running production VMs on the array per se. It is holding .vmdk files that contain backups only, where the workload should be mostly sequential writes and speed and latency are not as important. The boot disks and the rest of my environment are on EqualLogic iSCSI SANs. In regards to the smartctl command, is there more relevant information I can provide or is that information just not considered useful?
 
Joined
Oct 2, 2014
Messages
925
Flashed card to p20, that didn't help.
We didn't say that flashing the card was your only issue, nor did we state it was a fix-all. You still have a pool that is too wide (too many drives). You could try running the SMART script and see where you stand; as @cyberjock stated, -H doesn't provide the info we need. You could also look here, where someone asked how to read the SMART results: https://forums.freenas.org/index.php?threads/interpreting-smart-results.38169/#post-229736
 

zeroluck

Dabbler
Joined
Feb 12, 2015
Messages
43
I just swapped the PSU that shipped with the enclosure (which turned out to be 400 watts) out with a nice 750 watt PSU that I had in stock. It is still throwing SCSI Sense Errors on heavy write workloads.

I'm just trying things one at a time :). I did some research on the stripe width recommendations, including the forum posts, which all seem to point at the same few blog posts, and it seems to me that a wider raidz2 setup is a performance limitation:

Choose a RAID-Z stripe width based on your IOPS needs and the amount of space you are willing to devote to parity information. If you need more IOPS, use fewer disks per stripe. If you need more usable space, use more disks per stripe. Trying to optimize your RAID-Z stripe width based on exact numbers is irrelevant in nearly all cases.

Is there some reason low performance alone would cause the disks to go offline or throw that error?
 

zeroluck

Dabbler
Joined
Feb 12, 2015
Messages
43
Something I noticed about the Habey DS-1280: 2 of the drives have their own channel on the SFF multilane cable, and the other 10 are running off 2 channels according to the SAS BIOS. I researched this a little and found this review: http://www.newegg.com/Product/SingleProductReview.aspx?reviewid=3634269

SAS expander scheme is all wrong. of the 4 SATA 3.0Gb channels going into the device, 2 get wasted on one drive per channel. Then the remaining 2 SATA channels get divided between the remaining 10 drive bays. Do a little math and you get theoretical max 600Mbit/s per drive of those ten. Divide by 8bits per byte, and you're at about 75MB/sec max speed per drive on those 10 expanded drives. modern SATA2 and SATA3 drives can run at 110-150MB/sec, so you're just throwing bandwidth in the garbage.

I wonder if this could be related to the issue?
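The review's arithmetic can be reproduced in a few lines. This sketch uses the figures quoted above (3.0 Gbit/s channels, 2 shared channels, 10 drives); the second figure adds one thing the review leaves out, namely that 3 Gbit/s SATA/SAS links use 8b/10b encoding, so each payload byte costs 10 bits on the wire. It assumes all ten drives are busy at the same time.

```python
LINK_GBIT = 3.0       # per-channel line rate quoted in the review
SHARED_CHANNELS = 2   # channels feeding the ten expanded bays
SHARED_DRIVES = 10    # drives sharing those channels

# Line rate available to each drive when all ten are active at once.
per_drive_mbit = LINK_GBIT * 1000 * SHARED_CHANNELS / SHARED_DRIVES

# The review divides by 8 bits per byte.
naive_mb_s = per_drive_mbit / 8

# With 8b/10b encoding, 10 line bits carry one payload byte.
payload_mb_s = per_drive_mbit / 10

print(f"{per_drive_mbit:.0f} Mbit/s per drive")
print(f"{naive_mb_s:.0f} MB/s (review's figure)")
print(f"{payload_mb_s:.0f} MB/s (after 8b/10b encoding)")
```

Either way this is a throughput ceiling for the shared bays; on its own it would be expected to show up as slow transfers, so it is context rather than a confirmed cause of the resets.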
 

zeroluck

Dabbler
Joined
Feb 12, 2015
Messages
43
I just deleted my zpool and created a new one, this time with two striped 8 disk raidz2's:

Code:
[root@storage3] ~# zpool status Storage3-A
  pool: Storage3-A
state: ONLINE
  scan: none requested
config:

    NAME                                            STATE     READ WRITE CKSUM
    Storage3-A                                      ONLINE       0     0     0
     raidz2-0                                      ONLINE       0     0     0
       gptid/cf19a70c-7825-11e5-a441-001517fd0d2a  ONLINE       0     0     0
       gptid/d156342b-7825-11e5-a441-001517fd0d2a  ONLINE       0     0     0
       gptid/d1f344e0-7825-11e5-a441-001517fd0d2a  ONLINE       0     0     0
       gptid/d308b7ba-7825-11e5-a441-001517fd0d2a  ONLINE       0     0     0
       gptid/d3ca4897-7825-11e5-a441-001517fd0d2a  ONLINE       0     0     0
       gptid/d488853d-7825-11e5-a441-001517fd0d2a  ONLINE       0     0     0
       gptid/d59a280f-7825-11e5-a441-001517fd0d2a  ONLINE       0     0     0
       gptid/d6abb3f7-7825-11e5-a441-001517fd0d2a  ONLINE       0     0     0
     raidz2-1                                      ONLINE       0     0     0
       gptid/d7cca114-7825-11e5-a441-001517fd0d2a  ONLINE       0     0     0
       gptid/d8db79ff-7825-11e5-a441-001517fd0d2a  ONLINE       0     0     0
       gptid/d9ec36e1-7825-11e5-a441-001517fd0d2a  ONLINE       0     0     0
       gptid/daf8220d-7825-11e5-a441-001517fd0d2a  ONLINE       0     0     0
       gptid/dbcdda7b-7825-11e5-a441-001517fd0d2a  ONLINE       0     0     0
       gptid/dc9645eb-7825-11e5-a441-001517fd0d2a  ONLINE       0     0     0
       gptid/dd5cbe01-7825-11e5-a441-001517fd0d2a  ONLINE       0     0     0
       gptid/de259a89-7825-11e5-a441-001517fd0d2a  ONLINE       0     0     0

errors: No known data errors
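For comparison with the original single 16-wide vdev, here is a rough sketch of what the new layout costs in space. It assumes the 3TB drives listed in the first post and ignores ZFS metadata, padding and TB/TiB differences; the function name `raw_data_tb` is illustrative only.

```python
DISK_TB = 3        # drive size from the hardware list in the first post
RAIDZ2_PARITY = 2  # raidz2 spends two disks' worth of space on parity per vdev


def raw_data_tb(vdev_widths, disk_tb=DISK_TB):
    """Sum the non-parity disk capacity across a list of raidz2 vdev widths."""
    return sum((width - RAIDZ2_PARITY) * disk_tb for width in vdev_widths)


layouts = {
    "1 x 16-wide raidz2": [16],
    "2 x 8-wide raidz2": [8, 8],
}
for name, widths in layouts.items():
    print(f"{name}: {len(widths)} vdev(s), about {raw_data_tb(widths)} TB before overhead")
```

Splitting into two vdevs gives up roughly two disks of space in exchange for a second vdev to stripe across.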


I'm still getting SCSI sense errors on heavy writes:

Code:
(da2:mps0:0:8:0): WRITE(10). CDB: 2a 00 00 40 d7 48 00 00 08 00
(da2:mps0:0:8:0): CAM status: SCSI Status Error
(da2:mps0:0:8:0): SCSI status: Check Condition
(da2:mps0:0:8:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da2:mps0:0:8:0): Retrying command (per sense data)
(da3:mps0:0:9:0): WRITE(10). CDB: 2a 00 00 40 d7 48 00 00 08 00
(da3:mps0:0:9:0): CAM status: SCSI Status Error
(da3:mps0:0:9:0): SCSI status: Check Condition
(da3:mps0:0:9:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da3:mps0:0:9:0): Retrying command (per sense data)
    (da7:mps0:0:13:0): WRITE(10). CDB: 2a 00 00 43 02 20 00 01 00 00 length 131072 SMID 904 terminated ioc 804b scsi 0 state c xfer 0
(da7:mps0:0:13:0): WRITE(10). CDB: 2a 00 00 43 02 20 00 01 00 00
(da7:mps0:0:13:0): CAM status: SCSI Status Error
(da7:mps0:0:13:0): SCSI status: Check Condition
(da7:mps0:0:13:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da7:mps0:0:13:0): Retrying command (per sense data)
    (da2:mps0:0:8:0): WRITE(10). CDB: 2a 00 00 48 ee 10 00 00 e0 00 length 114688 SMID 244 terminated ioc 804b scsi 0 state c xfer 0
(da2:mps0:0:8:0): WRITE(10). CDB: 2a 00 00 48 ee 10 00 00 e0 00
(da2:mps0:0:8:0): CAM status: SCSI Status Error
(da2:mps0:0:8:0): SCSI status: Check Condition
(da2:mps0:0:8:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da2:mps0:0:8:0): Retrying command (per sense data)
    (da7:mps0:0:13:0): WRITE(10). CDB: 2a 00 00 4e 0c f0 00 01 00 00 length 131072 SMID 692 terminated ioc 804b scsi 0 state c xfer 0
(da7:mps0:0:13:0): WRITE(10). CDB: 2a 00 00 4e 0c f0 00 01 00 00
(da7:mps0:0:13:0): CAM status: SCSI Status Error
(da7:mps0:0:13:0): SCSI status: Check Condition
(da7:mps0:0:13:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da7:mps0:0:13:0): Retrying command (per sense data)
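A quick tally of the reset messages per device helps when comparing runs after each change. This is a minimal sketch over the sense lines from the excerpt above; the pattern and the name `count_resets` are assumptions of mine and only match the mps0 line format shown here.

```python
import re
from collections import Counter

# The "SCSI sense" lines from the log excerpt above.
LOG = """\
(da2:mps0:0:8:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da3:mps0:0:9:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da7:mps0:0:13:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da2:mps0:0:8:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da7:mps0:0:13:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
"""

PATTERN = re.compile(
    r"^\((\w+):mps\d+:\d+:\d+:\d+\): SCSI sense: UNIT ATTENTION asc:29,0"
)


def count_resets(log_text):
    """Count UNIT ATTENTION asc:29,0 messages per device name."""
    counts = Counter()
    for line in log_text.splitlines():
        m = PATTERN.match(line.strip())
        if m:
            counts[m.group(1)] += 1
    return counts


for dev, n in sorted(count_resets(LOG).items()):
    print(dev, n)
```

Running the same tally over the full log before and after each hardware change would show whether the set of affected devices shifts.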
 

zeroluck

Dabbler
Joined
Feb 12, 2015
Messages
43
I deleted that zpool and created two separate ones; there are now 2 raidz2 arrays, each shared to VMware completely separately. The second one contains all the disks that have been throwing the SCSI sense errors and also happens to be the part of the array with most of the port multiplier ports.

Current setup:

Code:
zpool status
  pool: Storage3-A
state: ONLINE
  scan: none requested
config:

    NAME                                            STATE     READ WRITE CKSUM
    Storage3-A                                      ONLINE       0     0     0
     raidz2-0                                      ONLINE       0     0     0
       gptid/0286921c-782f-11e5-99d7-001517fd0d2a  ONLINE       0     0     0
       gptid/03933b59-782f-11e5-99d7-001517fd0d2a  ONLINE       0     0     0
       gptid/042ee987-782f-11e5-99d7-001517fd0d2a  ONLINE       0     0     0
       gptid/053e3271-782f-11e5-99d7-001517fd0d2a  ONLINE       0     0     0
       gptid/06006866-782f-11e5-99d7-001517fd0d2a  ONLINE       0     0     0
       gptid/06dc9921-782f-11e5-99d7-001517fd0d2a  ONLINE       0     0     0
       gptid/07f0f5a3-782f-11e5-99d7-001517fd0d2a  ONLINE       0     0     0
       gptid/09070e78-782f-11e5-99d7-001517fd0d2a  ONLINE       0     0     0

errors: No known data errors

  pool: Storage3-B
state: ONLINE
  scan: none requested
config:

    NAME                                            STATE     READ WRITE CKSUM
    Storage3-B                                      ONLINE       0     0     0
     raidz2-0                                      ONLINE       0     0     0
       gptid/6456c2d6-782f-11e5-99d7-001517fd0d2a  ONLINE       0     0     0
       gptid/65198e60-782f-11e5-99d7-001517fd0d2a  ONLINE       0     0     0
       gptid/65e006e1-782f-11e5-99d7-001517fd0d2a  ONLINE       0     0     0
       gptid/66a4fcec-782f-11e5-99d7-001517fd0d2a  ONLINE       0     0     0
       gptid/6774ce2a-782f-11e5-99d7-001517fd0d2a  ONLINE       0     0     0
       gptid/68420ac7-782f-11e5-99d7-001517fd0d2a  ONLINE       0     0     0
       gptid/6914f34b-782f-11e5-99d7-001517fd0d2a  ONLINE       0     0     0
       gptid/69dd79e2-782f-11e5-99d7-001517fd0d2a  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
state: ONLINE
  scan: resilvered 1.92G in 0h57m with 0 errors on Sun Oct 11 18:22:31 2015
config:

    NAME                                            STATE     READ WRITE CKSUM
    freenas-boot                                    ONLINE       0     0     0
     mirror-0                                      ONLINE       0     0     0
       gptid/691f8ba7-705d-11e5-94aa-001517fd0d2a  ONLINE       0     0     0
       gptid/eb671443-7066-11e5-9764-001517fd0d2a  ONLINE       0     0     0

errors: No known data errors


I forgot to mention earlier that I am using file-based extents. I guess I could try zvol device-based ones instead to see if that makes a difference, but from everything I've read it probably shouldn't matter.

Edit: I tried that and FreeNAS errored out when creating an extent as a Block device:

Code:
Request Method: POST
Request URL: http://storage3.brewerscience.com/admin/services/iscsitargetextent/add/
Software Version: FreeNAS-9.3-STABLE-201509282017
Exception Type: AttributeError
Exception Value:
'NoneType' object has no attribute 'message'
Exception Location: /usr/local/www/freenasUI/../freenasUI/freeadmin/middleware.py in process_response, line 206
Server time: Wed, 21 Oct 2015 15:57:32 -0500
Traceback

Request information
GET
No GET data

POST
Variable Value
iscsi_target_extent_avail_threshold u''
iscsi_target_extent_type u'Disk'
__all__ u''
iscsi_target_extent_filesize u'0'
iscsi_target_extent_insecure_tpc u'on'
iscsi_target_extent_comment u''
iscsi_target_extent_name u'Storage3-B'
iscsi_target_extent_rpm u'SSD'
iscsi_target_extent_disk u'zvol/Storage3-B/S3B-zvol'
iscsi_target_extent_blocksize u'512'
__form_id u'dialogForm_Extent'
iscsi_target_extent_serial u'001517fd0d2a01'
iscsi_target_extent_path u''
FILES
No FILES data

COOKIES
Variable Value
hubspotutk '77c40e9c9901d0783875dd54e882af5d'
hsfirstvisit 'http%3A%2F%2Fpublic-dev.brewerscience.com%2F||1395341145170'
__hstc '186563766.77c40e9c9901d0783875dd54e882af5d.1395341145172.1442414001224.1445370631493.80'
engine_ssl_ 'enabled'
PASSWORD '4b764453552f51466a334a595a63653075424c4944413d3d'
SCREEN_NAME '582f4241693758575464493d'
LOGIN '746d6165727a'
REMEMBER_ME 'true'
__hssrc '1'
sessionid '5rl4hbw1m7y42b8zvfqfak90ddd1akt0'
csrftoken 'TP2HbCUv58ONmfkV8ydVDikVjJGBXAby'
fntreeSaveStateCookie 'root%2Croot%2F1%2Croot%2F1%2F5%2Croot%2F56%2Croot%2F56%2F57%2Croot%2F58%2Croot%2F58%2F59%2Croot%2F127%2Croot%2F83%2Croot%2F132%2Croot%2F58%2F59%2F60%2Croot%2F135%2Croot%2F137'
__utma '170675443.1238331688.1395341161.1406642296.1407183371.26'
_ga 'GA1.2.1238331688.1395341161'
META
Variable Value
HTTPS ''
SERVER_SOFTWARE 'nginx/1.6.2'
REQUEST_URI '/admin/services/iscsitargetextent/add/'
QUERY_STRING ''
GATEWAY_INTERFACE 'CGI/1.1'
CONTENT_TYPE 'application/x-www-form-urlencoded'
wsgi.version (1, 0)
HTTP_COOKIE 'hsfirstvisit=http%3A%2F%2Fpublic-dev.brewerscience.com%2F||1395341145170; __utma=170675443.1238331688.1395341161.1406642296.1407183371.26; LOGIN=746d6165727a; PASSWORD=4b764453552f51466a334a595a63653075424c4944413d3d; REMEMBER_ME=true; SCREEN_NAME=582f4241693758575464493d; engine_ssl_=enabled; _ga=GA1.2.1238331688.1395341161; __hstc=186563766.77c40e9c9901d0783875dd54e882af5d.1395341145172.1442414001224.1445370631493.80; __hssrc=1; hubspotutk=77c40e9c9901d0783875dd54e882af5d; sessionid=5rl4hbw1m7y42b8zvfqfak90ddd1akt0; fntreeSaveStateCookie=root%2Croot%2F1%2Croot%2F1%2F5%2Croot%2F56%2Croot%2F56%2F57%2Croot%2F58%2Croot%2F58%2F59%2Croot%2F127%2Croot%2F83%2Croot%2F132%2Croot%2F58%2F59%2F60%2Croot%2F135%2Croot%2F137; csrftoken=TP2HbCUv58ONmfkV8ydVDikVjJGBXAby'
wsgi.input <flup.server.fcgi_base.InputStream object at 0x81589e110>
DOCUMENT_ROOT '/usr/local/etc/nginx/html'
DOCUMENT_URI '/admin/services/iscsitargetextent/add/'
HTTP_REFERER 'http://storage3.brewerscience.com/'
REMOTE_ADDR '192.168.45.152'
HTTP_X_REQUESTED_WITH 'XMLHttpRequest'
SERVER_ADDR '10.0.50.45'
CSRF_COOKIE u'TP2HbCUv58ONmfkV8ydVDikVjJGBXAby'
SERVER_NAME 'localhost'
HTTP_X_CSRFTOKEN 'TP2HbCUv58ONmfkV8ydVDikVjJGBXAby'
HTTP_CONTENT_TYPE 'application/x-www-form-urlencoded'
REQUEST_METHOD 'POST'
wsgi.errors <flup.server.fcgi_base.TeeOutputStream object at 0x81589e2d0>
SCRIPT_NAME u''
REMOTE_PORT '54815'
wsgi.multithread True
HTTP_ACCEPT_LANGUAGE 'en-US,en;q=0.8'
SERVER_PROTOCOL 'HTTP/1.1'
wsgi.url_scheme 'http'
HTTP_CONTENT_LENGTH '419'
HTTP_HOST 'storage3.brewerscience.com'
PATH_INFO u'/admin/services/iscsitargetextent/add/'
SERVER_PORT '80'
CONTENT_LENGTH '419'
HTTP_USER_AGENT 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.71 Safari/537.36'
wsgi.run_once False
wsgi.multiprocess False
HTTP_ORIGIN 'http://storage3.brewerscience.com'
HTTP_CONNECTION 'keep-alive'
HTTP_ACCEPT_ENCODING 'gzip, deflate'
REDIRECT_STATUS '200'
HTTP_ACCEPT '*/*'
 

sfcredfox

Patron
Joined
Aug 26, 2014
Messages
340
I can't speak to the write errors, but I think you should test the performance of that datastore.

I think a zvol is the recommended approach, going by some posts I've seen, and it's what I'm using. You should also test your zpool's performance under VM workloads. I used to run four 5-disk RAIDZ1 vdevs; I now use mirrors, as recommended by nearly everyone doing virtual hosting. That RAIDZ2 will not perform well under heavy mixed reads and writes. I know it sucks to give up the space; it just depends on what you need.
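To put rough numbers on the space trade-off, here is a minimal sketch, assuming eight 3 TB disks (the WD Red count from the first post) and ignoring ZFS metadata, padding, and free-space headroom. The function name `usable_tb` and the layout labels are made up for illustration. The vdev count is shown because random I/O tends to scale with the number of vdevs, which is the usual rule of thumb behind the "use mirrors" advice.

```python
def usable_tb(disks, disk_tb, layout):
    """Rough usable capacity in TB and vdev count for a simple layout.

    Illustrative only: ignores ZFS metadata, padding, and the free
    space a pool should keep. Layout names are hypothetical labels.
    """
    if layout == "raidz2":
        # One vdev; two disks' worth of capacity goes to parity.
        return (disks - 2) * disk_tb, 1
    if layout == "raidz1":
        # One vdev; one disk's worth of capacity goes to parity.
        return (disks - 1) * disk_tb, 1
    if layout == "mirror":
        # Striped two-way mirrors; each pair is its own vdev.
        pairs = disks // 2
        return pairs * disk_tb, pairs
    raise ValueError("unknown layout: %s" % layout)


if __name__ == "__main__":
    for layout in ("raidz2", "mirror"):
        capacity, vdevs = usable_tb(8, 3, layout)
        print("%-7s %2d TB usable across %d vdev(s)" % (layout, capacity, vdevs))
```

Under these assumptions the 8-disk RAIDZ2 gives about 18 TB in a single vdev, while four mirrored pairs give about 12 TB across four vdevs.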
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Why are you still trying to use RAIDZ2? Did you read around on the forums? I already said that RAIDZ2 isn't going to work...
 

zeroluck

Dabbler
Joined
Feb 12, 2015
Messages
43
Why are you still trying to use RAIDZ2? Did you read around on the forums? I already said that RAIDZ2 isn't going to work...

What is wrong with an 8 disk RAIDZ2?

I'm not here to talk about performance, but since everyone is stuck trolling on that, I set up a striped mirrored vdev and I'm still getting the same disk issues. I also got the zvol to work but it still has the same errors.

I've said this before but this array is for storing backups and needs very little performance. Just because there are some VM disks on it doesn't mean it has to be super fast. No VMs are ever going to boot off of it. A bunch of data is going to get written to it and then the daily workload is going to be delta changes that are small. For my workload, a RAIDZ2 should be fine. My critical VMs are all on a group of enterprise SANs.

Focusing on the actual issue, I am almost certain this is related to the SAS expander which is not supposed to be used with SATA drives on ZFS. I've specced out a better enclosure. This one doesn't use a SAS expander and it gives a dedicated connection to each disk:

SC936A-R1200B : http://www.amazon.com/dp/B004C25UFW/?tag=ozlp-20
LSI SAS 9201-16e: http://www.amazon.com/dp/B004D8PJXS/?tag=ozlp-20
SFF-8088 Cables 1M x4 http://www.monoprice.com/product?p_id=8185
2ft SFF-8087 Cables >2ft x4 http://www.newegg.com/Product/Produ...3034&cm_re=C-SFF8087-D-_-16-133-034-_-Product
8087-8088 adapter x2 http://www.newegg.com/Product/Produ...3056&cm_re=C-8087-8088-_-16-133-056-_-Product
 

sfcredfox

Patron
Joined
Aug 26, 2014
Messages
340
...I am almost certain this is related to the SAS expander which is not supposed to be used with SATA drives on ZFS.
Is this the only enclosure you use with your system and had problems with? (Habey DS-1280 SAS Enclosure 12 bay)

I'm using an HP MSA70 and D2700 and they both supported not only using SATA, but both at the same time (though it drops your speed down to SATA1 on an old MSA70). Also, I'm using SuperMicro's backplane which is an expander and it worked fine when I have both drive types in there.
 

zeroluck

Dabbler
Joined
Feb 12, 2015
Messages
43
Yes, this is the only SAS expander enclosure I have. I do know the problems happen only with disks on the expander ports of the enclosure (some of the ports are not on the SAS expander). I am assuming that the cheapest 12 port direct attach controller available in the world probably also has a cheap SAS expander, so maybe that is the problem. I am certainly not trying to preach that SAS expanders can't be used with SATA drives for everyone else; I've just eliminated most other variables in this build at this point.

Where did I get the idea that SAS expanders aren't great? Some blog. Is it credible? I don't know but you can read it here: http://garrett.damore.org/2010/08/why-sas-sata-is-not-such-great-idea.html

I've also had a good read through this thread: https://forums.freenas.org/index.php?threads/sas-expanders.36394/... This guy (quillo) is having errors similar to mine: http://hardforum.com/showthread.php?t=1548145
 

zeroluck

Dabbler
Joined
Feb 12, 2015
Messages
43
Update: The Supermicro JBOD build is done and installed. The errors are gone under heavy sequential and/or random writes, scrubs, and benchmarks, all while running on a 16-drive RAIDZ2. I'm probably not going to leave it that way, but this does verify that a wide RAIDZ2 wasn't the cause of the problems I was having.

I don't know if any SAS expander would have done this but the Habey DS-1280 is not a great one to experiment with if you are wanting to go down that road.

Note: The SuperMicro CSE-PTJBOD-CS3 JBOD chassis controller w/IPMI is awesome.
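For anyone repeating this kind of before/after comparison, a small script can list which devices show non-zero READ/WRITE/CKSUM counters in `zpool status` output, which makes it easier to see whether errors cluster on particular ports. This is only a sketch: the sample text, the pool name `tank`, and the device names are invented for illustration and are not output from this system, and the function name `devices_with_errors` is hypothetical.

```python
# Hypothetical sample in the usual `zpool status` layout; not from this thread.
SAMPLE = """\
  pool: tank
 state: DEGRADED
config:

	NAME        STATE     READ WRITE CKSUM
	tank        DEGRADED     0     0     0
	  raidz2-0  DEGRADED     0     0     0
	    da0     ONLINE       0     0     0
	    da1     DEGRADED     0    12     0
	    da2     ONLINE       3     0     1

errors: No known data errors
"""


def devices_with_errors(status_text):
    """Return {name: (read, write, cksum)} for rows with any non-zero counter.

    Counters are kept as strings because zpool may abbreviate large
    values (for example with a K suffix).
    """
    flagged = {}
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:5] == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]:
            in_config = True  # the device table starts after this header
            continue
        if not in_config:
            continue
        if not fields:
            break  # a blank line ends the device table
        if len(fields) < 5:
            continue  # section labels such as "logs" or "spares"
        name, _state, read, write, cksum = fields[:5]
        if (read, write, cksum) != ("0", "0", "0"):
            flagged[name] = (read, write, cksum)
    return flagged


if __name__ == "__main__":
    for name, counters in devices_with_errors(SAMPLE).items():
        print(name, "read/write/cksum =", "/".join(counters))
```

On a live system the text could come from running `zpool status` and passing its output to the function. Mapping the flagged device names back to physical bays is a separate step.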
 