Worried about chance of failure on recent upgrade


Teva

Dabbler
Joined
May 16, 2014
Messages
10
I recently went from 16GB of non-ECC RAM to 8GB of ECC (and swapped the motherboard at the same time). I'm *slightly* concerned about only having 8GB now, but in the few days the server has been running I haven't noticed any problems. I'm not running any jailed apps, just CIFS and iSCSI mounts for 3 clients.
My current hard drive configuration is 8 x 3TB drives in a volume consisting of 2 RaidZ2 sets (4 drives each).

My initial plan was to buy 1 stick of ECC RAM, verify the board supported it, and then buy a 2nd identical stick, but Newegg went out of stock in the week it took the first stick to get here. My new plan is to buy a 16GB kit and go up to 24GB of RAM, but I want to know whether this is urgent or can wait a few months.

Are there any horror stories of ZFS failure/corruption on 8GB of RAM with a 24TB (raw) pool? I know the usual warnings about 4GB systems losing data, but is that something to worry about at 8GB?

I would also like to know if anyone has had *ANY* problems with a setup similar to the hardware in my system:

Code:
Mem: 185M Active, 128M Inact, 6147M Wired, 183M Buf, 1418M Free
ARC: 5611M Total, 1340M MFU, 4233M MRU, 784K Anon, 28M Header, 9978K Other
Swap: 16G Total, 16G Free
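For reference, the Mem/ARC lines above are from top; the raw counters can also be read directly (sysctl names as they appear on FreeBSD 9.x, so adjust if your release differs):

Code:
# current ARC size and the ceiling it is allowed to grow to, in bytes
sysctl kstat.zfs.misc.arcstats.size
sysctl kstat.zfs.misc.arcstats.c_max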

The hardware used:
FreeNAS 9.2.1.5 64bit (upgraded from 9.1 something)
Asus Sabertooth 990FX (rev1)
ECC enabled in the BIOS
Using 4 of the 8 onboard SATA ports; 2 are useless because of pseudo-RAID, and I wanted to save the other 2 motherboard ports for future SSDs
Almost everything onboard is disabled: NIC/sound/RAID ports
AMD Phenom II X6 1045T
1 x 8GB stick of Kingston unbuffered ECC RAM (KVR16LE11/8I)
2 x Syba SD-SA2PEX-2IR (flashed to IT mode; these hold the rest of the HDs)
1 x LSI 1064e-chipset card (flashed to IT mode; it only supports drives of 2TB or less, and I have some 2TB drives I was thinking of using as a backup device)
Server-class PCI-e Intel 82571EB dual-port 1Gb NIC, though only 1 of the ports is in use

PCICONF:
Code:
siis0@pci0:1:0:0:       class=0x010600 card=0x31321095 chip=0x31321095 rev=0x01 hdr=0x00
    vendor     = 'Silicon Image, Inc.'
    device     = 'SiI 3132 Serial ATA Raid II Controller'
    class      = mass storage
    subclass   = SATA
mpt0@pci0:2:0:0:        class=0x010000 card=0x30901000 chip=0x00561000 rev=0x08 hdr=0x00
    vendor     = 'LSI Logic / Symbios Logic'
    device     = 'SAS1064ET PCI-Express Fusion-MPT SAS'
    class      = mass storage
    subclass   = SCSI
em0@pci0:3:0:0: class=0x020000 card=0x7044103c chip=0x105e8086 rev=0x06 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82571EB Gigabit Ethernet Controller'
    class      = network
    subclass   = ethernet
em1@pci0:3:0:1: class=0x020000 card=0x7044103c chip=0x105e8086 rev=0x06 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82571EB Gigabit Ethernet Controller'
    class      = network
    subclass   = ethernet
siis1@pci0:5:0:0:       class=0x010600 card=0x31321095 chip=0x31321095 rev=0x01 hdr=0x00
    vendor     = 'Silicon Image, Inc.'
    device     = 'SiI 3132 Serial ATA Raid II Controller'
    class      = mass storage
    subclass   = SATA


Side question: is dmidecode --type 16 the best way to verify ECC is working? It returns "Error Correction Type: Multi-bit ECC". Memtest has been less than useful for displaying ECC info.
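For anyone checking the same thing, these are the two queries I know of (SMBIOS type numbers per the dmidecode man page; note they only show what the BIOS advertises, not whether correction actually happens):

Code:
# type 16 = Physical Memory Array; look for "Error Correction Type"
dmidecode --type 16
# type 17 = Memory Device; an ECC DIMM reports Total Width 72 bits vs Data Width 64 bits
dmidecode --type 17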

The server is on a UPS, a Back-UPS NS 1250 (the newer black USB/LCD model). It is configured to gracefully shut down the server at 37% battery.
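If it's useful, the charge level can be spot-checked from the shell, assuming the FreeNAS UPS service (NUT) is running and the unit was registered under the name "ups" (use whatever identifier you configured):

Code:
# battery percentage and overall status as reported by the UPS driver
upsc ups@localhost battery.charge
upsc ups@localhost ups.status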

The HDs are arranged so that 2 of the drives in each RAIDZ2 set connect to the motherboard and the other two are split between the 2 Syba cards.
For example, raidz2-0 is 2 x motherboard, 1 x Syba card 1 (port 1), 1 x Syba card 2 (port 1); likewise for raidz2-1, but on port 2 of each card.
I hope this ensures the ZFS data lives on through a complete failure of any single SATA controller: a dead Syba card costs each vdev one disk, and even a dead motherboard controller costs each vdev only two, which RAIDZ2 can absorb.
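For anyone who wants to verify a similar spread on their own pool, this is how the gptid labels can be mapped back to controllers with stock FreeBSD tools:

Code:
# map gptid/... labels to device nodes (ada0p2, da1p2, ...)
glabel status
# list each device along with the bus/controller it hangs off
camcontrol devlist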

ZFS Layout, paranoia in effect:
Code:
        NAME                                                STATE     READ WRITE CKSUM
        %hidden%                                            ONLINE       0     0     0
          raidz2-0                                          ONLINE       0     0     0
            gptid/RandomRandomRandomRandomRandomRandom.eli  ONLINE       0     0     0
            gptid/RandomRandomRandomRandomRandomRandom.eli  ONLINE       0     0     0
            gptid/RandomRandomRandomRandomRandomRandom.eli  ONLINE       0     0     0
            gptid/RandomRandomRandomRandomRandomRandom.eli  ONLINE       0     0     0
          raidz2-1                                          ONLINE       0     0     0
            gptid/RandomRandomRandomRandomRandomRandom.eli  ONLINE       0     0     0
            gptid/RandomRandomRandomRandomRandomRandom.eli  ONLINE       0     0     0
            gptid/RandomRandomRandomRandomRandomRandom.eli  ONLINE       0     0     0
            gptid/RandomRandomRandomRandomRandomRandom.eli  ONLINE       0     0     0
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
dmidecode only works for Intel, and only for certain chipsets. There is no be-all and end-all test for confirming that ECC works on your board; you're left trusting that when the manufacturer says the board supports ECC, the function is actually implemented.

The corruption risk comes from not having enough RAM to keep the system stable. 8GB seems to be the minimum regardless of pool size. If you want to run 8GB of RAM with a 24TB pool you are welcome to. I did that, and one day you'll wonder why you can't even get 1MB/sec from your server. It's the kind of thing that runs fine today, and tomorrow you're wondering why the server can't function. If you go to 16GB of RAM you'll probably be fine with 24TB, though... even long term.
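If you want to see the slowdown coming rather than finding out from your clients, watch the pool for a while under normal load (substitute your own pool name):

Code:
# per-vdev bandwidth and IOPS, sampled every 5 seconds
zpool iostat -v tank 5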
 

master-richie

Dabbler
Joined
Jan 1, 2015
Messages
41
Sorry for the necro post, but I'm wondering how your SD-SA2PEX-2IR cards worked out for you. I'm new to FreeNAS and picked up an HP MicroServer to install it on, but I need to expand my SATA ports to run mSATA drives for my ZIL. This card looks perfect for my needs, but I can't find much info about how well it works with FreeNAS out of the box. Thanks!
 

Teva

Dabbler
Joined
May 16, 2014
Messages
10
The SD-SA2PEX-2IR works perfectly. The flashing process (to IT mode) was a bit annoying, as I had to plug the cards into a Windows PC because the only BIOS update program I found was Windows-only. I wouldn't use them for a ZIL or any SSD, though, as they are only SATA II. Instead, plug some of your existing HDs into the cards and put the SSD on the motherboard controller.

I'm still running FreeNAS-9.2.1.5-RELEASE-x64 (80c1d35) because my backup solution needs improvement and I'm afraid of losing data. I do have backups, but they consist of copying the data to hard drives attached to my Windows 7 PC every month. My plan was to build another FreeNAS server, place it offsite, and sync to it, but I haven't gotten around to it (roughly sketched below). The server has been online for 141 days since the last boot; before that it was shut down due to a power outage. Performance is very good. The load is mostly iSCSI mounts to an ESX host and to the workstation I play Steam games from, plus a few large Samba shares for family access (photos/movies).
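The offsite plan, when I get to it, is plain snapshot replication, roughly like this (pool, dataset, and host names here are placeholders):

Code:
# one-time full copy of a snapshot to the offsite box
zfs snapshot tank/data@weekly-1
zfs send tank/data@weekly-1 | ssh offsite-nas zfs recv -F backup/data
# later runs only ship the delta between snapshots
zfs snapshot tank/data@weekly-2
zfs send -i tank/data@weekly-1 tank/data@weekly-2 | ssh offsite-nas zfs recv backup/data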

No complaints. My layout: 17.7TB allocated, 4.05TB free.
Volume - 8 x 3TB Seagate disks
raidz2-0
hd0 - motherboard
hd1 - motherboard
hd2 - SD-SA2PEX-2IR #1
hd3 - SD-SA2PEX-2IR #2
raidz2-1
hd4 - motherboard
hd5 - motherboard
hd6 - SD-SA2PEX-2IR #1
hd7 - SD-SA2PEX-2IR #2
 

master-richie

Dabbler
Joined
Jan 1, 2015
Messages
41
When you say you wouldn't use them for an SSD, is it because of the bandwidth bottleneck going from SATA III to II, or actual performance problems with the ZIL? The PCI-e x16 slot on my HP is destined for a dual- or quad-port Gb NIC, so that only leaves the PCI-e x1 slot, which rules out any SATA III add-on cards.
 

Teva

Dabbler
Joined
May 16, 2014
Messages
10
Yup, just because of the bandwidth. I'm not that familiar with the HP server, but if it has SATA III ports on board, those would serve your SSDs better; put the spinning rust on the PCI-e SATA II cards.
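The arithmetic: SATA II signals at 3Gb/s, and after 8b/10b encoding that leaves roughly 300MB/s usable, versus ~600MB/s for SATA III. A decent SATA SSD can push 500MB/s+, so it gets capped, while a 7200rpm spinner tops out around 150-200MB/s and never notices. If you want to see what a port actually delivers, FreeBSD has a naive built-in benchmark (device name is just an example):

Code:
# simple naive transfer benchmark on one disk
diskinfo -t /dev/ada0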
 

master-richie

Dabbler
Joined
Jan 1, 2015
Messages
41
It has a SAS connector for the 4-drive cage and one onboard SATA connector for the optical drive, which has been unlocked from SATA I to SATA II thanks to a modded ROM update.
 