Hello,
Not a FreeNAS user (yet), but I feel like sharing my personal experience with an IBM M1015.
I am quite new to ZFS, but I hope this saves some time for other people facing the same problems in the future.
So, context: I recently ordered an IBM M1015 from eBay (2 or 3 weeks ago) and received it quickly, no problem. The label on it says FRU: 46C8933, unlike the usual 46M0861 and 46M0831 that I read about on the web. Not sure what that means. Newer version?
I don't have a dedicated server for it (yet), so I just put it in my desktop computer for now: the processor is an Intel i7 3770, with 16GB of DDR3 RAM (so, no ECC), and the motherboard is a Gigabyte GA-Z77X-UD3H (the M1015 sits in a physical PCIe x16 slot that is actually wired with 4 lanes). The motherboard didn't see the M1015 until I updated the motherboard BIOS (not the M1015's) to the latest available version.
Good, now the M1015 is recognized. Next step: flash it to IT mode. I did it with the P20 firmware and the ROM BIOS. I hit a few issues on the way (yay UEFI...), but I finally managed to do it.
I connected six WD Red 6TB disks to it, booted my system (Linux Mint 17 using the ZFS-on-Linux package from the PPA) and created the pool (RAIDZ2). I played with it for a week, and while I didn't lose any files, I saw a bunch of weird things:
- The pool sometimes failed to mount automatically at startup.
- Writes seemed good (about 600MB/s), but reads seemed slow, and scrubbing was painfully slow (about 5MB/s).
- Scrubbing reported a lot of checksum errors and "repaired data" (plus some read and write errors, sometimes resulting in a resilver), even when I had just scrubbed.
- zpool status sometimes reported permanent errors on some files, but those errors disappeared after a partial scrub and a reboot, and the affected files (some JPG images) looked visually OK.
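For anyone who wants to watch for the same symptoms, here is a minimal Python sketch that picks out the devices with non-zero READ/WRITE/CKSUM counters from saved `zpool status` output. It assumes the usual five-column config table (NAME, STATE, READ, WRITE, CKSUM); the function name `devices_with_errors` is made up for this example and is not part of any ZFS tool.

```python
def devices_with_errors(status_text):
    """Return {name: (read, write, cksum)} for rows with any non-zero counter.

    Assumes the usual `zpool status` config table whose header row starts
    with NAME and STATE.
    """
    flagged = {}
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_table = True   # header row of the config table
            continue
        if not in_table:
            continue
        if not fields:
            in_table = False  # a blank line ends the table
            continue
        if len(fields) < 5:
            continue          # rows without counters (e.g. section headings)
        name = fields[0]
        counters = tuple(fields[2:5])
        # Counters stay as strings because zpool may abbreviate large values.
        looks_numeric = all(c[0].isdigit() for c in counters)
        if looks_numeric and any(c != "0" for c in counters):
            flagged[name] = counters
    return flagged
```

You would feed it the text captured from `zpool status` (for example via a shell redirect to a file) and look at which device names come back.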
There was clearly a problem somewhere. Memtest looked OK, but when I ran smartctl on the 6 disks, I saw that they all had a UDMA CRC error count of around 80,000 (after only one week, on brand new disks). And that number kept increasing whenever the disks were in use.
I disconnected 3 of the disks from the M1015 and connected them directly to the motherboard, then launched a scrub again and kept an eye on those CRC errors. The disks still connected to the M1015 kept accumulating them; the disks connected to the motherboard did not. I also tried the following:
- Moved the M1015 to another PCIe slot: no change
- Replaced the SFF-8087 cables (maybe 1m was too long, so I swapped them for 50cm ones): no change
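The "keep an eye on the CRC errors" check can be scripted. Below is a rough Python sketch, assuming `smartctl -A` prints the classic ten-column attribute table and that the CRC counter is attribute ID 199 (it shows up as UDMA_CRC_Error_Count on many drives, but naming can vary). The function names are my own invention for illustration.

```python
import subprocess
from typing import Optional


def read_smart_attributes(device: str) -> str:
    """Run `smartctl -A` on a device (usually needs root) and return its output.

    smartctl's exit status is a bitmask, so it is deliberately not checked here.
    """
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    return result.stdout


def crc_error_count(smart_text: str) -> Optional[int]:
    """Return the raw value of attribute 199 (CRC error count), or None."""
    for line in smart_text.splitlines():
        fields = line.split()
        # Assumed columns: ID, name, flag, value, worst, thresh, type,
        # updated, when_failed, raw_value
        if len(fields) >= 10 and fields[0] == "199" and "CRC" in fields[1]:
            try:
                return int(fields[9])
            except ValueError:
                return None
    return None


def crc_growth(before: str, after: str) -> Optional[int]:
    """Difference in CRC error count between two smartctl snapshots."""
    first, second = crc_error_count(before), crc_error_count(after)
    if first is None or second is None:
        return None
    return second - first
```

The idea is to take one snapshot per disk before a scrub and one after: a disk whose count grows is still seeing link errors, while a count that stays flat (like my disks on the motherboard ports) points away from the disk itself.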
I searched the web and ended up on this topic, where 9C1 Newbee recommended the P16 firmware. At this point, it was worth a try. So I reflashed the M1015 with P16 (and without the ROM BIOS this time). And yeah, you guessed it, it solved my problem: no more CRC errors, and my scrubs now run at 400+MB/s.
So while I can't say for sure what the source of the problem was (driver incompatibility with P20? P20 not compatible with the IBM M1015 46C8933? Did I screw up the flash procedure the first time? The presence of the ROM BIOS? I don't know), flashing P16 without the ROM BIOS fixed it for me.
Just for those who would be interested, I used this procedure to flash my card (reminder: I'm using Linux Mint 17):
- Sandisk Cruzer Extreme (32GB) flash drive
- Used GParted to create a single FAT32 partition on the flash drive (with an msdos partition table, of course)
- Used UNetbootin to install FreeDOS 1.0 on the flash drive
- Copied the following files from LSI to the root of the FAT32 partition: 2118it.bin, 2118ir.bin, mptsas2.rom and sas2flsh.exe (the one in the sas2flash_dos_rel folder)
- Copied the following file from LSI to the root of the FAT32 partition: sas2flash.efi
- Copied the following files from some tutorial to the root of the FAT32 partition: sbrempty.bin, MegaRec.exe and dos4gw.exe (I guess they are the exact same files that you can find in the other flashing tutorials for the M1015, but I have not found the official source for them)
- Retrieved the UEFI Shell from here and put that file on the FAT32 partition at the path EFI/boot/bootx64.efi (the path the UEFI firmware looks for apparently depends quite heavily on your motherboard; this one works for my Gigabyte GA-Z77X-UD3H)
Shut down the computer, insert the flash drive into a USB 2.0 port (for some reason, it seems the BIOS can't use USB 3.0 ports) and boot from the flash drive (normal boot mode). Start FreeDOS in live mode. I guess you know the commands, but just in case, here are mine (the "C:" drive letter may change depending on your configuration, I guess):
Code:
C:
megarec.exe -writesbr 0 sbrempty.bin
megarec.exe -cleanflash 0
Reboot (Ctrl + Alt + Delete) and boot from the flash drive again, but in UEFI mode this time (on this motherboard I get the famous "PAL" error when running the EXE binary after a normal boot). Again, the "fs0" path and the "sasadd" parameter may change depending on your system:
Code:
fs0:
sas2flash.efi -o -f 2118it.bin
sas2flash.efi -o -sasadd 500605bxxxxxxxxx
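A note on the -sasadd value: it is 16 hex digits with no separators. If you copied your card's address from somewhere that shows it with dashes or spaces, a tiny helper like the Python sketch below can clean it up and catch typos before you flash. The helper name and the example address are made up for illustration.

```python
import re


def normalize_sas_address(label: str) -> str:
    """Strip dashes, colons and spaces, then check for exactly 16 hex digits."""
    digits = re.sub(r"[\s:\-]", "", label).lower()
    if not re.fullmatch(r"[0-9a-f]{16}", digits):
        raise ValueError(f"expected 16 hex digits, got {digits!r}")
    return digits


# Hypothetical address, not a real card:
print(normalize_sas_address("500605B-0-1234-5678"))
```

The result is lowercased only to match the style used in the command above.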
Reboot into your normal system. Enjoy.
PS: Not a native English speaker, feel free to point out any mistakes. I will gladly correct them.
TL;DR
P20 firmware with BIOS caused problems on my system, P16 firmware without BIOS is fine.