SMART data; HBA/JBOD mode vs RAID mode w bunch of R0 arrays

jenksdrummer · Jun 27, 2019

Doing some testing with FreeNAS on my new server. I have a 3108 controller (as well as a 3008 onboard). The 3108 does JBOD mode as well as RAID mode. Tested with JBOD mode, SMART reporting works as it should; FreeNAS can see the disks exactly as they are. Placed it in RAID mode and made a bunch of R0 arrays. It's a LOT quicker. But running smartctl reports information missing. Also worth mentioning: I opted for the super-capacitor for the 3108, so its cache is non-volatile.

All that said to ask....

RAID controllers do SMART polling, among a lot of other checks, as I suspect FreeNAS does as well; at least the critical ones pertaining to what should be considered a resilient/reliable storage OS. My curiosity is: what advantage is there with FreeNAS polling SMART data over a RAID controller? To that effect, couldn't one configure a hardware RAID and use ZFS as a file system and still gain the advantages such as bitrot protection, checksumming, CoW, etc.?

Thank you

kdragon75 · Jun 27, 2019

Well, there's a lot more to it than just SMART. In the case of a bunch of RAID 0 disks, you also have the potential for unexpected performance due to two layers of caching that are not aware of each other. You also need to consider the SCSI sense codes the drives use to report various details of what's happening. Generally RAID cards hide or alter this information. If your RAID card is doing true passthrough in JBOD mode, that's perfectly acceptable. RAID 0 disks are not.
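To compare what smartctl can see in each mode, here is a minimal sketch that only builds the command line. It assumes smartmontools' `-d megaraid,N` device type, which is documented for LSI MegaRAID controllers; whether it works with a given FreeNAS build and driver is something to verify. The device node `/dev/da0` and the function name are hypothetical.

```python
from typing import List, Optional


def smartctl_argv(device: str, megaraid_id: Optional[int] = None) -> List[str]:
    """Build a smartctl command line for one disk.

    device      -- OS device node (hypothetical example: /dev/da0)
    megaraid_id -- physical drive ID behind a MegaRAID controller, or None
                   when the disk is presented directly (HBA/JBOD mode)
    """
    argv = ["smartctl", "-a"]  # -a: print all SMART information
    if megaraid_id is not None:
        if megaraid_id < 0:
            raise ValueError("megaraid_id must be non-negative")
        # Ask smartctl to address physical disk N behind the controller.
        argv += ["-d", f"megaraid,{megaraid_id}"]
    argv.append(device)
    return argv


jbod_cmd = smartctl_argv("/dev/da0")
raid_cmd = smartctl_argv("/dev/da0", megaraid_id=3)
```

The resulting list can be handed to `subprocess.run(..., capture_output=True, text=True)` on a machine that has smartctl installed.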

kdragon75 · Jun 27, 2019

Always feel free to do what you like. Just don't complain if you have issues on non-recommended or unsupported setups.

kdragon75 · Jun 27, 2019

Side note: most RAID cards won't report intermittent drive errors by default, but wait for a disk to fully die before the bells and whistles go off. This can be an issue if you have two disks about to fail at the same time.
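As a rough illustration of catching a disk before it fully dies, the sketch below scans a SMART report for non-zero raw values on three attributes commonly watched on ATA disks. The dictionary layout follows what smartctl's JSON output (`-j`) looks like for ATA drives as far as I know; treat the field names as an assumption to check against your smartmontools version. SAS drives report differently. The sample data is hand-written, not output from a real disk.

```python
from typing import Dict, List

# ATA attribute IDs often treated as early signs of trouble.
WATCHED_IDS = {
    5: "Reallocated_Sector_Ct",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}


def early_warnings(report: Dict) -> List[str]:
    """Return one message per watched attribute whose raw value is non-zero."""
    table = report.get("ata_smart_attributes", {}).get("table", [])
    warnings = []
    for attr in table:
        raw_value = attr.get("raw", {}).get("value", 0)
        if attr.get("id") in WATCHED_IDS and raw_value > 0:
            warnings.append(f"{WATCHED_IDS[attr['id']]} = {raw_value}")
    return warnings


# Hand-written sample report (assumed layout, made-up values).
sample = {
    "ata_smart_attributes": {
        "table": [
            {"id": 5, "name": "Reallocated_Sector_Ct", "raw": {"value": 0}},
            {"id": 9, "name": "Power_On_Hours", "raw": {"value": 12000}},
            {"id": 197, "name": "Current_Pending_Sector", "raw": {"value": 8}},
        ]
    }
}

found = early_warnings(sample)
```

A disk can still pass the overall health check while these counters climb, which is the case a controller that only alarms on a dead disk would miss.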

jenksdrummer · Jun 27, 2019

Thanks for the replies. Mainly what I'm thinking is that the RAID card would (possibly; ideally) have a better view of the status of the drive; though, by using single-disk R0 arrays presented to FreeNAS, it can still make for a Z2 array.

Referring to the cache, I'm sure that's a big part of the performance I'm seeing with testing. It's roughly 5x faster, because FreeNAS says "commit this data" and the controller reports back that it's done as soon as it hits the controller's cache; then the controller writes that data out to the individual R0 arrays. In my case, that controller cache is backed with a super-cap, so in the event of a power loss, that data should be just as intact as it would have been with a SLOG; in essence, that controller cache is a RAM-based SLOG that FreeNAS doesn't control or know about. It just sees some REALLY fast disks....

Supported vs unsupported; in either case, I'll be replicating my more-critical data.

kdragon75 · Jun 28, 2019

FreeNAS and ZFS would be better suited to act on the SMART data and SCSI data than the RAID card, as they are more intelligent and know what's actually happening with your data... Good luck.

kdragon75 · Jun 28, 2019

Now that I have gotten some sleep, I remember I have also had clients with ESXi hosts using local RAID-based storage that lost data because the RAID card reported healthy with multiple failing disks. One died and the other had a number of uncorrectable errors. Since then we have installed vendor-provided SMI-S providers and implemented collection tools. In the last month we have proactively replaced 5 disks across a number of clients. If this was ZFS, it could all have been avoided by not using RAID.

Arwen · Jul 1, 2019

jenksdrummer said:
...

...
My curiosity is: what advantage is there with FreeNAS polling SMART data over a RAID controller? To that effect, couldn't one configure a hardware RAID and use ZFS as a file system and still gain the advantages such as bitrot protection, checksumming, CoW, etc.?

Thank you

When using Hardware RAID-1 or higher LUN(s) with ZFS, on the ZFS side you only get error detection on data. ZFS can't correct data errors because ZFS has no redundancy; the redundancy is in the lower level that ZFS has no knowledge of. (Note that ZFS' metadata IS redundant! Even on a single disk.) I've done this with EMC disk array LUNs in data centers, and it can work fine as long as you have backups. But that was an enterprise-grade disk array, almost certainly free of data path bugs.

When using Hardware RAID-0 LUNs with either Mirroring or RAID-Zx in ZFS, you can get error detection and correction. But on a single disk failure, the entire RAID-0 LUN has to be rebuilt. ZFS has no way to rebuild the hole in the LUN. In theory, a simple scrub on the fixed RAID-0 LUN will find all the bad data and, using redundancy, fix it. But this is not the normal method to replace a disk. You will get literally thousands, if not millions, of read or checksum errors that have to be fixed. This can hit corner cases (aka BUGS) that can cause unexpected behaviour (aka crashes, data loss or trashed pools).
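The difference between detection only and detection plus correction can be shown with a toy model. This is not how ZFS is implemented; it is a sketch in which SHA-256 stands in for the pool's checksum and a list of byte arrays stands in for the copies ZFS can see. All names are made up for the illustration.

```python
import hashlib
from typing import List, Optional


def checksum(block: bytes) -> str:
    """Checksum stored separately from the data (SHA-256 as a stand-in)."""
    return hashlib.sha256(block).hexdigest()


def read_with_repair(copies: List[bytearray], expected: str) -> Optional[bytes]:
    """Return a copy that matches the checksum and overwrite bad copies with it.

    Returns None when no copy matches: the error is detected but there is
    nothing to repair it from.
    """
    good = next((bytes(c) for c in copies if checksum(bytes(c)) == expected), None)
    if good is None:
        return None
    for c in copies:
        if bytes(c) != good:
            c[:] = good  # heal the bad copy in place
    return good


data = b"important block"
expected = checksum(data)

# One copy visible to ZFS (e.g. a single hardware RAID LUN): detect only.
single = [bytearray(b"imp0rtant block")]
single_result = read_with_repair(single, expected)

# Two copies visible to ZFS (e.g. a ZFS mirror): the good copy heals the bad one.
mirror = [bytearray(b"imp0rtant block"), bytearray(data)]
mirror_result = read_with_repair(mirror, expected)
```

With one copy the read fails and the caller has to go to backups; with two copies the read succeeds and the damaged copy is rewritten.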

Last, ANY data cache that ZFS does not know about and/or can't force to flush has the potential to trash your pool. I say potential, because some Hardware RAID controllers perform out-of-sequence cache flushes (aka elevator flushes).
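A toy model of why out-of-sequence flushes matter: a copy-on-write file system issues the data blocks first and the block that points at them last. If a cache flushes by block address instead of issue order and its contents are lost partway through, the pointer can reach the disk before the data it references. The addresses, labels and function names below are hypothetical, and the model ignores battery or super-capacitor protection, which is meant to preserve the cache across a power loss.

```python
from typing import Dict, List, Tuple

Write = Tuple[int, str]  # (block address, label)


def disk_after_crash(writes: List[Write], flushed: int, elevator: bool) -> Dict[int, str]:
    """Return what is on disk if the cache is lost after `flushed` writes land.

    elevator=False: the cache flushes in the order the writes were issued.
    elevator=True:  the cache flushes sorted by block address.
    """
    order = sorted(writes, key=lambda w: w[0]) if elevator else list(writes)
    return dict(order[:flushed])


def pool_consistent(disk: Dict[int, str], pointer_addr: int, data_addrs: List[int]) -> bool:
    """A pointer block on disk must never reference data that is not on disk."""
    if pointer_addr not in disk:
        return True  # old pointer still in effect; new data is just unreferenced
    return all(addr in disk for addr in data_addrs)


# Data blocks issued first at high addresses, then the pointer at a low address.
writes = [(900, "data A"), (901, "data B"), (10, "pointer to A and B")]

in_order = disk_after_crash(writes, flushed=1, elevator=False)
reordered = disk_after_crash(writes, flushed=1, elevator=True)

in_order_ok = pool_consistent(in_order, 10, [900, 901])
reordered_ok = pool_consistent(reordered, 10, [900, 901])
```

In issue order the interrupted flush leaves the old state intact; in address order the new pointer is on disk while its data is not.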

All that said, it's your data. Most of us here on the FreeNAS forums are conservative, simply because we want a problem free NAS device. (Or at least known problems like a failed disk...) Your NAS might be just fine. Good luck.

Edit: Clarified RAID-0, may get read errors. RAID-0, added data loss as unexpected behaviour.
