First draft at 80TB FreeNAS for Enterprise Storage


Kumba

Cadet
Joined
May 24, 2017
Messages
9
I have a suite of video editors who edit locally and then store their raw and cut footage on a NAS. Someone decided to get a regular 12-TB consumer NAS and it died within a month. I won't say who the manufacturer was, but let's just say it's a cousin of the Bison. Long story short, they are asking me to find a NAS solution that will keep up with their workload.

The general use case will be multiple video editors, around 12, transferring anywhere from 12 to 300GB worth of files at a go. The editors do not edit directly from the NAS, so it's primarily used as a data repository. This means 90% of the reads/writes are sequential, with the only random access at the file system coming from multiple concurrent users. Most of these video editors will be using Windows desktops with a couple of Macs sprinkled in. Samba support/performance is the primary focus, since ideally the editors will just drag/drop their files to and from a mapped network drive, unless someone has a better suggestion. They do have a Windows 2016 Domain Controller if that is of any help, but its network connectivity is limited by comparison. A secondary consideration/priority is the ability for some Linux servers to connect via FTP to upload relatively small files, and then to retrieve/browse those files via HTTP(S), primarily for backups.

My current hardware selection is as follows:
- SuperMicro 826BE1C-R920LPB Chassis - x12 3.5" Hot-Swap on a 12Gb/s LSI SAS3 Expander Backplane, x2 2.5" Rear Hot-Swap (for OS), Redundant 920W PSU
- SuperMicro X10SRL-F - Single Socket 2011-3, quad-channel DDR4 2400MHz, Gen3 PCIe with four x8 slots off the CPU, IPMI
- Intel E5-1620 v4 - 4c/8t, 3.50GHz base, 10MB cache
- x2 Kingston KVR24R17D4/32MA 2400MHz DDR4 Registered ECC 32GB, for a total of 64GB memory, with room to expand to 256GB
- x12 Seagate ST10000NM0206 - 3.5" 10TB Enterprise Capacity 'Helium' 7200-RPM SAS 4Kn
- x2 Seagate ST9500620NS - 2.5" 500GB Enterprise Constellation.2 7200-RPM SATA (OS-Drive)
- Chelsio S320E-CXA - 10GBase-CX4 dual-port NIC (the core switch already has CX4 ports; I would have to order SFP+ or 10GBase-T ports, and a CX4 cable is cheaper)
- LSI (Broadcom/Avago/owner of the week) 9300-4i - 12Gb/s SAS3 HBA

So if I am understanding (or not) all the things I've read from google searches and this forum, here is how I would do the FreeNAS config:
- Two 6-drive RAID-Z2 vdevs in a storage zpool, giving me 80TB capacity
- One 2-Drive vdev in an OS/boot zpool
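
For reference, I believe that layout would be roughly equivalent to the following from the command line (I'd actually build it through the FreeNAS GUI, and the pool/device names here are just placeholders):

  zpool create tank \
      raidz2 da0 da1 da2 da3 da4 da5 \
      raidz2 da6 da7 da8 da9 da10 da11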

Does that seem like a correct hardware build and FreeNAS build? Or what recommendations/changes would you make?


I read in a few places that the maximum vdev width is 8 drives plus redundancy. I originally thought of doing a 12-drive RAID-Z2 or Z3 but decided to change it. The mentions I see of this 8-drive limit are pretty dated (2014), so I was just wondering if that is still a consideration or if it has changed.

I read in a few places that the general memory rule of thumb is an 8GB base plus 1GB per 1TB of storage, but then I also read that that's not a hard requirement. Is the 64GB I have in the server going to be an issue? Would you recommend going to 128GB just for the fun of it, or wait and see?

I know Samba performance is largely bound by single-threaded CPU speed, so I went with the E5-1620 due to its high clock. Would I be better off going with an E5-1650 v4 (6c/12t) or sticking with the current quad-core? There's a marginal difference in clock speed between the two, but the 1650 is twice the price. Not sure if there's a potential bottleneck with just 4 cores.

I did not include a ZIL/SLOG because Samba writes asynchronously by default. Is that the best performance option or would performance be improved by having Samba use synchronous writes and using a pair of Intel 750 400GB NVMe drives as a ZIL/SLOG?
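
(For reference, my understanding is that the sync behavior can be checked and overridden per dataset, something along these lines, with the dataset name being just an example:)

  zfs get sync tank/video          # 'standard' by default
  zfs set sync=always tank/video   # force synchronous writes (only sensible with a fast SLOG)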

I did not include an L2ARC because the editors all work on different aspects of a project, which also means different assets. Most of the time, the closest an L2ARC would get to being useful is when one of the video editors drops a 50GB file on the NAS that another one pulls down to edit locally. Since this is more of a one-off thing and not one guy delivering a file to 10 people, I didn't see the need for it. I decided it would be best to add an L2ARC after the fact if it turns out to be helpful to overall system performance.
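
(My understanding is that a cache device can be added to an existing pool at any point later, along these lines, with pool/device names being examples:)

  zpool add tank cache nvd0   # add an NVMe SSD as L2ARC after the fact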

Is Chelsio still the 10GbE NIC of choice? They seem to have pretty stable and broad support in FreeBSD, so I went with them. I know the card I spec'd is a bit old, but the core switch already has a couple of unused CX4 ports in it. The cost to add SFP+ or 10GBase-T would be around $800 for 2 ports; I can buy a 2-meter CX4 cable for $30 and a NIC for $80 and be done. If there's a reason I should go the other way, or if the NIC I selected is known for being unreliable, please let me know.

The other thing is that I believe in an old government saying of "Why buy one when you can buy two for twice the price", which leads me to having two of these. If I wanted the secondary to serve as a back-up to the primary, is my go-to going to be rsync? Or does ZFS have some sort of zpool network mirror ability? The intent of the secondary server is to be a 24-hour backup and not a redundant copy. This is because the editors tend to be idiots and squash each other's files all the time. So losing work within a 24-hour span that is most likely still sitting on their local machine is fine, but losing more than that is not.

And lastly, is FreeNAS 9.10 capable of working with that hardware list? Or is FreeNAS 11.0 more or less a safe bet at this point?

Thanks for reading my long-winded post, and I hope I got most of it right. If you have any recommendations or a different approach, please let me know! I want to try to do it right the first time, or at least do it well enough that I don't have to scrap it and rebuild from scratch.
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
- Two 6-drive RAID-Z2 vdevs in a storage zpool, giving me 80TB capacity
...Closer to 65TB usable,
as per the ZFS RAID size and reliability calculator: https://forums.freenas.org/index.php?threads/zfs-raid-size-and-reliability-calculator.28191/
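Roughly where the shrinkage comes from (approximate numbers, just to illustrate):

  2 vdevs x (6 - 2) data drives x 10TB = 80TB of raw data space
  80TB = 80 x 10^12 bytes ≈ 72.8 TiB
  minus ZFS metadata/padding and free-space headroom ≈ ~65TB usable, per the calculator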
x2 Kingston KVR24R17D4/32MA 2400MHz DDR4 Registered ECC 32GB, for a total of 64GB memory, with room to expand to 256GB
I'd bump the RAM, preferably to 128GB. It is the savior for a lot of things. Plus, it will enable you to actually use an L2ARC efficiently.
I read in a few places that the maximum vdev width is 8 drives plus redundancy. I originally thought of doing a 12-drive RAID-Z2 or Z3 but decided to change it. The mentions I see of this 8-drive limit are pretty dated (2014), so I was just wondering if that is still a consideration or if it has changed.
Meh. The vdev width limitation is flexible. While there are some users running 12-drive-wide RAID-Z2 successfully, success is very situational. The recommendations should also be seen in the context of 'getting newbies safely into the FN haven'.
With your workload you'd best split them up into vdevs just like you've suggested!
Would you recommend going to 128GB just for the fun of it, or wait and see?
It is a legit strategy to wait and see. However, see comment above.

Not sure if there's a potential bottleneck with just 4 cores.
The keyword is potential. The 1650 is the recommended choice for a high-performing file server.
A reminder for other contributors: take into account that the 10Gbit connectivity elevates the CPU requirements. I'm not sure whether we're at a bottleneck situation here or not.

The last thing is that I believe in an old government saying of "Why buy one when you can buy two for twice the price", which leads me to having two of these. If I wanted the secondary to serve as a back-up to the primary, is my go-to going to be rsync? Or does ZFS have some sort of zpool network mirror ability? The intent of the secondary server is to be a 24-hour backup and not a redundant copy. This is because the editors tend to be idiots and squash each other's files all the time. So losing work within a 24-hour span that is most likely still sitting on their local machine is fine, but losing more than that is not.
I like the idea of having two, but I don't think it is necessarily wise to build identical systems if you never intend to have users hit the second machine.
ZFS has "snapshots", which could be saved locally on the primary box. That would enable you to "roll back" a share's state to a previous snapshot. These can be taken at various intervals; check the manual to get an idea. [edited:] The cool thing is that these snapshots are stored locally and can be synced to another machine.

In your situation, that would be replicated to the second server. However, there is no need to have a particularly powerful box for such duties.
I'd suggest a recipe of an X11SSL board, 32-64GB RAM, an LSI 9211-8i, a Kaby Lake G4620 (or bump to an E3-1230 v6), one 10Gbit NIC, and 8TB WD Reds ...then maximize space efficiency by running an approximately 10-drive-wide RAID-Z2. No L2ARC.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
One 2-Drive vdev in an OS/boot zpool
@Dice has commented on most of the points you raise, so I'll just touch on this. While you certainly can mirror boot SSDs, SSDs are reliable enough (and boot device replacement a low-enough-stress event if you're prepared) that it seems like wasted effort. You could use the other bay for an L2ARC device if you wish.
If I wanted the secondary to serve as a back-up to the primary, is my go-to going to be rsync? Or does ZFS have some sort of zpool network mirror ability?
ZFS replication is what you're going to want here.
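Under the hood it boils down to zfs send/receive, which the GUI replication tasks drive for you. As a sketch (pool, dataset, and host names are just examples):

  zfs send tank/projects@snap1 | ssh backupbox zfs receive backup/projects
  # later runs only send the changes since the last snapshot:
  zfs send -i @snap1 tank/projects@snap2 | ssh backupbox zfs receive backup/projects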
- Chelsio S320E-CXA - 10GBase-CX4 dual-port NIC
IIRC, that model is old enough that it can't actually reach 10 Gb throughput; it's limited to 6.something by its PCIe interface. Obviously better than a 1 Gb NIC, but you'd really want to look at a T4 or T5 model for better performance. I don't know if those are available with CX4 interfaces, though.
Is that the best performance option or would performance be improved by having Samba use synchronous writes and using a pair of Intel 750 400GB NVMe drives as a ZIL/SLOG?
Async will always be faster than sync.

A couple of other thoughts:
  • Consider potential expansion needs. If you buy a 12-bay chassis and fill it immediately, you don't have room to grow. You might want to consider a SC846 (or even 847) to give headroom for expansion; these would give you 24 or 36 bays, respectively, in 4U.
  • Also consider quoting a system from iXSystems. As the developers of FreeNAS, they know pretty well what hardware would be required for which applications, and their support is said to be very good. I believe you have the option of either a TrueNAS server, for which they'd provide both hardware and software support, or a "Certified FreeNAS" server, for which they'd provide only hardware support, but it would still be using known-compatible components.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
The cool thing is that these snapshots can either be stored locally or shipped off to another machine.
Well, to be clear on this, snapshots will always be on the local pool. They can also be replicated to another pool (local or remote), but they're going to be on the local pool as well.
 

enemy85

Guru
Joined
Jun 10, 2011
Messages
757
- x2 Seagate ST9500620NS - 2.5" 500GB Enterprise Constellation.2 7200-RPM SATA (OS-Drive)

As @danb35 already suggested, that seems like overkill, and you won't gain any benefit over two USB sticks. I'd save that money.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Ah, I'd missed that the proposed boot drives were spinning rust. No, bad call, and way oversized. Use a single small SSD (30 GB would be more than enough).
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
Consider potential expansion needs. If you buy a 12-bay chassis and fill it immediately, you don't have room to grow. You might want to consider a SC846 (or even 847) to give headroom for expansion; these would give you 24 or 36 bays, respectively, in 4U.
I definitely agree on this path.
FreeNAS's expansion pattern relies on getting a whole vdev's worth of drives at once - in the current setup, that would be 6.
Getting a 4U case out of the gate makes the build last way longer.
Your RAM and CPU horsepower "out-range" the number of drives in your config, by a large margin.
Even so, future expansion could include getting another HBA and a JBOD box, and continuing on the same motherboard and hardware...
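To illustrate, growing the pool by a whole vdev later is a single operation along these lines (device names are examples; the FreeNAS volume manager does the same thing from the GUI):

  zpool add tank raidz2 da12 da13 da14 da15 da16 da17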

Now, the key metric to keep in mind (since you already slightly overestimated the amount of usable space) is that ZFS loves free space. The way to deal with performance declines from fragmentation is to get more free space. Ideally, you'd start considering adding another vdev at around 70-80% utilized space.
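
You can keep an eye on that from the GUI, or from the shell with something like:

  zpool list -o name,size,allocated,free,capacity,fragmentation tank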

On the other hand, there is not really a need to get the best enterprise drives. ZFS is designed to accommodate "crappy" consumer drives. In practice, the forum stands by the WD Reds (non-Pro). But faster and more reliable drives are better, yes? ...Well, yes and meh. The "meh" part refers to the reality that adding more free space and additional vdevs will give far better performance. Drive reliability is not a worry once the drives are properly burned in (look for that on the forum; qwertymodo has a great post).

Some suggestions already made, but sort of bundled:
Save cash by getting WD Reds, HGST NAS drives, or Seagate IronWolf drives rather than enterprise-grade drives.
Put the cash into RAM and an expandable box. eBay is filled with great SM stuff.

Finally, one more thing to consider: getting a recycled box that's ready out of the gate.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Finally, one more thing to consider: getting a recycled box that's ready out of the gate.
Indeed. Here's one I ran across last night, just over $1k shipped. Not as current as what you're looking at, to be sure, but fairly recent hardware nonetheless. It's SAS2 rather than SAS3, but spinning rust really makes that a non-issue. You'd want to add RAM though. That chassis has room for brackets to add up to four 2.5" drives (like SSDs) internally, though they wouldn't be hot-swappable. This one bumps the RAM to 128 GB, the CPUs to E5-2660s rather than E5-2620s, and adds a SATA DOM as a boot device. It's very similar to what I'm running right now.

If you have to have SAS3, you're going to pay more for it, but it's available. Here's a barebone box in a similar chassis to the above, but this one adds two 2.5" hot-swap bays, with integrated SAS3 HBA on the motherboard and in the backplanes. Add CPU(s), RAM, and disks and you're set.
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
This one bumps the RAM to 128 GB, the CPUs to E5-2660s rather than E5-2620s, and adds a SATA DOM as a boot device. It's very similar to what I'm running right now.
That machine is cool.
My only concern is whether multiple users hitting it over SMB, on 10Gbit (possibly two 10Gbit cards), would cause the CPU to bottleneck.
I'm not familiar with 10Gbit setups, but I'm really curious to see some data on CPU loads for various scenarios.
 

Kumba

Cadet
Joined
May 24, 2017
Messages
9
I do like the idea of moving to a 24-drive chassis. With SM the drives are enumerated by column and then row, so that keeps it simple: vdev1 would be the entire first column of 6 drives, vdev2 the second column, leaving the third and fourth columns empty for expansion. I'd then print some labels so I could number them as vdev:drive, so 1:3 would be vdev1, drive 3, etc. That makes explaining which drive to swap out to the trained monkey on the other end of the phone a lot easier. I also have a couple of 24-drive Xeon chassis lying around from when I moved to a higher-density 2U design in my colo. Good rack space and power ain't cheap! Fortunately this client has a mostly empty rack onsite and plenty of capacity hanging off their 40kVA Liebert, so they won't care. It also gives me better options since I can now fit full-height, half-length cards without issue.

The two mirrored OS drives are a creature of habit. I'll replace them with a single 128GB SSD. I've tried using USB devices as boot devices before but had a few too many bad experiences. Since the SSD is only $55 I'm just going to go with it; it's not worth the potential hassle of the USB bus flaking out and the thumb drive disappearing until a power cycle. I still might put two SSDs in and mirror them just for peace of mind, unless that makes installation or recovery of the system a nightmare. I have tried to get SuperMicro's SuperDOMs before, but they are hard to find and pricey when I do find them. Does anyone have any good recommendations for a DOM that works flawlessly on SuperMicro boards and also draws power from the DOM SATA ports like the SuperMicro ones?

12Gb/s SAS3 isn't a requirement for the storage array; 6Gb/s SAS2 would have more than enough bandwidth for all 12 drives to operate at 100% I/O without a bottleneck. I just went with it since that is what the chassis had in it, and there's a minimal price difference between a 9300 and a 9207 HBA. If anything, I was looking at it as a possible expansion path to SSDs if needed. The 24-drive chassis I have use 6Gb/s SAS2 expander backplanes, so I will be switching to a 9207. In my research I saw references to using firmware P16 on these controllers for best performance and reliability. Is that still the recommended firmware, and controller, for 6Gb/s SAS2?
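
(For my own notes, I assume I can confirm whatever firmware the card actually ships with from the FreeNAS shell with something like:)

  sas2flash -list   # reports the 9207's firmware and BIOS versions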

I'll just bump the RAM to 128GB. It's not worth trying to save $600 when RAM is the best price-to-performance value of anything in the entire system.

Since I am going to be using a 24-drive chassis, a 10TB drive is not needed anymore, so I am bumping the drives down to 8TB with the idea of expanding for capacity later. The Seagate Enterprise Capacity drives are only $30 more than the IronWolf Pro, and the WD Red Pro drives are essentially the same price. So for the $360 price difference I'm going to go with the Enterprise drives. I like the SAS connection for error reporting, and they do perform a little better than NAS drives.

Chelsio does have a T420-CX, but finding one of those is like pulling teeth. Does anyone have any other recommendations for a 10Gbit CX4 NIC? Or should I just bite the bullet and buy the $800 interface card for the switch? Or, even if the S320E card is PCIe-limited to ~6Gbit, am I even going to hit that limit given my array setup?


So I've modified my hardware and config to be as follows:
- SuperMicro 846E16-R1200B Chassis - 4U, x24 3.5" Hot-Swap on a 6Gb/s LSI SAS2 Expander Backplane, x2 2.5" Rear Hot-Swap, Redundant 1200W PSU
- SuperMicro X8DTN+ - Dual Socket 1366, x18 DDR3 1600MHz DIMM slots, Gen2 PCIe with two x8 slots, Gen1 PCIe with two x8 slots, IPMI expansion card
- Dual Intel X5570 - 4c/8t, 2.93GHz clock, 8MB cache
- x9 Kingston KVR16LR11D4/16HB 1600MHz DDR3 Registered ECC 16GB, for a total of 144GB memory
- x12 Seagate ST8000NM0065 - 3.5" 8TB Enterprise Capacity 7200-RPM SAS 4Kn (almost the same price as 'Pro' NAS drives)
- Intel 600p 128GB M.2 SSD
- Chelsio S320E-CXA - 10GBase-CX4 dual-port NIC (the core switch already has CX4 ports; I would have to order SFP+ or 10GBase-T ports, and a CX4 cable is cheaper)
- LSI (Broadcom/Avago/owner of the week) 9207-8i - 6Gb/s SAS2 HBA


Current drive layout:
- Two 6-drive RAID-Z2 vdevs in a storage zpool, giving me 64TB drive capacity before ZFS and formatting
- One SSD for boot
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
I'd then print some labels so I could number them as vdev:drive so 1:3 would be vdev1 drive 3,
The best help to find a particular drive is a numeric table with serial number first and foremost. Second, gptid.
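A quick way to pull that table together from the shell (device names are examples):

  glabel status                           # maps gptid labels to daX devices
  smartctl -i /dev/da3 | grep -i serial   # serial number for a given device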
I still might put two SSD's in it and mirror them just for peace of mind unless that makes the installation or recovery of a system a nightmare.
It won't. The difference during installation is one additional checkbox. Also, a second SSD can be attached as a mirror to the OS drive after installation.
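(Conceptually it's just a zpool attach against the boot pool, something like the line below, though the GUI handles partitioning the new drive for you, so use that; the device/partition names are only illustrative:)

  zpool attach freenas-boot ada0p2 ada1p2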
 

Kumba

Cadet
Joined
May 24, 2017
Messages
9
The best help to find a particular drive is a numeric table with serial number first and foremost. Second, gptid.

Does the web interface show you any SAS info like Enclosure ID and Position?
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Does the web interface show you any SAS info like Enclosure ID and Position?
Unfortunately it doesn't. The problem is that such things aren't at all standardized, so something that would work for a Supermicro chassis might not work for a Dell.

The tools are there, and it seems it'd be possible to (for example) detect when a disk is faulted, determine which position it's in, and light up the red LED for that spot. Maybe someone smarter than me can script that, as it should be relatively compatible across Supermicro gear, and there seems to be quite a bit of that used by FreeNAS folks.

Edit: It looks like someone has. I'll have to take a look at this...
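
The building blocks would presumably be something like this (controller number and enclosure:slot values are examples):

  zpool status -x            # quickly shows any pool that isn't healthy
  sas2ircu 0 display         # lists enclosure, slot, and serial for each attached drive
  sas2ircu 0 locate 2:5 ON   # light the locate LED on that bay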
 

Kumba

Cadet
Joined
May 24, 2017
Messages
9
I do have a rather large bias towards using SM gear. Looks like this NAS is a go so I'll give it a shot next week.

Reasons like this are why I keep telling myself to keep a bad drive so I can make sure it gets detected right.
 


Kumba

Cadet
Joined
May 24, 2017
Messages
9
It's not an overly huge ordeal. I can SSH to the FreeNAS remotely and just run sas2ircu to see which drive it is. The output will tell me serial, model, and position.

Just always nicer when they call and you can say "It's the one with the big red light on it!"

Out of curiosity, is there a way to pull the status of the vdev from the CLI? Like with a Linux MD RAID, you can do a 'cat /proc/mdstat' and it will show you the drives and which ones have failed. The reason I ask is that with MD RAID there can be soft drive failures where the drive drops out of the array but SMART still shows it as OK. On my Linux boxes I just monitor the RAID array directly.

I'm guessing that ZFS doesn't quite work in the same manner, but getting the vdev status straight from ZFS is probably more reliable than looking at sas2ircu output.
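
(I'm assuming the ZFS-native equivalent is just something like the following, which should show each vdev and whether its member disks are ONLINE, DEGRADED, or FAULTED:)

  zpool status -v tank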
 
