BUILD Advice on VFX Studio File Server


wuxia

Dabbler
Joined
Jan 7, 2016
Messages
49
Hi guys,
It's time to finally retire our ancient RHEL file server as we're growing and in need of more space. I'm new to FreeNAS so your help is greatly appreciated. Here are the specs:

Chassis: SC846BE16
MB: X9DRE-TF+ 24x DIMM slots + dual port 10GBase-T (changed from: X9DR7-TF+)
CPU: 2x Intel Xeon E5-2637 v2 (changed from: 2x Intel Xeon E5-2630 v2)
RAM: 256GB (16 x 16GB) ECC DDR3 (haven't decided on the model yet)
HBA: ServeRAID M1015 cross-flashed
Storage: 24x 6TB SAS (haven't decided on the model yet)
in a 3 x 8-drive Raid-Z2 configuration
Boot Drive: 16GB SATA DOM (changed from 16GB flash drive)
SLOG device: n/a
L2Arc device: (not sure if needed)

Read speed is much more important than write speed as writes are rare (render times are slow). The clients will be mostly Linux NFS with a handful of SMB clients (Windows and OSX machines). NFS mounts will be cached on local SSDs using cachefilesd to minimise load.
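
For the Linux clients I have something roughly like this in mind (the cache limits are just the cachefilesd defaults; the server name, export path and mount options are placeholders, not a final config):

[CODE]
# /etc/cachefilesd.conf on each Linux client (default limits, shown for completeness)
dir /var/cache/fscache
tag renderfarm
brun  10%
bcull  7%
bstop  3%

# mount the NFS export with FS-Cache enabled via the "fsc" option
mount -t nfs -o vers=3,rsize=65536,wsize=65536,fsc nas:/mnt/tank/projects /mnt/projects
[/CODE]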

I'm also considering disabling the ZIL, if it won't cause filesystem corruption, because the NFS clients will be mounted async anyway and data lost due to power loss is not that critical; those lost frames could be re-rendered at any time.

The two 10Gb ports will be connected using LACP to a 10Gb switch. Most of the clients are 1Gb with only a few 10Gb.
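
For reference, on plain FreeBSD the LACP lagg would look roughly like the commands below; on FreeNAS I'd set up the same thing through the GUI. The ix0/ix1 interface names and the address are just assumptions:

[CODE]
ifconfig ix0 up
ifconfig ix1 up
ifconfig lagg0 create
ifconfig lagg0 laggproto lacp laggport ix0 laggport ix1 192.168.10.2/24 up
[/CODE]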

1. Is this system balanced and good enough to serve 20-30 clients?
2. Do I benefit from L2Arc in my use case?
3. Is disabling the ZIL safe for the file system, and do I understand correctly that I don't need it?

As much as I'd love a turnkey system, the transport cost to my country would be high and there are budget limits. Please let me know what you think.

Thanks
 

rogerh

Guru
Joined
Apr 18, 2014
Messages
1,111
I'm not qualified to advise you generally, but a couple of questions. How much storage space do you need to use? Have you decided how to arrange the storage, mirrors, RAIDZ1 or 2, and how many vdevs, or do you want advice about this? What are backup arrangements and how serious would it be to lose data?
 

tvsjr

Guru
Joined
Aug 29, 2015
Messages
959
I am but a lowly n00b, but here are a handful of thoughts:

Memory type - buy memory that's on the Supermicro approved list, available on their website.
Drives - most of the SAS drives I've seen are 7200RPM, which means more power and more heat. I run them, because the drives were appropriated. If I were buying new, for a storage/archive/non-VM workload, I'd go with WD Reds at 5900RPM. Less heat, less power.
10GbE NICs - check the primer in the Networking subforum to determine how well that chipset is supported in FreeNAS.
L2ARC - I'd say start with the RAM, look at hit rates, then make a decision.
ZIL - every ZFS pool has a ZIL... separating it out onto an SLOG (separate physical device) is what you're asking about. Not typically required for a non-VM workload.
Boot drive - why would you spend all of this money on a very nice FreeNAS system, then trust it to a $10 flash key that is prone to failure and verrrry slooooooooow (this is a real problem when updating)? Buy an MCP-220-84603-0N from Supermicro, which lets you mount 2 fixed 2.5" drives internally, add two cheap SSDs (repurposed 40GB SSDs from other applications will do nicely), and mirror them for the boot pool.
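
For the L2ARC point above, checking hit rates is easy once the box has been running under real load for a while. Something like this from a shell (both scripts ship with recent FreeNAS versions, if I remember right):

[CODE]
arcstat.py 10        # ARC size, hits, misses and hit% sampled every 10 seconds
arc_summary.py       # one-shot summary of ARC/L2ARC size and efficiency
[/CODE]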
 

wuxia

Dabbler
Joined
Jan 7, 2016
Messages
49
Thanks for your replies.

How much storage space do you need to use? Have you decided how to arrange the storage, mirrors, RAIDZ1 or 2, and how many vdevs, or do you want advice about this?
We need around 100TB usable space so I was thinking 3 x 8 HDD or 4 x 6 HDD Raid-Z2 vdevs. I read that wider vdevs will hurt IOPS and give better sequential speed but IOPS are not that high in this use case I think.
What are backup arrangements and how serious would it be to lose data?
I was thinking of building a less powerful FreeNAS box with our older drives (mostly 4TB and 2TB, grouped by size) and replicating the server snapshots there. In addition to that I was planning to use the LTO6 tape drive we already have. Losing the latest data (like from a power loss) is not that important. We have a powerful UPS, but even if it still happens it's not that bad, as long as the file system is not corrupted. Losing all data is of course extremely bad. A 24-bay chassis will not be enough for the backup, so I was thinking of something inexpensive like the Backblaze Storage Pod, or reusing the old server and some of the fibre channel storage we already have. I don't know. I still haven't decided on the hardware yet, but we'll have at least a daily backup for sure, plus tape backup.
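
For the replication part I'm thinking of the usual snapshot + send/receive approach, either through the FreeNAS snapshot/replication tasks or by hand, roughly like this (pool, dataset and host names are only placeholders):

[CODE]
# on the main server: recursive snapshot of the pool
zfs snapshot -r tank@auto-20160107

# first run: full replication stream to the backup box
zfs send -R tank@auto-20160107 | ssh backupnas zfs receive -duF backup

# subsequent runs: incremental from the previous snapshot
zfs send -R -i tank@auto-20160106 tank@auto-20160107 | ssh backupnas zfs receive -duF backup
[/CODE]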

Memory type - buy memory that's on the Supermicro approved list, available on their website.
Drives - most of the SAS drives I've seen are 7200RPM, which means more power and more heat. I run them, because the drives were appropriated. If I were buying new, for a storage/archive/non-VM workload, I'd go with WD Reds at 5900RPM. Less heat, less power.
10GbE NICs - check the primer in the Networking subforum to determine how well that chipset is supported in FreeNAS.
L2ARC - I'd say start with the RAM, look at hit rates, then make a decision.
Thanks I'll look into those.

ZIL - every ZFS pool has a ZIL... separating it out onto an SLOG (separate physical device) is what you're asking about. Not typically required for a non-VM workload.
I was under the impression that you can disable the ZIL? Anyway, even with the ZIL enabled, not having a separate SLOG device is still OK for my case, right? If not - I have an 80GB Fusion-io - would it be good enough?

Boot drive - why would you spend all of this money on a very nice FreeNAS system, then trust it to a $10 flash key that is prone to failure and verrrry slooooooooow (this is a real problem when updating)? Buy an MCP-220-84603-0N from Supermicro, which lets you mount 2 fixed 2.5" drives internally, add two cheap SSDs (repurposed 40GB SSDs from other applications will do nicely), and mirror them for the boot pool.
Thanks. You have a point here. I see that the MB has a DOM power connector so maybe a 16GB SATA DOM will do. I'm keeping the 2.5'' bays free as they could be needed for L2Arc later.

BTW I just realised that the X9DRE-TF+ is a better MB for me as I don't really need the built-in SAS ports, so I'm replacing the X9DR7-TF+.
 

rogerh

Guru
Joined
Apr 18, 2014
Messages
1,111
When calculating storage, don't aim to fill the pool more than about 80%, so 3 x 8 x 6TB RAIDZ2 is only going to give you about 80TiB of usable storage (my rough calculation, there are more accurate calculators around).
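
Back-of-the-envelope it works out roughly like this (a quick awk one-liner; it ignores metadata and padding overhead, so treat it only as a sanity check against the proper calculators):

[CODE]
awk 'BEGIN {
  vdevs = 3; drives = 8; parity = 2
  tib_per_drive = 6e12 / 2^40                        # a "6TB" drive is ~5.46 TiB
  data = vdevs * (drives - parity) * tib_per_drive   # space left after RAIDZ2 parity
  printf "data space: %.0f TiB, usable at 80%%: %.0f TiB\n", data, data * 0.8
}'
[/CODE]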
 

tvsjr

Guru
Joined
Aug 29, 2015
Messages
959
Regarding ZIL - you're overthinking it. An SLOG typically wouldn't be recommended in your case. So, set up your drive through the GUI and be happy.
Boot drives - yes, the SATADOM will work as well. I believe you can put two of those 2-drive brackets in that case - you can in the bigger brother, the 847.
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
(my rough calculation, there are more accurate calculators around).

Here is the popular calculator; it already takes into account the 80% 'rule of thumb' for not overfilling the pool.
https://jsfiddle.net/Biduleohm/paq5u7z5/1/embedded/result/

This is also a good read on the setup/striping of RAIDZ, straight from the heart of the ZFS developers.
http://blog.delphix.com/matt/2014/06/06/zfs-stripe-width/

I'm no expert either, but from the reading I've done, here is some input:
The rationale behind the numbers below starts from your total of 24 drives.
For best space efficiency, fewer 'striped' vdevs are favourable, i.e. at best 2x 12-drive raidz2 (~94TB).
For a higher IOPS capacity, you would rather benefit from 4x 6-drive raidz2 (~75TB 'usable space').
The middle road would be the 3x 8-drive raidz2 (~84TB).

Cheers /
 

tvsjr

Guru
Joined
Aug 29, 2015
Messages
959
IMO, 2x12 is getting pretty wide... I would stick with 3x8. You could also upgrade to the 847 36-drive chassis (some people have seen significantly higher drive temps on the lower rear bays... that hasn't been my experience, but I don't have much back there in the way of spinning rust - yet... so, YMMV) which would let you run 4x8 vdevs. Each vdev would give you 25.77TiB usable space (per @BiduleOhm's calculator... this accounts for metadata, the 80% rule, etc.), or 103.08TiB usable as a pool.

If you're concerned about the cooling, you can stick with the 846 chassis and add a JBOD box.
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
IMO, 2x12 is getting pretty wide... I would stick with 3x8.
Yeah, I was just revising this comment while cooking dinner, coming back now to make an edit (ie, this reply).
Agree - 2x12 raidz2 is probably not a good idea. As far as I've understood a 2x12 raidz3 would be even slower.
3x8 seems like a sweet spot :)
 

rogerh

Guru
Joined
Apr 18, 2014
Messages
1,111
Regarding ZIL - you're overthinking it. An SLOG typically wouldn't be recommended in your case. So, set up your drive through the GUI and be happy.
Boot drives - yes, the SATADOM will work as well. I believe you can put two of those 2-drive brackets in that case - you can in the bigger brother, the 847.
I don't think he's overthinking anything; I think it's just that none of us currently contributing understands the significance of disabling sync writes or doing without a ZIL altogether. I have no idea if it is a good idea for this box, but I do know it can be done!
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
I found a few reads on ZILs.
This one, whose most important point I think is to 'evaluate first to understand behavior, then take action':
http://www.richardelling.com/Home/scripts-and-programs-1/zilstat
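
If I've understood it right, you download the script from that page and run it on the live pool while the clients are doing their normal work. The usage below is how I would expect it to be invoked; I haven't run it on FreeNAS myself, so take it as a sketch and check the script's own header for the exact options:

[CODE]
# print ZIL activity (bytes of sync writes per interval) every 10 seconds
./zilstat.ksh 10
[/CODE]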

And in the ZFS primer, there were some notes that might help the OP identify common traits of their use case:
https://forums.freenas.org/index.php?threads/zfs-primer.38927/#post-238057

And in the cyberjock guide, search for ZIL:
https://forums.freenas.org/index.ph...ning-vdev-zpool-zil-and-l2arc-for-noobs.7775/

Cheers /
 

wuxia

Dabbler
Joined
Jan 7, 2016
Messages
49
Thanks, I didn't know about that 80% pool rule. Maybe I could switch to bigger drives. I'll take a look at the ZIL articles and will probably test with the 3 x 8 Raid-Z2 configuration first to see if it's good enough.

I've seen that the Intel X540 NIC on this motherboard is not without problems, but I'll probably have to read a little bit more before I decide on this.

Generally, do you think the hardware I'm considering will be good enough, or do I need something with more power? I know it's hard to tell, but since I don't have experience with ZFS I'm not sure if there isn't a hidden bottleneck somewhere, and I need to order the parts this week. For example, is it possible to get the maximum drive read and write speed in one local dd test (18 data drives * roughly 120 MB/s ~ 2GB/s), or is that too much to ask from that configuration?
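
Something like this is what I had in mind for the local test (dataset name is just an example; compression off so that dd from /dev/zero measures the disks rather than lz4, and the file well above RAM size so the read-back isn't served from ARC):

[CODE]
# test dataset with compression disabled
zfs create -o compression=off tank/ddtest

# write ~400 GiB sequentially, then read it back
dd if=/dev/zero of=/mnt/tank/ddtest/bigfile bs=1M count=409600
dd if=/mnt/tank/ddtest/bigfile of=/dev/null bs=1M
[/CODE]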
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
Thanks, I didn't know about that 80% pool rule.
Here's an answer I got to this question just recently:
For the 80% rule: ZFS switches from speed optimization to space optimization at 90%, and you really don't want to fill the pool to 100% (big trouble, as it's a CoW FS and you need some free space even to delete files...). FreeNAS warns you at 80% so you have time to do something before hitting 90%, but if you know what you're doing you can fill it up to 90% without problem ;)

For example, is it possible to get the maximum drive read and write speed in one local dd test (18 data drives * roughly 120 MB/s ~ 2GB/s), or is that too much to ask from that configuration?

Be careful about how you make those benchmarks later on; there has been some confusion about what FreeNAS does and does not do for proper benchmarks. There are stickies to read about this.
I'm not qualified to make guesses on speeds from your configuration.

Cheers /
 

rogerh

Guru
Joined
Apr 18, 2014
Messages
1,111
Be careful about how you make those benchmarks later on; there has been some confusion about what FreeNAS does and does not do for proper benchmarks. There are stickies to read about this.
I'm not qualified to make guesses on speeds from your configuration.

Cheers / Dice
We really need someone like @jgreco or @cyberjock to advise the OP. My take is that CPU and memory are fine, but we need more advice on networking and storage. I'm tempted to think something like 45 6TB drives mirrored would be more suitable, but I simply don't have the knowledge to make recommendations.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
We need around 100TB usable space so I was thinking 3 x 8 HDD or 4 x 6 HDD Raid-Z2 vdevs. I read that wider vdevs will hurt IOPS and give better sequential speed but IOPS are not that high in this use case I think.

You're working with large files?

IOPS sorta implicitly means "random I/O" rather than large sequential amounts of I/O. RAIDZn is pretty damn good at the large sequential amounts of I/O but not so hot at random.

I was under the impression that you can disable the ZIL?

No, but you can disable synchronous writes if it's a problem.
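
It's a per-dataset property and it's reversible, so if you ever want to try it, it's along these lines (dataset name is just an example):

[CODE]
zfs set sync=disabled tank/projects   # sync writes are then treated as async, in-memory only until the next txg commit
zfs get sync tank/projects            # verify
zfs set sync=standard tank/projects   # back to the default behaviour
[/CODE]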

Anyway, even with the ZIL enabled, not having a separate SLOG device is still OK for my case, right? If not - I have an 80GB Fusion-io - would it be good enough?

Probably. ZFS still needs the ZIL for things such as metadata updates. However, unless you've got mad amounts of file opening/closing directory opening/closing etc. types of activity, ... eh. You're probably better off starting without.

BTW I just realised that the X9DRE-TF+ is a better MB for me as I don't really need the built-in SAS ports, so I'm replacing the X9DR7-TF+.

Yes, the X9DR7-TF+ is an awesome board, but not really for FreeNAS. Great hypervisor though. :smile:
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Here is the popular calculator; it already takes into account the 80% 'rule of thumb' for not overfilling the pool.

To maintain speed, you may actually want or need to stay well below 80%. What'll happen is that as your pool passes ~50% capacity, ZFS struggles a little more to allocate blocks. The problem gets worse rapidly as you pass 85, 90%. If you're not writing lots of stuff to the pool constantly, and only writing large stuff when you do, the problem doesn't get bad as quickly.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Thanks, I didn't know about that 80% pool rule. Maybe I could switch to bigger drives. I'll take a look at the ZIL articles and will probably test with the 3 x 8 Raid-Z2 configuration first to see if it's good enough.

The bigger drives will tend to get you better write speeds if they help to keep the pool utilization lower (plus, it's always nice in a pinch to have elbow room).

I've seen that the Intel X540 NIC on this motherboard is not without problems, but I'll probably have to read a little bit more before I decide on this.

Try it. If it sucks, go get one of the Chelsios and call it a day. I do not promise the Intels will suck, but OTOH I also don't promise they won't :smile:

Generally, do you think the hardware I'm considering will be good enough, or do I need something with more power? I know it's hard to tell, but since I don't have experience with ZFS I'm not sure if there isn't a hidden bottleneck somewhere, and I need to order the parts this week. For example, is it possible to get the maximum drive read and write speed in one local dd test (18 data drives * roughly 120 MB/s ~ 2GB/s), or is that too much to ask from that configuration?

That's always too much to ask. It's software RAID and you've got a soft CPU. The 2630 v2 is a 2.6GHz part. It isn't the fastest.

To give you a more realistic-ish idea, I have a virtualized FreeNAS box here on an X9DR7-TF+ and four cores of an E5-2697 v2 hypervisor box (2.6 GHz). It has an 11-drive RAIDZ3 of 4TB drives that are capable of about 180MB/sec. The entire vdev only gives about 350MB/sec read on noncached files, though there's some background activity so maybe it'd be a little higher. Between some tuning tweaks, a faster CPU, RAIDZ2 instead of Z3, getting rid of ESXi, etc., I'm pretty sure that could be gotten up a little past 500MB/sec, because this pool isn't tuned for speed. Not sure exactly how much. I'd expect your 3 x 8 RAIDZ2 could exceed 1GByte/sec if done right. Asking for more means making lots of careful choices and tuning, though.
 

wuxia

Dabbler
Joined
Jan 7, 2016
Messages
49
You're working with large files?
Sometimes yes, but the typical and most demanding scenario is several users reading sequences of 1-10 MB files at once. Writing is typically not very demanding, as render times insert pauses between the writes, with one exception: when you're actually loading several TBs of client material onto the server. Roughly several hundred GBytes are accessed during a typical day.

Probably half of the files we work with are compressible and half are already heavily compressed, so it probably makes sense to leave compression on, right?
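
In case it helps anyone else reading later, what I mean is simply leaving lz4 on (which I understand FreeNAS now enables by default) and checking what it actually buys us after a while; dataset name is just an example:

[CODE]
zfs set compression=lz4 tank/projects               # lz4 backs off quickly on incompressible data
zfs get compression,compressratio tank/projects     # see what it actually achieves over time
[/CODE]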

Try it. If it sucks, go get one of the Chelsios and call it a day. I do not promise the Intels will suck, but OTOH I also don't promise they won't :)
Let's hope that the X540 TSO bug is fixed in 9.3.

That's always too much to ask. It's software RAID and you've got a soft CPU. The 2630 v2 is a 2.6GHz part. It isn't the fastest.

To give you a more realistic-ish idea, I have a virtualized FreeNAS box here on an X9DR7-TF+ and four cores of an E5-2697 v2 hypervisor box (2.6 GHz). It has an 11-drive RAIDZ3 of 4TB drives that are capable of about 180MB/sec. The entire vdev only gives about 350MB/sec read on noncached files, though there's some background activity so maybe it'd be a little higher. Between some tuning tweaks, a faster CPU, RAIDZ2 instead of Z3, getting rid of ESXi, etc., I'm pretty sure that could be gotten up a little past 500MB/sec, because this pool isn't tuned for speed. Not sure exactly how much. I'd expect your 3 x 8 RAIDZ2 could exceed 1GByte/sec if done right. Asking for more means making lots of careful choices and tuning, though.
You're talking read or write speed? While 1GByte/sec (read speed) will probably be enough, it's a little disappointing that the full speed of the drives can't be used. I'm thinking of using the cachefilesd daemon on the Linux boxes, as it will hopefully take a lot of the load off repeated reads. If I understand correctly, raw CPU speed is more important than core count, right? If that's the case I could look into a faster CPU. Anyway, I have a speedy workstation which I currently use as a FreeNAS testing system with 8x1TB drives connected, and I could test different scenarios. I will for sure check the CPU load during Raid-Z2 reads and writes to get a sense of how much power is needed.
 

wuxia

Dabbler
Joined
Jan 7, 2016
Messages
49
I did some tests with my test system, 8 x 1TB SATA Raid-Z2. The single-disk speed measured was ~80MB/s and the whole pool speed was close to the maximum speed of six disks, ~530MB/s, while CPU load was close to 70% (on a 12-core 2.9GHz CPU).

So it's kind of hard to tell what speeds are to be expected from 3 x Raid-Z2 vdevs. It really depends on how well ZFS utilizes multiple threads, or whether dd being single-threaded will max out a single core. If it's a thread per raidz vdev it could reach very high speeds. If it's a thread per file it'll be a lot less. I need to read more to understand how it scales.
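
One thing I plan to do during the next run is to watch how the load spreads across the vdevs and disks while dd is running, roughly like this (pool name is a placeholder):

[CODE]
zpool iostat -v tank 5   # per-vdev and per-disk bandwidth, sampled every 5 seconds
[/CODE]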

So I guess the E5-2637 v2 will probably be better suited for FreeNAS (fewer cores, more MHz).
 