Disk Layout Advice


L3192

Dabbler
Joined
Jan 25, 2016
Messages
22
Hello,

I am new to FreeNAS.

I am looking for some advice on setting up a test system for evaluating FreeNAS 9.3.

System Information:
SuperMicro 6027R-WRF
* 256GB RAM
* 2x 512GB SATA disks for the OS
* 20x 4TB drives

System activity:
* Very heavy read activity of small files via NFS
* Very heavy writes on NFS

I will be using RAIDZ2 for the extra protection and will be setting up a pool of two 10-disk vdevs.
My questions are the following:

1. Since I have so much memory, should I use it instead of an L2ARC to gain some increased performance?
2. The system is on a UPS; therefore I am thinking that I can avoid using a separate ZIL (SLOG) device and turn off sync on each pool for NFS activity.

Does the above sound right, or should I be doing something else? I'm looking to improve read/write performance via NFS but cannot afford to use RAID0+1 mirroring for this application.


Thanks!
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
1. Since I have so much memory, should I use it instead of an L2ARC to gain some increased performance?
Nope, the RAM will be used as the ARC (not L2).
2. The system is on a UPS; therefore I am thinking that I can avoid using a separate ZIL (SLOG) device and turn off sync on each pool for NFS activity.
A UPS won't protect against a random server crash or reboot. In other words, a UPS is not a reason to disable sync. A low-latency, high-write-endurance SLOG with power protection is the answer.
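To make the trade-off concrete, sync behavior is a per-dataset property. A minimal sketch (dataset name is a placeholder):

Code:
# Default is sync=standard: NFS sync writes are committed to the ZIL
# (on the pool, or on a SLOG if present) before being acknowledged.
zfs get sync tank/nfsshare

# sync=disabled acknowledges writes before they reach stable storage;
# a crash or panic can silently lose the last few seconds of writes,
# UPS or not.
zfs set sync=disabled tank/nfsshare

# The safer route: leave sync=standard and add a fast, power-protected SLOG.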

And how many IOPS are you expecting to need? With 2 RAIDz2 vdevs in your pool, you are looking at ~300. This doesn't seem like it will support what you are trying to do (lots of small reads and writes).

Suggestions:
- Consider adding a SLOG (I think the Intel DC S3700 is the latest recommendation).
- Consider using striped mirrors for much greater IOPS performance; a sketch follows below.
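A rough sketch of the striped-mirror layout for your 20 data disks (device names are placeholders; in practice FreeNAS builds this from the Volume Manager GUI):

Code:
# Ten 2-way mirrors striped together: each vdev contributes its own
# IOPS, so this pool has roughly 10x the random-write capacity of a
# 2-vdev RAIDZ2 layout, but only ~40TB usable instead of ~64TB.
zpool create tank \
  mirror da0 da1 \
  mirror da2 da3 \
  mirror da4 da5 \
  mirror da6 da7 \
  mirror da8 da9 \
  mirror da10 da11 \
  mirror da12 da13 \
  mirror da14 da15 \
  mirror da16 da17 \
  mirror da18 da19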
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
Hello,
* 2 512G sata disks for the OS
That seems fairly wasteful.

I will be using RAIDZ2 for the extra protection and will be setting up a pool of two 10-disk vdevs.
Is this for VM storage? If so, it'd probably be better to do striped mirrors. Otherwise, it might be a good idea to make a zpool consisting of four 6-disk RAIDZ2 vdevs (sketched below).
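As a sketch of that layout at the command line (placeholder device names; note that four 6-disk vdevs implies 24 data disks):

Code:
# Four 6-disk RAIDZ2 vdevs in one pool: random IO scales with the
# number of vdevs, and any two disks per vdev can fail.
zpool create tank \
  raidz2 da0 da1 da2 da3 da4 da5 \
  raidz2 da6 da7 da8 da9 da10 da11 \
  raidz2 da12 da13 da14 da15 da16 da17 \
  raidz2 da18 da19 da20 da21 da22 da23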

1. Since I have so much memory, should I use it instead of an L2ARC to gain some increased performance?
You didn't provide any information regarding your proposed L2ARC device. As far as whether you'd benefit from L2ARC, maybe? probably? definitely? It depends on your workload.

2. The system is on a UPS; therefore I am thinking that I can avoid using a separate ZIL (SLOG) device and turn off sync on each pool for NFS activity.
Not a great idea.

I'm looking to improve read/write performance via NFS but cannot afford to use RAID0+1 mirroring for this application.
You can try the zpool setup I mentioned above, but get a proper SLOG device.

edit: and @depasseg ninja'd my response. :mad:
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
You don't really have a choice as to whether or not to use your RAM for ARC; FreeNAS will use it as ARC.

With 256GB of RAM and lots of small files being read, you might be well served by a decent L2ARC device. You could go with something like the Intel 750 1.2TB, or a pair of the 400GB models might be slightly zippier, but it really depends on how your pool is laid out and what the file access patterns are like. What do you mean by "small files"? 1KByte? 1MByte? Are there any locality effects that can be expected from the access, or is it totally random?

If there's no need for writes to be 100% reliable, a SLOG device can be omitted. If the data was something like VM disk data, database, or transactional information, then that'd be a bad idea. Since you haven't indicated what's being stored, it's impossible to know.

How much space are you expecting to need? "cannot afford RAID0+1" raises alarm bells. If you're planning to use most of your space, then what'll happen is that things will seem awesome-fast at first, until the pool starts to get a little full and fragmentation rears its ugly head; then things will get progressively slower over time. You need to throw extra resources at a ZFS pool in order to maintain "very heavy writes." If you're planning to use more than 30-40TB of that 64TB pool, you need to add more space now. Use 6TB disks instead.
 

Mirfster

Doesn't know what he's talking about
Joined
Oct 2, 2015
Messages
3,215
Quick look at this system shows only 8x 3.5" Hot-swap SAS/SATA HDD bays. Either there is another option (2.5" bays), it can house 12 drives internally, you are using an external JBOD, or I am missing something... /Leaning towards me missing something...
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
And how many IOPS are you expecting to need? With 2 RAIDz2 vdevs in your pool, you are looking at ~300. This doesn't seem like it will support what you are trying to do (lots of small reads and writes).

It *could* if the access patterns allow the working set to be held in ARC and L2ARC. This system *could* be successful if it was storing data that had some sort of locality effect (temporal, etc.) that allowed it to be soaked up by the ARC/L2ARC, and also had enough free space maintained that ZFS could allocate contiguous ranges for its transaction groups most of the time. Under such circumstances, it could *appear* to have ten times (or more) the IOPS the pool is physically capable of.

But that's why I was a little worried about the hesitancy to go with mirrors. :smile:
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
It *could* if the access patterns allow the working set to be held in ARC and L2ARC. This system *could* be successful if it was storing data that had some sort of locality effect (...)
Boy, someone is optimistic today. :smile:

And even if both of those cases were true, I don't understand the hesitancy to go with striped mirrors.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Boy, someone is optimistic today. :)

That's not optimistic. I can make ZFS go fast. Part of the trick in architecting solutions is to understand the nature of the challenge, and then to do it with a modest resource expenditure.

And even if both of those cases were true, I don't understand the hesitancy to go with striped mirrors.

Right. That's the concern here. That plus a near-total lack of insight into the workload.
 

L3192

Dabbler
Joined
Jan 25, 2016
Messages
22
@depasseg
* Is there a best practice for controlling or limiting how much RAM is used for ARC?
* Will definitely look into using a SLOG with the fast DC S3700.
* The striped mirrors could be an option for some pools but not all. I just hate the thought of losing everything if the wrong two disks go bad at the same time.

@anodos
* Will definitely use a smaller disk, such as a 240GB DC SSD.
* Not running any VM storage at all.
* The four 6-disk RAIDZ2 vdevs are a good idea. Have you seen this configuration improve performance?
* Will definitely use the SLOG as mentioned above.

@jgreco
* I like the idea of using the Intel 750 1.2TB for L2ARC.
* Will be doing lots of reads on file sizes ranging from 4K-150K up to 100M, in directories up to 4G (or larger, 10G); mostly flat files and some binary files.
* Mirrored is still a concern, but as mentioned above I could set up some pools with striped mirrors and the rest with RAIDZ2.
* I guess setting up the smaller pools to create more vdevs will mitigate, or at least slow down, the fragmentation that will cause the slowdown and latency issues you mentioned. Is this correct, or is there another benefit?
* I will probably add an additional 16 SSD drives to the pool, which is now tempting me to use mirrors, but the thought of potentially losing everything, along with the restore and recovery time from tape, is just not very appealing.

@Mirfster
* The system is a JBOD box which can hold up to 72 drives.



A few more questions for everyone besides those specific ones above:

1. Regarding the RAM and ARC relationship with ZFS:
* Has anyone needed to control the size of the ARC to deal with system performance?

2. If I have several pools, will I need individual SLOGs for each pool, or can this be shared?
* If so, does the SLOG device need to be partitioned and shared across pools?

3. Same for the L2ARC: is this one device for all pools? If so, will the device need to be partitioned to share with each pool?

4. Does anyone run striped mirrors in production? If so, what strategy do you use to mitigate potentially losing two disks from the same mirrored pair?

5. Do most admins use the FreeNAS GUI, or do you ssh into the box and perform these tasks from the command line?

6. What strategies are typically used to deal with NFS latency?

Some of these are general questions, but I would like to find out what folks do for these tasks.

Apologies for the long post.

Thanks for all the great information; this is very helpful.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
@anodos
* The four 6-disk RAIDZ2 vdevs are a good idea. Have you seen this configuration improve performance?
It'll probably improve performance for random IO. http://blog.delphix.com/matt/2014/06/06/zfs-stripe-width/
Of course, if you are going to fill these to the brim with data then all bets are off.

2. If I have several pools, will I need individual SLOGs for each pool, or can this be shared?
* If so, does the SLOG device need to be partitioned and shared across pools?
Why do you have several pools? You potentially need a separate SLOG device for each pool (assuming all pools are NFS shares). Ditto for L2ARC.

4. Does anyone run striped mirrors in production? If so, what strategy do you use to mitigate potentially losing two disks from the same mirrored pair?
Set up email alerting and SMART tests, and keep an eye on things (see the sketch below). Replicate to a second FreeNAS system for backups.
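For the drive-health piece, FreeNAS schedules SMART tests for you through the GUI; underneath, the checks look something like this (a sketch; /dev/da0 is a placeholder):

Code:
# Kick off a long SMART self-test on one disk.
smartctl -t long /dev/da0

# Review health attributes, error logs, and self-test results afterwards.
smartctl -a /dev/da0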

5. Do most admins use the FreeNAS GUI, or do you ssh into the box and perform these tasks from the command line?
You shouldn't be using the CLI to administer FreeNAS. This will change in FreeNAS 10, which has a CLI that can do everything the web GUI does.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
How can a system be a JBOD? :) Out of curiosity, which 72-drive chassis? How are you planning to connect the drives (which HBA or RAID card)?

The question was,

Quick look at this system shows only 8x 3.5" Hot-swap SAS/SATA HDD bays. Either there is another option (2.5" bays), it can house 12 drives internally, you are using an external JBOD

It is common in larger NAS/SAN systems to have a separate head unit, with JBOD units to hold the disks. The answer makes complete sense in context. You take a head unit and use external SAS to attach a JBOD shelf such as any of the stuff at

http://www.supermicro.com/products/nfo/chassis_storage.cfm

For FreeNAS, you just use an HBA with external ports, or a standard 9211-8i with an internal-to-external adapter.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
The question was,



It is common in larger NAS/SAN systems to have a separate head unit, with JBOD units to hold the disks. The answer makes complete sense in context. You take a head unit and use external SAS to attach a JBOD shelf such as any of the stuff at

http://www.supermicro.com/products/nfo/chassis_storage.cfm

For FreeNAS, you just use an HBA with external ports, or a standard 9211-8i with an internal-to-external adapter.
Ahh, never mind. Reading comprehension fail on my part. It's not like my job requires attention to detail... Crap. :oops:
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
@depasseg
* Is there a best practice for controlling or limiting how much RAM is used for ARC?

Yes. Don't even think about it. You actually want to use MORE RAM for ARC, if anything, since FreeNAS is a little conservative and reserves a bit too much for itself.

* The striped mirrors could be an option for some pools but not all. I just hate the thought of losing everything if the wrong two disks go bad at the same time.

Right, which is why you can use a three-way mirror. That describes our VM filer here, where the design requirement was that a single disk failure may not compromise redundancy. This implies three (or more) disks per mirror.
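For illustration, a pool built from three-way mirrors looks like this (a sketch; device names are placeholders):

Code:
# Each vdev is a 3-disk mirror: one disk can fail without the vdev
# losing redundancy, at the cost of two-thirds of raw capacity.
zpool create tank \
  mirror da0 da1 da2 \
  mirror da3 da4 da5 \
  mirror da6 da7 da8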

* I like the idea of using the Intel 750 1.2TB for L2ARC.

Yes, though it might be a bit fat depending on the specific application.

* Will be doing lots of reads on file sizes ranging from 4K-150K up to 100M, in directories up to 4G (or larger, 10G); mostly flat files and some binary files.
* Mirrored is still a concern, but as mentioned above I could set up some pools with striped mirrors and the rest with RAIDZ2.
* I guess setting up the smaller pools to create more vdevs will mitigate, or at least slow down, the fragmentation that will cause the slowdown and latency issues you mentioned. Is this correct, or is there another benefit?

Setting up smaller pools doesn't mitigate fragmentation. Storing less data, supplying more space, or being very careful about the manner in which you store data can mitigate fragmentation.

* I will probably add an additional 16 SSD drives to the pool, which is now tempting me to use mirrors, but the thought of potentially losing everything, along with the restore and recovery time from tape, is just not very appealing.

What potential of losing everything?

1. Regarding the RAM and ARC relationship with ZFS:
* Has anyone needed to control the size of the ARC to deal with system performance?

When you get into larger systems, FreeNAS reserves too much memory for itself, and you can tune it to allow the ARC to get larger. Can you explain why you have this recurring idea that ARC size is bad/needs to be controlled?

ARC == awesomest use of RAM on a FreeNAS box.
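If you ever need to push the ceiling the other way on a big-RAM box, it's a single tunable. A sketch (the value is purely illustrative; on FreeNAS you'd set it under System -> Tunables rather than editing files by hand):

Code:
# Current ARC size and ceiling, in bytes.
sysctl kstat.zfs.misc.arcstats.size
sysctl vfs.zfs.arc_max

# Example loader tunable letting the ARC grow to ~240GB on a 256GB box;
# add it via the GUI's Tunables page so it persists across reboots.
# vfs.zfs.arc_max="257698037760"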

2. If I have several pools, will I need individual SLOGs for each pool, or can this be shared?
* If so, does the SLOG device need to be partitioned and shared across pools?

You need a SLOG for each pool that you wish to have a SLOG. FreeNAS does not support partitioning of devices for this purpose, but it can potentially be done by hand ("voids warranty").
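As a sketch of the per-pool arrangement (pool and device names are placeholders):

Code:
# Log vdevs attach to a specific pool; one device cannot serve two
# pools without hand-partitioning it.
zpool add tank1 log nvd0
zpool add tank2 log nvd1

# (Alternatively, mirror a pool's SLOG so a device failure can't lose
# in-flight sync writes: zpool add tank1 log mirror nvd0 nvd2)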

3. Same for the L2ARC: is this one device for all pools? If so, will the device need to be partitioned to share with each pool?

Yeah, don't do that. Just add two smaller L2ARC devices. Two L2ARC devices are nicer even for a single pool.
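The cache-device version looks like this (again a sketch with placeholder names):

Code:
# Two cache (L2ARC) devices on one pool; reads are spread across both.
zpool add tank cache nvd2 nvd3

# Losing a cache device is harmless -- misses just fall through to the pool.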

4. Does anyone run striped mirrors in production? If so, what strategy do you use to mitigate potentially losing two disks from the same mirrored pair?

Of course LOTS of sites run striped mirrors in production. You can have hot spares standing by for a quick rebuild. If you're not confident that disk two in a two-disk mirror will be able to recover sufficiently quickly and the idea of losing redundancy drives you nuts, you can utilize three-way or four-way mirrors. Our rules here cause us to have three-way mirrors and hot spares. 52TB of disk burned to deliver 7TB of highly failure-resistant storage.

5. Do most admins use the FreeNAS GUI, or do you ssh into the box and perform these tasks from the command line?

You're supposed to treat it as an appliance.

6. What strategies are typically used to deal with NFS latency?

Latency is usually due to mandatory pool access. Provide large amounts of free space on the pool, which reduces fragmentation, which reduces the likelihood that ZFS will be writing more slowly than you're throwing data. Provide a SLOG device, to reduce write latency for sync writes. Provide gobs of ARC and L2ARC to accelerate reads. Use mirrors instead of RAIDZ to increase the IOPS capacity of your pool.
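A few things worth watching while you load-test, as a sketch (zfs-stats is a separate port, not in the base system):

Code:
# Capacity and fragmentation per pool; latency climbs as both do.
zpool list -o name,size,allocated,free,capacity,fragmentation

# ARC/L2ARC sizing and hit rates (if the zfs-stats port is installed).
zfs-stats -A

# Live per-disk I/O and latency.
gstat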
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
I will probably add an additional 16 SSD drives to the pool, which is now tempting me to use mirrors, but the thought of potentially losing everything, along with the restore and recovery time from tape, is just not very appealing.
Have you looked into ZFS replication to a second FreeNAS system? It has a higher up-front cost, but makes recovery much easier than pulling data from tapes (and less costly in terms of down-time and labor spent pulling backup tapes).
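For context, FreeNAS replication is snapshot-based zfs send/receive under the hood (a sketch; hostnames, dataset names, and snapshot names are placeholders):

Code:
# Initial full replication: snapshot, then send the whole stream.
zfs snapshot -r tank/data@snap1
zfs send -R tank/data@snap1 | ssh backup-nas zfs receive -F backup/data

# Subsequent runs send only the changes between snapshots.
zfs snapshot -r tank/data@snap2
zfs send -R -i snap1 tank/data@snap2 | ssh backup-nas zfs receive -F backup/data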
 

L3192

Dabbler
Joined
Jan 25, 2016
Messages
22
Have you looked into ZFS replication to a second FreeNAS system? It has a higher up-front cost, but makes recovery much easier than pulling data from tapes (and less costly in terms of down-time and labor spent pulling backup tapes).

I like the replication solution. When you say "higher up-front cost", can you elaborate on this a bit more? Thanks!
 

L3192

Dabbler
Joined
Jan 25, 2016
Messages
22

OK, so the three-way mirror sounds expensive, but it may be doable and more reliable.

How would you lay out 33 960GB DC SSDs? (Assuming we're doing mostly heavy sequential reads/writes, with the file sizes mentioned previously.)
* Would you have just one pool or multiple?
* Would you have two Intel 750 400GB SSD PCIe cards for L2ARC, or one 1.2TB 750 card installed?
* What size SLOG would you suggest, and how many?
* Also, what percentage of disk usage would you say is optimal as a ratio of used/free space to sustain decent performance?
* Would turning off compression benefit performance when every bit counts?
 

L3192

Dabbler
Joined
Jan 25, 2016
Messages
22
You want to replicate to a second FreeNAS server. Unless you have an unused one lying around, you will need to purchase more hardware.

I"m guessing the replication will occur via the zfs-send/recieve tool or is their something else used.
* How often would you typically replicate to avoid overloading the production server?
* What types of connections would you use to improve the transfer speed(10G nics? trunked etc..)?

This would be an ideal solution depending on the speed and frequency of the replication.
 