Questions about RAM, Drives and Network cards for a FreeNAS cloud server....


dotpoka

Cadet
Joined
Sep 12, 2011
Messages
7
Hello! First, I love FreeNAS... it's an excellent product. I have been running a FreeNAS server in a farm environment for about 6 months now, since just after 8.0 was released.

I need some guidance on my hardware. I am using FreeNAS with ZFS2 in iSCSI mode to support 4 servers operating a cloud of virtual machines. I have a 12-bay server with each bay filled with WD RE4 hard drives connected to 2 PCI-X 64-bit controllers (6 on each).

---------
HARD DRIVES: how best to allocate to STORAGE vs. SPARE vs. MIRROR vs. CACHE
---------

Each of my 12 drives is a 2TB Western Digital RE4 (purchased refurbished from Newegg for $99 each!!!).

My setup: 8 of my 12 drives are allocated to STORAGE, forming a 9.7 TB volume with ZFS2. 1 of my 12 drives is allocated as a SPARE, 1 is allocated as CACHE, and 2 are allocated as a LOG MIRROR.

QUESTION: I don't see any SWAP utilization taking place. Does this relate to the CACHE drive? If so, should I replace that 2TB RE4 with a smaller, 10K RPM 75GB Western Digital VelociRaptor?

[Screenshot: swap utilization graph]


QUESTION: I don't see where/how the 2 drives allocated to the LOG MIRROR are being used. How much of the 2TB + 2TB drives is utilized by the LOG MIRROR? I am also thinking about swapping out these larger drives for some smaller, faster 10K RPM 75GB VelociRaptors. (Refurbished from Newegg, the VelociRaptors were only $80 each!!!)

-----
RAM: how much is enough?
-----
Right now I have 16 GB of RAM, and I see that my Physical Memory Utilization is always at or near capacity. I recently purchased more RAM and am trying to decide where to allocate it. What kind of performance boost would upgrading to 32GB of RAM give me?

[Screenshot: physical memory utilization graph]


--------------------
NETWORK CARDS: do you think this would work?
--------------------

I appear to be saturating my dedicated iSCSI network interface attempting to serve all 4 of my cloud servers. So, I was going to install an additional 4 gigabit NICs and connect each server directly to the FreeNAS device, so there are no hubs, switches, or other points of contention in the way. All 4 of the servers are mounting the same iSCSI target (which is supported by my failover cluster service).

QUESTION: With 4 additional network cards installed, each one cross-connected directly to its dedicated cloud server, can the iSCSI service be configured to have multiple IP addresses mapped to the same EXTENT?


----------------

That's the end of all my questions... sorry for such a long post. I was also looking at upgrading my CPUs from 2 x dual-core Xeon to 2 x quad-core Xeon, but I see my CPU average is 5%. This is my chart for CPU usage; it's pretty low, so it does not look like that upgrade would do much for me. Any input is appreciated.

[Screenshot: CPU usage graph]


Thanks !!!!

Brett
 

b1ghen

Contributor
Joined
Oct 19, 2011
Messages
113
Wouldn't you benefit a lot more from using SSDs for cache/ZIL? Sure, a dedicated drive is probably a little better than no cache/ZIL at all, but an SSD is orders of magnitude faster than a mechanical drive.
 

Durkatlon

Patron
Joined
Aug 19, 2011
Messages
414
The RAM use is normal. ZFS will over time consume (almost) all that is available. This is nothing to worry about.
 

budmannxx

Contributor
Joined
Sep 7, 2011
Messages
120
Not sure if I'm understanding this correctly, but shouldn't your eight 2TB storage drives result in something closer to 12TB of available storage? I'm assuming by "ZFS2" you meant ZFS in a raidz2 configuration. With 8 total drives that would be 6 drives + 2 for redundancy, so around 12TB of available storage. For the cache and log drives, obviously you've already spent the money to get them set up, but I've read elsewhere on the forum that typically you should put in as much RAM as your motherboard can handle before worrying about cache and log drives. Some links with decent discussions of log and cache drives (and ZFS in general):

http://www.markround.com/archives/35-ZFS-and-caching-for-performance.html
http://www.zfsbuild.com/

As for your swap usage, I'm not sure, but I'd imagine the graph only refers to actual swap space on your hard drives--it doesn't view your cache drive as swap space. The default setting in FreeNAS 8 is 2GB for each new drive added, so your 12 drives would have about 24GB swap unless you changed this setting before adding the drives. I see about 24GB in total in your graph.

I'm not too familiar with multiple network cards either, but I believe that in a system like yours the network is likely to be a bottleneck in terms of data transfer, so adding more cards would probably result in a decent performance enhancement.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Reported swap utilization for a system with lots of memory darn well ought to be zero. :smile:
 

survive

Behold the Wumpus
Moderator
Joined
May 28, 2011
Messages
875
Hi dotpoka,

Let me go through your questions in order:

1) Swap serves 2 functions. It's obviously swap in the traditional sense of the word, but while ZFS is aggressive in trying to use every bit of RAM it can, it uses it for cache, so it's just as willing to free it up. Basically, the system shouldn't really need to swap. The swap partition also serves as a small hedge against a replacement disk being just a tiny bit too small. Most drives of a given class are within a few MB of each other in size, but you can't use a disk that's any smaller as a replacement. The swap is insurance against that.

Honestly, I wouldn't use either drive as a cache drive. Ideally, for cache you would use an SSD of sufficient size to hold the amount of "hot" data, but the need for this also gets down to how fast your network connections are.

2) For the ZIL (write cache) you once again really want to use a pair of SSDs. The ZIL doesn't have to be very big... it only has to cache 30 seconds' worth of writes before it is flushed to disk. You can figure out how big (worst case) you need to go like this: each gig-e interface can do 125MB/s max, so you need capacity to buffer about 3.7GB of data per gig-e connection if you run them full-bore. You want the expensive SLC (enterprise) SSDs for this because of the number of writes they have to endure. I don't think you would see any gain swapping the RE4's out... in fact, I would bet that you are actually hurting your performance by using HDDs for the ZIL.
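For reference, attaching a mirrored log device to an existing pool from the command line looks roughly like this (the pool name "tank" and the device names ada10/ada11 are placeholders, not your actual pool or disks):

# worst-case SLOG sizing: 125 MB/s per gig-e link x 30 s of writes ~= 3.75 GB per link
# attach a mirrored log (ZIL) device to an existing pool
zpool add tank log mirror ada10 ada11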

3) ZFS loves the RAM, that's for sure. The reason it looks like you are always using all your RAM is that you are... ZFS is using it for caches. Adding more RAM won't hurt performance and, depending on what's actually dragging the system down, may help a ton. I know... that's just a long-winded "it depends", but I'll get further into why in a moment.

4) You can certainly do a dedicated connection from each server, but I'm not sure the network is the problem. How is the rest of the network configured? How is the filer connected? Good switches?

5) I don't think more CPU will help at all; you aren't working the processors hard at all.

So here's what I think is going on. I think you are hamstrung by your cache & log devices. If I were you, I would redo the storage to get rid of the HDD-based cache & log devices and make a volume out of a pair of 6-drive raidz2 virtual devices (this will require some more thought/discussion); see the sketch below. Pop in the other 16GB of RAM & see how it works. If that doesn't make it better, figure out if it's reads or writes (or both) that are choking out your performance and add some SSDs as appropriate.
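For what it's worth, a minimal sketch of what that rebuilt volume could look like from the command line, assuming the 12 disks show up as ada0 through ada11 (placeholder names) and you have moved the data off first, since this destroys the existing pool:

# create a pool striped across two 6-drive raidz2 vdevs (all 12 disks, no dedicated cache/log/spare)
zpool create tank raidz2 ada0 ada1 ada2 ada3 ada4 ada5 raidz2 ada6 ada7 ada8 ada9 ada10 ada11

In practice you would likely build this through the FreeNAS GUI rather than by hand, but the resulting layout is the same.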

-Will
 

louisk

Patron
Joined
Aug 10, 2011
Messages
441
Survive is right. For performance, you need to use something like an STEC Zeus RAM (http://www.stec-inc.com/product/zeusram.php) for your ZIL, and SSD for your cache. Use your disks for regular storage. I would suggest that you break your spindles into 4 groups of 3 for RAIDZ. ZFS will stripe across the RAID devices, so your performance is based on the number of RAID devices, not the number of spindles in a RAID device. If you feel compelled to have spares (not a bad idea, I do this myself, I have 8 spindles in mirrors and I keep 4 spares), I would keep them out of the chassis and replace as needed.

ZFS will use (essentially) all your available RAM as a cache so it doesn't have to go to disk to get data. The more RAM you have, the bigger the cache. The RAM cache works in conjunction with the "cache disk" (L2ARC in ZFS terms). If you have an L2ARC, it will use it, and SSD will be significantly faster than a traditional spindle.
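As an illustration only (device name is a placeholder), attaching an SSD as L2ARC to an existing pool looks like this; cache devices can also be removed again later without harming the pool:

# add a single SSD as an L2ARC (read cache) device
zpool add tank cache ada12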

That layout (SSDs for cache/ZIL, spindles for storage) is essentially the same architecture that the TrueNAS hardware uses, and they can handle 10G, so I think this is probably the path you want to look at.

Also, you may want to consider simply plugging your NAS into a good switch and using LACP or EtherChannel to aggregate the interfaces. You would never get more than the speed of one interface to a single host, but you can get that speed to many hosts at the same time.
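If you go the aggregation route, the underlying FreeBSD configuration (normally driven from the FreeNAS GUI rather than typed by hand) is roughly the following, with em0/em1 as placeholder interface names and the matching switch ports also configured for LACP:

# create an LACP aggregate of two interfaces and assign it an address
ifconfig lagg0 create
ifconfig lagg0 up laggproto lacp laggport em0 laggport em1 192.168.1.10/24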
 

dotpoka

Cadet
Joined
Sep 12, 2011
Messages
7
Some follow up questions/clarifications

Excellent input. Thank you all. Some follow-up questions/clarifications, if I may...

-- HARD DRIVES --

Putting any network bottleneck aside for the moment, to verify my understanding:

ZIL = Log Mirrors
CACHE = RAM + (L2ARC in ZFS terms)

I happen to have 3 x 120GB SSD drives with decent 500MB/s performance, and enough RAM to boost to 32GB.

Given my current resources already on hand, I should:

- upgrade the RAM to 32GB
- replace the CACHE DRIVE - or "L2ARC in ZFS terms" with one of my SSD drives
- replace the LOG MIRRORS - or "ZIL - write cache" with 2 mirrored SSDs

Thank you louisk for the reference to Zeus STEC, I will keep that in mind for the future. At the moment I must work with the components on hand.

I hope I have properly understood everyone's feedback... Important question though: is there a recommended approach, or things to avoid doing, while I am swapping out my LOG MIRRORS and the CACHE drive so that I don't jeopardize my data?

-- NETWORK ---

Switching back to the network bottleneck:

The system is currently using the two motherboard-based Intel gigabit NICs: 1 for management, 1 for iSCSI.

I am going to install 5 additional gigabit NICs in the FreeNAS server (3 x 1Gb Intel PCIe and 2 x 1Gb Intel PCI cards).

I mentioned that the purpose of this FreeNAS device is to provision clustered storage service over iSCSI to 4 fail-over configured cloud servers.

Based on the input provided above: my switch supports LACP and has decent bandwidth capacity. So my decision is between:

(1) aggregating the 5 ports into a single 5Gb/s LACP connection

or

(2) cross-connecting each machine directly to the FreeNAS unit on a one-to-one basis.

Which of these 2 approaches would work best?

I really appreciate this input, and I am adapting my go-forward plan accordingly.

- dotpoka
 

b1ghen

Contributor
Joined
Oct 19, 2011
Messages
113
dotpoka said:

I happen to have 3 x 120GB SSD drives with decent 500MB/s performance, and enough RAM to boost to 32GB.

Given my current resources already on hand, I should:

- upgrade the RAM to 32GB
- replace the CACHE DRIVE - or "L2ARC in ZFS terms" with one of my SSD drives
- replace the LOG MIRRORS - or "ZIL - write cache" with 2 mirrored SSDs

I would be very careful with using regular MLC-based SSDs as ZIL drives; they will wear out a lot faster than SLC drives.
My (theoretical) drive choice for the ZIL is the Intel 311 (Larson Creek), which is only 20GB in size but SLC and made to be a caching drive to begin with. I don't think you need more than 20GB for write caching in a normal scenario. I haven't tried this myself and I haven't found anyone who has either, but in theory it should be perfect as a ZIL.
 

louisk

Patron
Joined
Aug 10, 2011
Messages
441
My understanding is that even MLC should last at least several years, and for the cost of 32G (currently somewhere around $50), that's not too horrible. That said, yes, b1ghen is correct, SLC is better suited to the task.
 

dotpoka

Cadet
Joined
Sep 12, 2011
Messages
7
I have been researching SLC vs MLC. What I have learned is that SSD drives are like the tires on your car: SLC drives will last longer than MLC (about 10 times longer), but over time they both wear out.

What matters most is the "tread pattern" of the wear... and this is where the drive's controller makes all the difference with MLC drives. If you rotate your tires regularly they will last longer, so what counts is the capability of the SSD's controller to spread the usage over the entire surface of the drive... apparently Intel drives do this well. What I am considering is that if my requirement is roughly 4 GB of capacity per network interface, then 5 NICs works out to approximately a 20 GB drive. Since SLC lasts 10 times longer than MLC, a 200 GB MLC drive that evenly distributes wear over its lifetime would be roughly equivalent. And it's all moot since I have a 128 GB MLC SSD and no budget to get another at this moment... LOL.
 

b1ghen

Contributor
Joined
Oct 19, 2011
Messages
113
Just be careful, as a loss of the ZIL will mean loss of data, hence the mirroring of the ZIL drives. But I am sure you knew this already since you were planning to mirror them. Just hope both don't die at the same time :)
 

survive

Behold the Wumpus
Moderator
Joined
May 28, 2011
Messages
875
Hi dotpoka,

I wouldn't go too crazy with the networking yet. Also, try to avoid changing a whole bunch of things at once.

The first thing I would do is pop the RAM in & see what that does. You could also configure the 2 motherboard NICs for link aggregation without changing too much.

If you replace the ZIL with SSDs, I wouldn't worry too much about the wear right off the bat; if it fixes the problem, then making the sale on a proper solution is much easier.

-Will
 

louisk

Patron
Joined
Aug 10, 2011
Messages
441
Before you do anything with LAGG, please read about how it works and think about your situation and make sure that it does what you need. I see lots of people on these forums that don't understand how LAGG works and think it will magically make their network faster. In some cases this is true, but most people aren't doing this on a switch that supports any kind of management, which means that LAGG is likely not an option. If you want to know more, look up LACP or 802.3ad.
 

dotpoka

Cadet
Joined
Sep 12, 2011
Messages
7
Yes, thanks louisk for the pointers... I have used LACP and its associated 802.3ad protocol on my switch, so I know the underlying support is there. It is a bit of a "mystery technology", as I have found differences by manufacturer. My 3COM devices don't always work exactly the same way as my Cisco or my Linksys, etc. Very annoying... but I think that I have at least that part under control.

I am certain these forums are filled with the answer to this next question, and I will go hunting around, but if there are a top 2 or 3 commands to run, I would like to get a baseline of where my system is today. Tomorrow I am starting my upgrade, and I would appreciate a few pointers to the right places to look, or even some scripts I may want to run, to capture a good "before" and "after" comparison.
 

louisk

Patron
Joined
Aug 10, 2011
Messages
441
You can use (from the CLI) top, or 'systat -vmstat' or 'systat -iostat', to get an idea of what is going on under the hood. You could also run vmstat (figure out what options you want) and capture it to a text file; from there, you can probably (I don't recall if any munging is necessary or not) import it into Excel and do graphing of points over time.
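For example, one way (among many) to capture a baseline to a file before the upgrade; the interval, count and output path here are arbitrary choices:

# sample system statistics every 5 seconds, 120 times (about 10 minutes), and save to a file
vmstat 5 120 > /mnt/<volume name>/vmstat_before.txt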
 

survive

Behold the Wumpus
Moderator
Joined
May 28, 2011
Messages
875
Hi dotpoka,

A quick & dirty benchmark you can use is the plain old "dd" command. This isn't a scientific test by any means....it just measures how fast you can access the volume.

If you aren't familiar with this command be careful because it will do exactly what you tell it to! Basically it reads from some input and writes out to the output. For this exercise we will read from /dev/zero, which spews out "00000000"s as fast as the system can generate them and write them to a file on your array. The command looks like this:

dd if=/dev/zero of=/mnt/<volume name>/testfile bs=8192k count=1000

where:

if= input file, of= output file, bs= block size & count is how many blocks to write. You can vary the block size and count to write out a file that will fit into or exceed the system's memory.
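To put numbers on that: with bs=8192k each block is 8MB, so count=1000 writes roughly an 8GB file. To make sure the test file exceeds 32GB of RAM, you might bump the count, for example:

dd if=/dev/zero of=/mnt/<volume name>/testfile bs=8192k count=5000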

You can turn the command around like this:

dd if=/mnt/<volume name>/testfile of=/dev/null bs=8192k count=1000

to measure read speed.

FreeNAS 8.0.2 also ships with the "iozone" benchmark. This is certainly more scientific, and if you are skilled in the arts of Excel you can analyze the data all sorts of ways. Take a look here for more info:

http://www.iozone.org/

You can also use the "gstat" & "zpool" (with the iostat option) commands to see what sort of performance you are getting to the disks. Take a look at the respective man pages for more details.

Personally I would do an "iozone -a" and get the results in an Excel file just to have it no matter what.
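Something along these lines should do it (the output path is just an example):

iozone -a -b /mnt/<volume name>/iozone_results.xls

The "-a" runs the full automatic test set and "-b" writes an Excel-compatible results file.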

Please keep in mind that all these tools only show how the disks are reacting to requests internal to the server and all the speed in the world won't help if you are being choked out at the network connection.

-Will
 