FreeNAS slower than Windows Server 2012 on similar hardware

We have two similarly spec'd Aberdeen Storage servers at work. One is running FREENAS-9.3-STABLE-201506042008 and the other is running Windows Server 2012.

The FreeNAS box has 256GB ECC RAM, dual Intel Xeon E5-2630 v2 2.60 GHz CPUs, an Intel 82599EB 10Gbps SFI/SFP+ NIC, 3 LSI MegaRAID SAS 2208 controllers, and 45 6TB Hitachi drives. The RAID controllers are in JBOD mode and we're using 5 RAIDZ2 vdevs of 9 drives each, configured into a single 245 TB zpool.

The Windows Server 2012 box has essentially the same hardware but it has 60 drives instead of 45 and the RAID controllers are in RAID6 mode rather than JBOD.

We've noticed the throughput to the FreeNAS box maxes out at 300 MBps whereas we can saturate the 10Gbps link on the Windows box.

The testing we do involves multi-GB Veeam backups that are much larger than the cache on the RAID controllers.

Does anyone have any ideas about why we're getting so much worse performance out of the FreeNAS box? Is it possible the NIC driver is causing the throughput issue?
 

cyberjock

Inactive Account
Can you provide a debug file from your system? System -> Advanced -> Save Debug

Right now I'm betting that the LSI 2208 controllers are not in a true JBOD mode, which makes them not only a bad choice for you but also a performance killer. The debug file will tell me more.
 

mjws00

Guru
You didn't mention the protocol. If this is a single-threaded CIFS connection, Samba could very well be choking; all the cores and RAM won't help there, because you need multiple threads. We often see good gear choke around 300 MB/s over CIFS. If it's iSCSI, you should be able to get closer to theoretical line rate.

The approach I like: check the network stack, check the local pool performance (cj is on that), and test the protocols. A rough sketch of that sequence is below.
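
A minimal sketch of those three checks, assuming shell access on the FreeNAS box and a test client. The pool name "tank", the 100 GB test size, and the hostname are placeholders; iperf ships with 9.3 as far as I know, and /dev/zero numbers will be inflated unless compression is off on the test dataset:

iperf -s                                  # on the FreeNAS box: raw network throughput, no disks involved
iperf -c freenas-hostname -P 4 -t 30      # on the client: -P 4 runs parallel streams, since one stream rarely fills 10Gb

dd if=/dev/zero of=/mnt/tank/testfile bs=1M count=100000    # local pool write with the network out of the picture
dd if=/mnt/tank/testfile of=/dev/null bs=1M                 # local pool read
zpool iostat -v tank 5                                      # per-vdev activity while the dd runs

If the network test and the local dd are both fast but CIFS still tops out around 300 MB/s, the protocol (or a single slow client thread) is the likely culprit.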

Good luck; maybe cj will spot something interesting.
 

cyberjock

Inactive Account
Notes on what I saw from the debug file.

- Not on the latest version (you're on the June 4 build); not a particularly big deal
- Your zpool is multiple 9-disk RAIDZ2 vdevs (5 total) + 2x L2ARC + SLOG
- You are using hardware RAID (this is a total failure IMO)
- There is no SMART data on your disks because of the RAID controller (another total failure IMO)
- You are using an Intel NIC for the 10Gb. You'll never truly get saturation on Intel 10Gb; 6-8 Gb/s is about the best you're going to get on a real network workload.
- You set the CIFS max protocol to SMB3_00 (not recommended IMO)

So here's some stuff you need to know for your situation:

  • Your zpool should give very good throughput for workloads that can be prefetched, which means NFS and CIFS only. iSCSI will not necessarily give good performance, now or in the future. Even if it appears to perform well now, you will watch your zpool performance slowly decrease and things become more fragmented because of limited IOPS (see the next bullet for more).
  • Your zpool is going to be IOPS-limited because of the number of vdevs you have. If you are not concerned about IOPS, this may not be a problem. But iSCSI is definitely all about IOPS because it is a block device: it cannot be effectively prefetched by ZFS and it will end up with significant fragmentation in the long run.
  • Your RAID controller is totally inappropriate for ZFS from a performance standpoint. More than likely the RAID controller is caching your reads and writes, which is in direct competition with ZFS' own scheduler. Likewise the write cache on your RAID controller is probably making ZFS think that each disk does multiple GB/sec (which we know is impossible). This further screws up the scheduler for ZFS.
  • Your RAID controller is also masking all SMART data. So if a failing disk was responsible for your performance problems, you have no way to actually prove that is the case nor do you have a good way to identify the failing disk without serious manual labor and troubleshooting.
  • If network performance is what you care about, the Intel NICs are not the ones to use. Chelsio T520s are pretty much the card of choice on FreeBSD; they are without a doubt the undisputed leader in network performance. They also come with a fairly hefty price tag.
  • CIFS is single-threaded, so you aren't going to saturate 10Gb with a single CIFS connection unless both the server and client have very fast single-core performance, at least not with Windows clients. My own experience shows I get much, much better CIFS performance from Linux clients than from any Windows client, so if you are trying to maximize CIFS throughput, go to Linux. In my review of the FreeNAS Mini I put a 10Gb card in it and did performance comparisons to/from the Mini using Windows and Linux; Linux pretty much always won, and by quite a large margin at times.
  • There was no way for me to tell what your slog and l2arc devices are, how big they are, etc., so here are some things to consider. If those devices do not perform appropriately for the workload they're used for, they can actually hurt performance. If your slog and/or l2arc devices aren't top of the line in benchmarks, expect them to hold your zpool back, and if they are inappropriately sized they can create new problems for you. A quick way to check whether they are actually in play is sketched after this list.
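
A minimal way to check that from the shell, assuming pool "tank" and dataset "tank/backups" as placeholders; sync=disabled is a short-lived diagnostic only, since it drops the crash-consistency guarantee for those writes:

zfs get sync,compression tank/backups    # with sync=standard and a CIFS client that isn't requesting sync writes, the slog sees little traffic
zpool iostat -v tank 5                   # shows how hard the log and cache devices are actually hit during a backup run
zfs set sync=disabled tank/backups       # DIAGNOSTIC ONLY: if throughput jumps, the slog device is your limit
zfs set sync=standard tank/backups       # put it back as soon as the test is done

If the log and cache devices barely see any traffic during a Veeam run, they aren't your bottleneck; if they're pegged, the consumer SSDs probably are.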

Clearly, with a few changes there is lots of room for your server to perform better. How much better depends on your workload and other factors that I can't really hypothesize about here. If you can't replace the RAID card, then I would treat that as a decision not to use FreeNAS any further, because it will eventually mean problems and heartache. I just replied to someone else in another thread who made the mistake of not having SMART enabled on his disks, and he had no indication that anything was wrong until all of his data was lost.

Your system is definitely big and beefy. Generally, by the time people even consider a server 1/3 of your size, we recommend they simply contact iXsystems and buy one of their servers. They deal with large-scale servers and can help make sure you aren't bottlenecking yourself in multiple different ways like you currently are. Once you've resolved the issues above there may be other limitations, but there are few resources on the forum to help you with a server at this scale. What works when you are a small-scale home user doesn't necessarily work when you have something 10 times bigger.

Good luck!
 

schultmc

Cadet
Thanks, cyberjock, for your quick reply.

I was told by the colleague who set up the FreeNAS box that the RAID controllers were put in JBOD mode, not RAID mode. Could you point me to where in the debug file it shows RAID being active on the controllers? I already knew FreeNAS needed direct access to the disks and thus RAID should have been disabled, and I assumed my colleague had done so correctly.

Should we reconfigure the zpool to use fewer disks per vdev (for example, 6-disk vdevs) instead of the existing 9-disk vdevs?

We're not opposed to swapping out the Intel NIC for a Chelsio.

We're currently using CIFS to connect to the FreeNAS (and Windows) box.

The slog and l2arc devices are standard Crucial consumer grade SSDs.

What should the CIFS max protocol be set to?

The reason we built this ourselves rather than getting a TrueNAS from iXsystems was because of budgetary issues.

Thanks again.
 

cyberjock

Inactive Account
It may be set to JBOD, but it is not actually acting like a JBOD. This is obvious to me because the disks are labeled as mfids and not adas or das. I've never used a 2208-based controller, but I'm betting it says JBOD while actually creating a bunch of single-disk RAID arrays and calling that JBOD. That is not JBOD. Not even in the slightest. It is, however, one of the biggest noobie mistakes you can make with ZFS. :P
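
If you want to confirm what FreeBSD actually sees, a quick look from the shell will show it (output obviously varies per system):

ls /dev | egrep 'mfid|ada|^da'    # mfidN nodes mean the mfi RAID driver is in the path (single-disk arrays); daN/adaN mean direct-attached disks
camcontrol devlist                # disks behind a true HBA (e.g. an LSI 2008/2308 flashed to IT mode) show up here as da/ada devices
glabel status                     # maps the pool's gptid labels back to whichever device nodes are actually in use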

Yeah, standard consumer-grade SSDs are not what you need. You need either high-end consumer SSDs (don't just take someone's word for what is high-end, either; do the research to find out what actually works well as an L2ARC and/or slog) or industrial-grade SSDs.
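
One cheap test before you buy anything: log and cache devices can be detached from a live pool, so you can benchmark with and without them. The pool name and gptid device names below are placeholders (copy the exact names from zpool status), and on FreeNAS it's safer to do the remove/re-add through the GUI so the middleware stays in sync; the commands just show what happens underneath:

zpool status tank                             # note the exact names listed under "logs" and "cache"
zpool remove tank gptid/xxxx-slog-device      # detach the slog (placeholder name)
zpool remove tank gptid/xxxx-l2arc-device     # detach the l2arc (placeholder name)
zpool add tank log gptid/xxxx-slog-device     # re-add later if it turns out they were helping
zpool add tank cache gptid/xxxx-l2arc-device

If throughput stays the same (or improves) without them, those Crucials are dead weight at best.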

The CIFS max protocol should be left at its default unless you've extensively tested the SMB3 protocol. It's no secret that SMB3 is not well supported in Samba (the package that provides the CIFS/SMB protocol for Linux/Unix).
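
If you want to confirm what the CIFS service is actually running with, Samba's testparm dumps the effective configuration; the smb4.conf path below is what 9.3 uses (adjust if yours differs), and the GUI setting under Services -> CIFS is the supported way to change the protocol rather than editing the file by hand:

testparm -s /usr/local/etc/smb4.conf | grep -i 'max protocol'    # no output means no override is set and Samba negotiates its own default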

I can understand the budgetary issues, but, as I said before, this can quickly turn into something more expensive than buying a TrueNAS from iXsystems. A large-scale server like the one you are trying to build is not trivial to implement, not trivial to troubleshoot, and not trivial to get the performance you want at the price you'll pay. I definitely don't get particularly involved with large-scale servers like yours in the forums and IRC; it can turn into a major time-sink as you spend weeks (or months) trying to find out what a problem is, then replace/upgrade/tune as necessary.

You definitely won't be the first to try to do this while citing budgetary constraints as the reason for not going with iXsystems. Many people ultimately do give up on FreeNAS because it is not worth the time and effort to try to figure out what the problem is, let alone try to figure out how to fix it.

This is absolutely going to be an uphill battle. If you think you're going to replace a couple of pieces of hardware and be up and running, you should probably just give up on FreeNAS right now. It does not come that easily when you scale up to what you are doing. I don't think I can exaggerate how difficult this can be for you, and the problems you will encounter along the way are not going to be resolved with a forum post or two. The issues you run into may not be discussed on the forums or IRC at all, because almost nobody buys hardware like yours and then chooses to handle all of the software setup in-house. When companies do risk analysis, many of them shy away from the kind of risky adventure you're starting to embark on. More than one person here has dropped money equivalent to the cost of a new vehicle and ultimately never got FreeNAS working at full speed because they failed to consider the complexity of the software and ZFS.

If the server can be used with Windows, I'd strongly recommend you go that route. This just isn't likely to end well for you in the long run if you continue down the path with FreeNAS. You've already made one fatal mistake (using hardware RAID) and I have no doubt that you are likely to make a few more before you figure this all out. Unfortunately, you'll likely figure out what you did wrong only after it is too late (if it works, why would you even suspect something is wrong?). Backing up or restoring the quantities of data you are considering is extremely time-consuming, and serious time (and time = money) is lost when a zpool of your magnitude goes down and has to be restored from backup.

This really feels to me like you are going to die of 1000 cuts before you realize you should have either not gone with FreeNAS or gone with iXsystems and a support contract. You are absolutely in new territory for the forum. It's nice from the e-penis standpoint, but is a bad place to be when you want performance and reliability. Trailblazing often leads to problems, tears, and lots of lost data you had no idea was coming until it was too late.
 
L

Guest
Aberdeen is highly skilled with ZFS; they were one of the first OpenZFS vendors. I would go back to them and ask them some of these questions.
 