Hardware configuration for performance (2 x SAS HBAs, 2 x SAS JBODS)


elliotpea

Cadet
Joined
Jul 28, 2017
Messages
7
Hi,

We're going to repurpose our current production SAN (running Nexenta) into a FreeNAS server used for backup storage (disk images via scp, rsyncs of millions of files). We have this hardware available:

2 x dual-CPU Xeon servers, 96GB memory, dual 10Gb NICs in each.
2 x STEC ZEUS 8GB SSDs for ZIL
2 x OCZ Talos 460GB for L2 ARC
2 x LSI 9200-8e SAS HBA
2 x DataON 1600D SAS 24 Bay JBODs with dual SIMs (controllers?) (manual: https://goo.gl/ycEjtA)

Right now we're in an active/passive setup using Nexenta. We will be using just one server for FreeNAS, so I'll move as much memory as I can from the other server into that unit. Redundancy isn't a worry; be it server failure, HBA failure or JBOD failure, I'm not too concerned.

We will have 24 x 4TB SAS disks, in mirrored vdev pairs.

My question is regarding performance of SAS bandwidth. Specifically:

  • Is it best to put the ZIL and L2ARC in the server rather than the JBOD to use internal HBA bandwidth, or perhaps on their own dedicated HBA / PCIe slot?
  • Should I look at using 2 x JBODs and splitting mirrored vdevs between them to increase bandwidth? Or,
  • Can I use multiple SAS HBAs connecting to the single JBOD to increase overall SAS bandwidth?
  • What is the SAS bandwidth between the HBA and the JBODs, is it 6Gb/s per port? Or more?

Thanks,

Elliot
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
My question is regarding performance of SAS bandwidth. Specifically:

  • Is it best to put the ZIL and L2ARC in the server rather than the JBOD to use internal HBA bandwidth, or perhaps on their own dedicated HBA / PCIe slot?
  • Should I look at using 2 x JBODs and splitting mirrored vdevs between them to increase bandwidth? Or,
  • Can I use multiple SAS HBAs connecting to the single JBOD to increase overall SAS bandwidth?
  • What is the SAS bandwidth between the HBA and the JBODs, is it 6Gb/s per port? Or more?
I would say that some of the answers you seek would depend on the exact configuration of hardware you end up with, which is not yet defined.
You listed the 24-bay JBOD chassis, but it would be helpful to know what kind of chassis the server board is in, what the board is, and what capabilities it has. Theoretically, the ZIL and L2ARC would be best implemented internal to the server, but I can't really say if that is even a possibility without knowing more about that server. That said, a ZIL and L2ARC would be most useful if you are doing iSCSI or running virtual machines; if you are going to set this up as a Samba share, they won't help much, if at all. How do you plan to use the system?
You could cascade from one JBOD to the next to connect all the drives, but that might limit you depending on the SAS version of the chassis or drives. If that unknown server has slots for it, I would install both of those LSI 9200-8e cards and connect a JBOD to each, so you have 48 drives of online storage instead of 24. With that in mind, I would set them up as 6- or 8-drive RAID-z2 vdevs instead of mirrors: because this is a backup system, you probably don't need all the speed that mirror sets give, but you might like the capacity that is available.
If you used 6 drives per vdev in RAID-z2, your pool could have a capacity of about 112TB.
If you used 8 drives per vdev in RAID-z2, your pool could have a capacity of about 119TB.
Whereas, if you used mirrors, you would only get about 84TB using the whole 48 drives.
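If you want to play with the layout numbers yourself, the raw arithmetic is easy to sketch (Python; this ignores ZFS metadata/padding and the TB-vs-TiB conversion, which is why the figures above come out a bit lower):
Code:
# Rough raw-capacity arithmetic for 48 x 4TB drives. Ignores ZFS
# metadata, padding and the TB-vs-TiB conversion.
DRIVES = 48
DRIVE_TB = 4

def pool_capacity_tb(vdev_width, data_disks_per_vdev):
    vdevs = DRIVES // vdev_width
    return vdevs * data_disks_per_vdev * DRIVE_TB

print("6-wide RAID-z2:", pool_capacity_tb(6, 4), "TB")  # 8 vdevs  -> 128 TB raw
print("8-wide RAID-z2:", pool_capacity_tb(8, 6), "TB")  # 6 vdevs  -> 144 TB raw
print("2-way mirrors: ", pool_capacity_tb(2, 1), "TB")  # 24 vdevs -> 96 TB raw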
The rated SAS/SATA interface speed for that card is 6.0Gb/s per link. It has two external SFF-8088 connectors, each carrying 4 links (8 x 6Gb/s, minus overhead), so the aggregate bandwidth would be a little less than 48Gb/s. It is also worth noting that it is an 8-lane PCIe 2.0 card, with each PCIe lane having a maximum speed of 500MB/s (8 x 500, minus overhead), which leaves you with an estimate of less than 4000MB/s for the interface to the system board. The two standards are expressed differently in the documentation (Gb/s vs MB/s), and it can be confusing when one is given in bits and the other in bytes. It works out that the SAS side of the card would be capable of roughly 6000MB/s, but the system interface through PCIe 2.0 is only able to handle about 4000MB/s (all minus some amount for overhead). This is why I said to use both SAS controllers, one for each JBOD.
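The same back-of-the-envelope math in code form (this just reproduces the estimate above; real links lose a bit more to 8b/10b encoding and protocol overhead):
Code:
# Back-of-the-envelope bandwidth for an LSI 9200-8e class HBA.
SAS_LINKS = 8             # two SFF-8088 connectors x 4 links each
SAS_GBPS_PER_LINK = 6.0   # SAS2 line rate per link
PCIE_LANES = 8            # PCIe 2.0 x8 card
PCIE_MBPS_PER_LANE = 500  # PCIe 2.0 throughput per lane

sas_side = SAS_LINKS * SAS_GBPS_PER_LINK * 1000 / 8  # bits -> bytes, ~6000 MB/s
pcie_side = PCIE_LANES * PCIE_MBPS_PER_LANE          # ~4000 MB/s

# The card can only move data as fast as its slower side, before overhead.
print(f"SAS side:  {sas_side:.0f} MB/s")
print(f"PCIe side: {pcie_side:.0f} MB/s")
print(f"Limit:     {min(sas_side, pcie_side):.0f} MB/s")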
Anyhow, the disks (I don't know what model you have) are likely your bottleneck when it comes to raw speed. How much speed are you looking to get?
Can I use multiple SAS HBAs connecting to the single JBOD to increase overall SAS bandwidth?
I would say this is not even worth thinking about because the disks will be too slow to fully saturate the speed of a single HBA. The disks are always the slow spot which is why we use many of them to share the workload.
The more you can say about how you want to use this, the more helpful we can be.
 
Joined
May 10, 2017
Messages
838
The link to the manual is not working, so I'm going to reply more generically:

Can I use multiple SAS HBAs connecting to the single JBOD to increase overall SAS bandwidth?

Not likely; when multiple HBAs are supported, it's usually for fail-over only.

What is the SAS bandwidth between the HBA and the JBODs, is it 6Gb/s per port? Or more?

Assuming the JBOD enclosure is SAS2 and the disks are SAS2/SATA3:

2400MB/s with a single link, or 4800MB/s with a dual link if the JBOD supports it; accounting for protocol overhead, that is about 2200MB/s and 4400MB/s usable, respectively. If using a dual link, the bottleneck will be the PCIe 2.0 HBA, which in my experience has considerable overhead: of the 4000MB/s theoretical maximum bandwidth, you can usually get between 2500 and 3000MB/s.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
PS. If you go with the solution I suggested above:
Code:
If you used 6 drives per vdev in RAID-z2, your pool could have a capacity of about 112TB.
You should see enough speed from the system to fully saturate the 10Gb network link.

Naturally, I am making some big guesses about the performance of the drives because I don't know what model drive you are using.

Your mileage may vary, but I had some overhead in my calculation too. It worked out to 1653MB/s to the drives.
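One rough way to sanity-check that (the per-disk streaming figure below is just an assumed number for a 7200rpm SAS drive, not a measurement of your disks):
Code:
# Very rough streaming-write estimate for 8 x 6-wide RAID-z2 vdevs.
VDEVS = 8
DATA_DISKS_PER_VDEV = 4        # a 6-wide RAID-z2 vdev has 4 data disks
PER_DISK_MB_S = 100            # assumed streaming rate for a 7200rpm SAS drive
TEN_GBE_MB_S = 10_000 / 8      # ~1250 MB/s line rate for a 10Gb link

pool_estimate = VDEVS * DATA_DISKS_PER_VDEV * PER_DISK_MB_S
print(f"Pool streaming estimate: {pool_estimate} MB/s")    # 3200 MB/s
print(f"10GbE line rate:         {TEN_GBE_MB_S:.0f} MB/s") # well below the pool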
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
It might not be totally tested to work long term, but those ZEUS devices are badly outclassed in performance and price by Optane SSDs.

https://www.servethehome.com/exploring-best-zfs-zil-slog-ssd-intel-optane-nand/

Once again, even though it hasn't had the extensive testing background of DRAM-based ZIL devices, I would still strongly consider it.
The OP indicated that they are planning to repurpose hardware that they already have, so there would be little or no cost involved.
 

elliotpea

Cadet
Joined
Jul 28, 2017
Messages
7
Thank you for your replies. Yes, correct: we don't wish to purchase anything else, so it is very much a case of getting the best out of what we have.

The workload is going to be mostly random I/O: small writes over ssh/rsync, plus local 'cp' and 'rm' commands. Our primary backup application uses rsync with hardlinks to perform differentials. We're a web host, so we're dealing with millions/billions of small files in these backups. This results in a high IOPS requirement on the backup server, as it has to 'cp' a snapshot. We retain 90 days of daily snapshots.

Edit: I should clarify that the commands are executed locally on the FreeNAS server over SSH; we don't mount these disks remotely using NFS or iSCSI. These are local file operations.
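For context, each nightly run boils down to something like the sketch below, whether done with rsync's --link-dest or a hardlink copy of the previous snapshot first; the paths and host here are illustrative, not our real layout:
Code:
import subprocess
from datetime import date, timedelta

# Illustrative placeholders, not real paths or hosts.
src = "backupuser@webserver:/var/www/"
base = "/mnt/tank/backups/webserver"
today = date.today().isoformat()
yesterday = (date.today() - timedelta(days=1)).isoformat()

# Files unchanged since the previous snapshot become hardlinks into it,
# so only changed files consume new space; the metadata churn is what
# drives the IOPS requirement.
subprocess.run(
    ["rsync", "-a", "--delete",
     f"--link-dest={base}/{yesterday}",
     src,
     f"{base}/{today}/"],
    check=True,
)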

The result is that we don't need much disk space: at the moment we rent a server with 12 locally attached 6TB disks in mirrored pairs, and total utilisation with lz4 is 21TB. This is with nearly 90 daily snapshots of all systems.

https://www.dropbox.com/s/r6iswtntcokvcbn/Screenshot 2017-12-19 10.24.02.png?dl=0

Some disk stats:

https://www.dropbox.com/s/wojlidpj113lelk/Screenshot 2017-12-19 10.26.16.png?dl=0

So this really needs to be a mix of performance and capacity. Up to 48TB usable space will do us OK for the future, leaving room to expand just in case, but really it's about having the IOPS to finish the backups in a reasonable timeframe. At the moment it takes around 7 hours to process them all, and I'd like to get this down.

So perhaps I should look at more spindles? Are mirrored pairs best for this, or should we go a different way? Ideally I'd like to get this done within 24 disks in one JBOD, as we are having to purchase the disks for this, but if more are recommended, or 2 JBODs will provide better performance, I'm fine with that.

A working link to the pdf for the JBOD:

https://s3.amazonaws.com/cdn.freshdesk.com/data/helpdesk/attachments/production/6001246981/original/DNS 1600 User Manual QS0002.pdf?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAJ2JSYZ7O3I4JO6DA/20171219/us-east-1/s3/aws4_request&X-Amz-Date=20171219T103319Z&X-Amz-Expires=300&X-Amz-Signature=24d18bc7e79f753ccf8d850818c1feb94197947006becdf85406d8aabc5eccc9&X-Amz-SignedHeaders=Host&response-content-type=application/pdf

Thanks,

Elliot
 
Last edited:
Joined
May 10, 2017
Messages
838
A working link to the pdf for the JBOD:

Still not working, but for the workload described I believe a single x4 SAS link would suffice, even if it's SAS1 (1200MB/s); your main bottleneck is most likely going to be the pool IOPS.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
So perhaps i should look at more spindles? Are mirrored pairs best for this or should we go a different way? Ideally i'd like to get this done within 24 disks in one JBOD, as we are having to purchase the disks for this, but if more are recommended, or 2 JBODs will provide better performance, i'm fine with that.
Like Johnnie Black said, the slow part of the whole system is the mechanical access at the drive, but I would still use two SAS cables between the server and the JBOD. You are probably being held up by the low drive count in the rental system (only 12 drives) because of the kind of data: large files are handled quickly, but small files always create a problem. Given the workload you describe, having more drives would make the process faster, because more drives give more potential IOPS. It sounds like you are aware of that already, and yes, mirrors give more vdevs (virtual devices), and in ZFS, more vdevs give more IOPS. The rental system with 12 drives has 6 vdevs; when you move to the re-purposed system you could have twice or even four times as many vdevs, and potentially twice or more the IOPS.
I was thinking that you were re-purposing equipment where the drives were still installed. Having to buy new drives makes your position a little more difficult. I would start with one JBOD (24 drives in 12 mirrored vdevs) and see how that goes; if you need more IOPS, you can always expand the pool by adding the second JBOD to the server for another 12 vdevs. The thing to watch is disk utilization vs network utilization. With a 10Gb pipe into the server, at some point the network will become the choke point. Small files slow file transfer, but with enough disks (vdevs) you should be able to get the data to disk fast enough to fill that network pipe. I have a server in my shop with 60 drives and I still struggle to get fast access to small files, but we didn't use mirrored vdevs because we needed a large volume of storage. We will probably double the drive count on the next system to have both the storage volume and the IO.
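To put rough numbers on that vdev scaling (the per-vdev figure is an assumption for a 7200rpm mirror pair, not a benchmark of any particular drive):
Code:
# Rough random-write IOPS scaling with mirrored vdev count. For random
# writes, a 2-way mirror delivers roughly one drive's worth of IOPS.
IOPS_PER_MIRROR_VDEV = 100   # assumed figure for a 7200rpm SAS mirror

for vdevs in (6, 12, 24):    # rental box, one JBOD, two JBODs
    print(f"{vdevs:2d} mirrored vdevs ~ {vdevs * IOPS_PER_MIRROR_VDEV} random write IOPS")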
 
Last edited:

elliotpea

Cadet
Joined
Jul 28, 2017
Messages
7
Like Johnnie Black said, the slow part of the whole system is the mechanical access at the drive, but I would still use two SAS cables between the server and the JBOD. You are probably being held up by the low drive count in the rental system (only 12 drives) because of the kind of data: large files are handled quickly, but small files always create a problem. Given the workload you describe, having more drives would make the process faster, because more drives give more potential IOPS. It sounds like you are aware of that already, and yes, mirrors give more vdevs (virtual devices), and in ZFS, more vdevs give more IOPS. The rental system with 12 drives has 6 vdevs; when you move to the re-purposed system you could have twice or even four times as many vdevs, and potentially twice or more the IOPS.
I was thinking that you were re-purposing equipment where the drives were still installed. Having to buy new drives makes your position a little more difficult. I would start with one JBOD (24 drives in 12 mirrored vdevs) and see how that goes; if you need more IOPS, you can always expand the pool by adding the second JBOD to the server for another 12 vdevs. The thing to watch is disk utilization vs network utilization. With a 10Gb pipe into the server, at some point the network will become the choke point. Small files slow file transfer, but with enough disks (vdevs) you should be able to get the data to disk fast enough to fill that network pipe. I have a server in my shop with 60 drives and I still struggle to get fast access to small files, but we didn't use mirrored vdevs because we needed a large volume of storage. We will probably double the drive count on the next system to have both the storage volume and the IO.

Thank you - this sounds like a plan. I'll probably end up doing 2 x 10Gbps LACP, as we have 2 core switches. I'll fill out this JBOD with 24 x 4TB drives in mirrored vdevs and see how we go. As you say, it's double what we have now, and we don't have any ZIL currently.

With regards to the 'two SAS cables' suggestion, to clarify: can I run 2 SAS cables from one HBA to one JBOD (one into each controller on the JBOD)? Will this double the bandwidth available?

Additional question: do I put the ZIL and the L2ARC on the spare HBA internally, making use of the additional PCIe bandwidth?

Thanks again,

Elliot
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
With regards to the 'two SAS cables' suggestion, to clarify: can I run 2 SAS cables from one HBA to one JBOD (one into each controller on the JBOD)? Will this double the bandwidth available?
No, that gives you redundancy. I still have not been able to look at the documentation for the JBOD; the link doesn't work for me. It may be that it only works internal to your organization. If you have the document, you could try putting it on the Dropbox site and adding a link.
I have worked with several different models of JBOD and they are all different in their capabilities. I have one JBOD that only has a single controller, but you can run two links from the SAS HBA to the controller to increase the bandwidth. It basically has two ports marked 'in' and a port (what I call the cascade port) marked 'out' to connect another JBOD. I also have some with dual controllers (for redundancy), where each controller has a port marked 'in' and another marked 'out' so that you can cascade to another JBOD from the out port, but the second controller only takes over in the event that the primary controller fails. In the system where that is set up (a Sun/Oracle multi-rack deal), there are two servers, one cabled to each JBOD controller, so if one server fails, the other server can grab the same disks and go.
If you only have one port marked 'in', you may as well just use one SAS cable between the HBA and the JBOD.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Additional question: do I put the ZIL and the L2ARC on the spare HBA internally, making use of the additional PCIe bandwidth?
I am actually not entirely sure you need the ZIL and L2ARC. The theoretical transfer speed of the disk pool we have discussed should be faster than the interface speed of the drives you have for ZIL and L2ARC, so they could actually introduce a bottleneck. The pool might have more latency, though. If you have the opportunity to do some testing, I would suggest trying it each way (with and without) to compare the performance, and let us know what you find out.
From what I can see, it has 2 ports for IN per controller, so 4 per JBOD in total.
The link speed of a single cable is theoretically faster than the aggregate speed of the drives you would have in the chassis, so you would probably see no speed benefit from having two cables; I just hate to rely on only one of something when I could have two.
 

elliotpea

Cadet
Joined
Jul 28, 2017
Messages
7
Ok, thank you again for your advice. I'll be starting on this early Jan and will report back after some testing!
 

diehard

Contributor
Joined
Mar 21, 2013
Messages
162
If writes are synced, he absolutely needs a SLOG. Performance will crawl without one.

All ZFS pools have a ZIL.

It's possible, depending on the scenario, that using L2ARC could lessen your performance, but with 96GB of RAM I'd still probably give it a go.

Backplane bandwidth will almost certainly not be the bottleneck; pool configuration and fragmentation will.
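If you want to see how the pool will treat sync writes before deciding on a SLOG, check the dataset's sync property (the dataset name below is just a placeholder):
Code:
import subprocess

# "standard" honours application sync requests, "always" forces every
# write through the ZIL, "disabled" ignores sync requests entirely.
subprocess.run(["zfs", "get", "sync", "tank/backups"], check=True)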
 