Pool Setup / Recommendations

kjparenteau

Explorer
Joined
Jul 13, 2012
Messages
70
I am trying to set up a new server, but maybe I am missing something obvious. If I have (12) 6TB disks, that's 72TB raw. Run in RAIDZ-2, that should end up with 60TB, minus a little for overhead, right? TrueNAS calculates 54TB during the creation process, but when the pool creation is finished, I end up with 48TB, which is the same as what the RAIDZ-3 option calculates... But if I actually create a RAIDZ-3 pool, I end up with a 46TB pool. Now I am unsure what kind of setup I should be running. So here are my questions and my hardware. I would love some answers to these questions as well as some recommendations for a solid setup. Thanks in advance!

OS Version: TrueNAS-12.0-U2.1
Model: PowerEdge R720xd
CPU: (2) Intel Xeon CPU E5-2637 v2 @ 3.50GHz
Memory: 384 GB
Install Disks: (2) Samsung 860 EVO SSDs (installed in the rear flex bays)
Storage Disks: (12) 6TB Enterprise 7200RPM SATA HDDs
Networking: 10GbE
Usage: SMB Share, PLEX Jail, iSCSI target for VMware

MAIN QUESTIONS
1. Can someone please explain the storage situation I outlined above better?
2. What would the recommended pool setup look like to obtain the best storage/performance? I'd like at least 2 disk tolerance.
3. I have read about using mirrored vdevs instead of RAIDZ, but that seems to give up a lot of storage in exchange for what sounds like a slight performance gain; perhaps you can explain further.

EXTRA QUESTIONS
4. Running the OS on the (2) Samsung EVO SSD's, is there any real benefit aside from redundancy?
5. Should I run the OS on a single disk and use the other as SSD storage cache?
6. Considering the large amount of RAM, would running an SSD storage cache disk be of any real benefit? (Note: I have 256GB, 500GB, and 2TB SSDs available for cache disks if installing to a single disk and using the other slot as a cache disk is best.)
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
the best storage/performance?
How are you defining performance? If you want to use it as block storage like you mention, you will want the most IOPS available to ensure "performance" (of course at the expense of capacity). For plex, you could get away with RAIDZ2 and give yourself maximum capacity with good throughput (performance, without worrying about IOPS specifically as media files are large and sequentially accessed).

Have a read of this:

Also:
Model: PowerEdge R720xd
Depending on the storage option you took when buying this, you may have ended up with a RAID controller... make sure to read this:

1. Can someone please explain the storage situation I outlined above better?
See if putting it in here helps:
Also remember the difference between the drive being reported in TB and TrueNAS telling you in TiB. Those aren't the same unit.
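To put rough numbers on it (back-of-the-envelope only, ignoring metadata, padding and reserved space):

6 TB per drive = 6,000,000,000,000 bytes ≈ 5.46 TiB
RAIDZ2 of 12 drives = 10 data drives ≈ 10 x 5.46 TiB ≈ 54.6 TiB (the "54TB" the wizard shows)
RAIDZ3 of 12 drives = 9 data drives ≈ 9 x 5.46 TiB ≈ 49.1 TiB

The figures shown after creation are lower again because ZFS subtracts allocation overhead and internal reservations from what it reports as usable.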

4. Running the OS on the (2) Samsung EVO SSD's, is there any real benefit aside from redundancy?
Little to none.

5. Should I run the OS on a single disk
Yes

use the other as SSD storage cache?
Maybe...
6. Considering the large amount of RAM, would running an SSD storage cache disk be of any real benefit?
Possibly not... how big is your working set?

Maybe you would benefit from SLOG though for your iSCSI workload... check the path to success post above.
 

kjparenteau

Explorer
Joined
Jul 13, 2012
Messages
70
First, thank you for your response! Here's some answers to your questions, plus I have some additional questions based on your responses.

How are you defining performance? If you want to use it as block storage like you mention, you will want the most IOPS available to ensure "performance" (of course at the expense of capacity). For plex, you could get away with RAIDZ2 and give yourself maximum capacity with good throughput (performance, without worrying about IOPS specifically as media files are large and sequentially accessed).

Have a read of this:
I guess my question comes down to my lack of understanding of how to set up this mirrored vdev storage. If I did it this way, am I essentially cutting my available storage in half, so 72TB would give me 36TB usable? Would I select 2 disks and mirror them, so I'd have 6 different mirrored sets, then combine those into a single vdev? Can I still use this as storage for Plex, SMB share, and iSCSI at the same time? Is the performance going to be that noticeable outside of the block storage?

Depending on the storage option you took when buying this, you may have ended up with a RAID controller... make sure to read this:
I appreciate the catch on this; however, I have already dealt with this exact scenario in the past and did a firmware conversion of the HBA to LSI 2008 "IT mode", and it reads all of the disks without issue.

See if putting it in here helps:
Also remember the difference between the drive being reported in TB and TrueNAS telling you in TiB. Those aren't the same unit.
I do understand the size reporting, since I am a long-time FreeNAS user, so I guess the bigger question at hand is why RAIDZ-2 ended up at 48TB and RAIDZ-3 ended up at 46TB. It seems there's some RAIDZ overhead or something I am not quite understanding, considering all the drives are 6TB.

Little to none.
Yes
Maybe...
Possibly not... how big is your working set?
Maybe you would benefit from SLOG though for your iSCSI workload... check the path to success post above.
So why maybe? I know I have a ton of RAM on this system which acts as ZFS cache by default. I don't have any active workloads that necessarily exceed that space, so I am not sure if it is truly necessary. I would assume that the RAM cache would be first in line, then the SSD cache would be the second hop (so the 2TB disks might be better here) and then flushed down to the spindle disks, correct? And in reverse order when files are read and placed into the hot tier? Would love some additional input as to why it might or might not be beneficial to change the OS from a mirrored set of redundancy and run one of the disks as a cache.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
I would assume that the RAM cache would be first in line, then the SSD cache would be the second hop (so the 2TB disks might be better here) and then flushed down to the spindle disks, correct? And in reverse order when files are read and placed into the hot tier?
I'm not sure it works exactly how you think.

If you have nothing much running, your server will have about 350GB of RAM to use for ZFS ARC (in-memory cache). The last 350 GB read or written since you booted will be there, so no need to get it from the disks. So if your working set (the files you use regularly) is smaller than 350GB, I would expect after a little time of use (a few days after a reboot), more-or-less everything you read will come from ARC.
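If you want to see how well that's working after the box has been up for a while, you can check the ARC from a shell (assuming the arc_summary tool is present on your install; the raw counters are always available via sysctl):

arc_summary | head -40
sysctl kstat.zfs.misc.arcstats.size kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses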

Writing files is quite different to the "tiering" you're talking about.
Check out this article which explains with some diagrams how ZIL works (and the rest of ZFS... the link is to page 3 of the article): https://arstechnica.com/information...-understanding-zfs-storage-and-performance/3/

Essentially all writes go to RAM, but sync writes will not return to the requestor until they're also in the ZIL, so you either need a pool that can handle the IOPS you will be using or put a SLOG (an alternate location for the ZIL) in the middle... although that doesn't get you out of jail if your pool IOPS are insufficient; it will just buy you a few seconds of peak performance before everything slows to a crawl waiting for the pool to catch up. If you're just running a lab where the VMs can crash occasionally, just force async writes and enjoy RAM-speed writes.
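If you do go the "it's a lab, async is fine" route, that's just a property on the zvol; a quick sketch from the CLI (pool/zvol names here are placeholders):

zfs set sync=disabled tank/vmware-lab   # all writes treated as async; fast, but the last few seconds are lost if the box crashes
zfs get sync tank/vmware-lab            # confirm the current setting

You can switch it back to standard or always at any time.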

Would love some additional input as to why it might or might not be beneficial to change the OS from a mirrored set of redundancy and run one of the disks as a cache.
Unless you're using hardware mirroring for the boot device, a boot device failure will still cause a kernel panic... rebooting may be OK or not depending on your BIOS.

The boot pool (when on a good SSD) rarely provides a source of error, so just keep your config backed up and let it fail if it will. If you're really unlucky, you can quickly reinstall and restore the config with little trouble.
 

kjparenteau

Explorer
Joined
Jul 13, 2012
Messages
70
You are providing some great information, and it is MUCH appreciated. I have a lot of learning to do about the "under the hood" aspect of things even though I have been a long time user of FreeNAS, but those were much smaller systems. :)

Considering my usage will be... SMB File Share, PLEX Jail, iSCSI target for VMware

CACHE QUESTIONS
1. Since there are 2 options, "Cache" and "Log", would it be better to run the OS from USB flash media or the Dell Mirrored SD Cards and leverage one SATA SSD for Cache and a separate SATA SSD for Log? If so, what sizes of SSD for each? (250GB, 500GB, or 2TB)
2. If I should keep the OS install on an SSD, would Cache or Log be recommended over the other?

DATA STORAGE QUESTIONS
1. Should I run RAIDZ-2 to obtain larger storage space considering my usage if I run the Cache / Log SSD's?
2. Or would it still be better to do the mirrored VDEVs? (Need to figure out how to actually set it up right though...)

Ultimately I am not seeking crazy performance numbers. I am seeking a good balance between redundancy, capacity, and performance for all the usage scenarios I mentioned. The VMware Lab will not be huge by any means, and the workloads it will support will not be huge in the IOPS department. I don't really want things to crash or lose connectivity of course, but I don't need the Lab to be crunching serious numbers or anything.

Thank you again for your advice and support. It is GREATLY helpful and appreciated!
 

kjparenteau

Explorer
Joined
Jul 13, 2012
Messages
70
It looks like I figured out the mirrored vdev setup. I ended up with just shy of 32TB. Is the setup below correct? I understand it will not matter much for the SMB share and Plex, but will this really provide that much of a difference in VMware vs running RAIDZ-2? I guess the real question is... is the cost of 16 or 20TB really worth it? Thanks!

ZFS
  MIRROR
    da1p2
    da0p2
  MIRROR
    da3p2
    da2p2
  MIRROR
    da5p2
    da4p2
  MIRROR
    da6p2
    da7p2
  MIRROR
    da8p2
    da9p2
  MIRROR
    da10p2
    da11p2
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
I guess my question comes down to my lack of understanding of how to set up this mirrored vdev storage. If I did it this way, am I essentially cutting my available storage in half, so 72TB would give me 36TB usable? Would I select 2 disks and mirror them, so I'd have 6 different mirrored sets, then combine those into a single vdev? Can I still use this as storage for Plex, SMB share, and iSCSI at the same time? Is the performance going to be that noticeable outside of the block storage?
2 (or even 3 for better safety) disks per mirror vdev, and then (up to) 6 vdevs into a pool. This pool could be used for everything, with different datasets for Plex, SMB and iSCSI. But unless you have a high number of users, only iSCSI would benefit from this setup.
The best option would be a RAIDZ2 pool for Plex and SMB, and a mirrored pool for iSCSI. For instance, if your iSCSI needs are 6 TB or less, 2*(2*6 TB) would give that with <50% occupancy, which is what is recommended for performance in block storage. Then put the remaining drives in a single RAIDZ2 vdev, in another pool for storage (8*6 TB, 36 TB after parity, 75-80% of which could be filled).
With 384 GB RAM, an L2ARC is unlikely to be useful. Only the block storage pool, with sync writes, would benefit from a SLOG. Capacity is not important for a SLOG (it only needs to hold one transaction group, i.e. a few seconds of writes), but it should have power loss protection; Optane is ideal, and regular SSDs used as SLOG are a liability to the pool.
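Purely to illustrate the layout (disk and pool names are placeholders, and in TrueNAS you would build this from the pool wizard in the GUI, which also handles partitioning and swap for you):

zpool create vmpool mirror da0 da1 mirror da2 da3
zpool create storage raidz2 da4 da5 da6 da7 da8 da9 da10 da11

Two mirror vdevs striped together for the block storage pool, and one 8-wide RAIDZ2 vdev for the bulk storage pool.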
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
SMB Share, PLEX Jail, iSCSI target for VMware
The challenge here is that those are very different workloads, requiring very different solutions. @Etorix has an excellent suggestion of using two different pools for these purposes - a large RAIDZ2 pool of eight drives for the SMB and Plex data, and a smaller 4-drive 2x2 mirror for the iSCSI ZVOL, assuming that your VMware data will fit inside that smaller pool.

Regarding "is it worth it to run mirrors over RAIDZ2" - that really depends on what level of performance you're wanting out of your VMware environment. While you've got a huge amount of read cache in the form of that 384GB of RAM (and even more SSD if you added L2ARC) your writes will still need to be committed back to the pool vdevs, and those being spinning disk will end up slowing towards that speed under heavy workloads.

The concept of "synchronous" or "safe" writes for your VMware iSCSI ZVOLs also needs consideration; without them, your most recently written data is potentially at risk if there is an unexpected power loss or other hardware failure that causes your TrueNAS system to crash. The Optane devices are a popular choice for good reason (fast, high endurance), but other options include NVRAM devices like the Radian RMS-200 (basically infinite endurance). You'll also need to force the ZVOLs to be written synchronously by setting sync=always.
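That last part is a one-line property change once the ZVOL exists (the zvol name below is a placeholder), and it can be toggled without recreating anything:

zfs set sync=always vmpool/vmware-zvol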

1. Since there are 2 options, "Cache" and "Log", would it be better to run the OS from USB flash media or the Dell Mirrored SD Cards and leverage one SATA SSD for Cache and a separate SATA SSD for Log? If so, what sizes of SSD for each? (250GB, 500GB, or 2TB)
2. If I should keep the OS install on an SSD, would Cache or Log be recommended over the other?

The Dell IDSDM won't have the endurance needed to handle a TrueNAS install. It's really only rated for vSphere, which redirects its scratch/tmp partition elsewhere. If you had one of the dual M.2 BOSS cards that would be an easy choice for a boot device, but don't go scrambling to buy one unless it's exceptionally cheap.

To be blunt, SATA SSDs will be a disappointment in terms of SLOG performance if you're expecting to hook this up to a 10Gbps network. You might be able to get acceptable results out of fast/expensive SAS 12G drives, but you'll probably end up paying more than you would for an Optane or RMS-200 device - although, at the cost of not being able to hot-swap them. But if this is a home system, a little downtime is often acceptable.

One more thing to note - while your Plex media files are already compressed, your general SMB fileshare and your iSCSI ZVOLs will both benefit from the inline compression that ZFS uses (either LZ4 or ZSTD are fine here) so you'll be able to squeeze out a bit more usable space that way.
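Compression is just a per-dataset property (names below are placeholders) and applies to newly written data; the GUI exposes the same options on each dataset:

zfs set compression=lz4 storage/smbshare
zfs set compression=zstd storage/smbshare     # zstd is available on TrueNAS 12 / OpenZFS 2.0
zfs get compressratio storage/smbshare        # shows how much you're actually saving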
 

ChrisRJ

Wizard
Joined
Oct 23, 2020
Messages
1,919
[..] regular SSDs used as SLOG are a liability to the pool.
Nice wording :smile: .

@kjparenteau , what I have done, although with XCP-ng and not ESXi, is to go for local SSD storage (EVO 860, 1 TB) for the virtualization host. I then run an hourly backup to FreeNAS with RAIDZ2 (8 x 16 TB Seagate Exos). I can live with one hour of data loss in a worst-case scenario, and have a nice amount of relatively secure storage.
 

kjparenteau

Explorer
Joined
Jul 13, 2012
Messages
70
First of all, thank you everyone for your replies and EXCELLENT advice and recommendations. I am getting a better understanding of what I will be doing; I just need to figure out the pool arrangement and do some testing between (1) a RAIDZ2 pool for the SMB and Plex data and (1) a mirrored vdev pool for VMware. I am still curious about the performance differences and IOPS between the two pools, so I will likely set up 2 targets (one on each pool) and test a bit. I guess I am still trying to wrap my head around how a 4-disk mirrored vdev pool (2 disks per vdev) would yield better performance than a 12-disk RAIDZ2.

The BOSS adapter is definitely pricey, but what about using a cheaper dual M.2 NVMe adapter ($50 or less) and a couple of 32GB or 64GB drives on it? Use one for Cache and one for Log? I assume even though the SATA SSD would still provide better performance for the logs, using a cheaper M.2 adapter would be even better and allow me to leverage both? Any recommendations on adapters and drives? Thanks!
 

ChrisRJ

Wizard
Joined
Oct 23, 2020
Messages
1,919
I assume even though the SATA SSD would still provide better performance for the logs, using a cheaper M.2 adapter would be even better and allow me to leverage both? Any recommendations on adapters and drives? Thanks!
SLOG with an SSD that is not made for the purpose (i.e. enterprise-grade with power-loss protection) will gain you absolutely nothing. In that case you would be better off with no SLOG at all and just disabling sync writes. Also, please ignore consumer SSDs that claim to have power-loss protection. It will be some kind of protection, but not what you need for an SLOG.
 

kjparenteau

Explorer
Joined
Jul 13, 2012
Messages
70
Makes sense. What about 2 of the SSDs with the adapter listed below? Someone mentioned Optane would be a good fit, and it seems I don't need a huge drive size. Thoughts?

Dual NVMe PCIe Adapter, RIITOP M.2 NVMe SSD to PCI-e 3.1 x8/x16 Card Support M.2 (M Key) NVMe SSD 22110/2280/2260/2242/2230 https://www.amazon.com/dp/B08P57G1J...abc_9RKDP5FNV3MYV8A438AA?_encoding=UTF8&psc=1

Intel Optane Memory M10 16 GB PCIe M.2 80mm
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
That should work. But although 16 GB is enough capacity, a higher capacity would buy you more throughput and more endurance. It also feels a bit silly to spend three times more on the adapter than on the drive itself, and I suspect that the ASMedia chip will add latency, which is not what one wants in a SLOG. If you have no M.2 slot, look for Optane drives in the AIC form factor (consumer 900P/905P, enterprise DC P4800X/P4801X).
 

kjparenteau

Explorer
Joined
Jul 13, 2012
Messages
70
Thank you for the information, Etorix. Much appreciated! Looking at the AIC Optane drives, that is a bit more expensive than I'd like to consider for a home lab scenario. I was hoping for something fairly inexpensive to boost the overall performance. I don't want to lose data per se, but I don't mind a slight recovery time after a crash, or losing the last file, etc. I do have a couple of UPSes covering my gear as well.

So I spent some time doing some testing, and maybe I need to try some different setups, but I found this interesting. The largest difference was response time. Is there a recommended test setup for Iometer? I did a 1GB limit, 32 outstanding I/O per target, 30 sec warmup, 1 min runtime.

Between a mirrored vdev pool and a RAIDZ2 pool, the IOPS and overall throughput seem to be very similar, but I assume the response time is where the greater iSCSI performance everyone talks about comes in? Please let me know your thoughts. Thanks!

ALL DISK RAIDZ-2
TrueNAS - All Disk RaidZ.png

ALL DISK MIRRORED VDEV
TrueNAS - All Disk Mirror.png
 

kjparenteau

Explorer
Joined
Jul 13, 2012
Messages
70
Bump... Any thoughts surrounding the performance differences or even better a recommended set of settings for Iometer to make sure I am testing the right way? Thanks!
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Been busy, apologies for the silence here.

I suspect that the ASMedia chip will add latency, which is not what one wants in a SLOG.
My suspicions as well. Solutions such as the 2-in-1 10Gbps NIC + PCIe M.2 combo cards produced by QNAP show significant added latency in benchmarks, but those devices also "oversubscribe" the PCIe lanes (eg: x8 to board, x4 to each of a 10Gbps NIC and two M.2 slots) so the "interleaving" adds more latency there. Unfortunately I don't believe that bifurcation was ever added to the 12G Dell servers (Rx20) - it's in the R730XD but that doesn't help you out here. A single M.2 to PCIe adaptor card with an Optane 16G or 32G would be sufficient here.

I don't want to lose data per se, but I don't mind a slight recovery time after a crash, or losing the last file, etc. I do have a couple of UPSes covering my gear as well.
The challenge with block/VM storage is that it's not the "last file lost", it's the "last bits lost", and that can have an unpredictable impact on a VM disk or VM filesystem. If those last bits were just heartbeat updates or other unimportant data, no problem. But if it was filesystem metadata or something critical to the VM's stability, you're rolling back to a snapshot at best and restoring the entire contents of the LUN from backup at worst. A single SLOG will make it significantly safer (eg: 99.9%); it just won't protect against the case of "simultaneous crash and SLOG failure". A single SLOG is probably fine for a home use scenario.

The largest difference was response time. Is there a recommended test setup for Iometer? I did a 1GB limit, 32 outstanding I/O per target, 30 sec warmup, 1 min runtime.
The small size of the test means it was completely captured by your ARC on reads, and if you haven't set sync=always your writes were just merrily going into RAM as well. The other challenge is that performance on a clean pool is basically the maximum you'll ever get - data isn't fragmented, there's lots of free space to write to, and with 384G of RAM you're experiencing zero cache contention.

You could try simulating a 0% cache hit rate by setting primarycache=metadata on a ZVOL (I recommend you make a second one for this) and re-running the tests. This would force ZFS to actually have to "go to disk" for the reads at least, but you can already see the difference in the maximum response time of Z2 vs mirror (33ms vs 7ms).
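If you want to try that, it's a single property on the test ZVOL (name is a placeholder), and you can set it back afterwards:

zfs set primarycache=metadata vmpool/testzvol   # ARC keeps only metadata for this zvol, so data reads go to disk
zfs set primarycache=all vmpool/testzvol        # restore the default when finished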
 

kjparenteau

Explorer
Joined
Jul 13, 2012
Messages
70
No worries! Just didn't want this to drop off the radar. I'm eager to get the pools setup so I can start using this thing. :)

A single M.2 to PCIe adaptor card with an Optane 16G or 32G would be sufficient here.
So you are suggesting that adding something like the following as SLOG for the iSCSI pool would be ideal? And if so, I assume I can just add this at a later date as SLOG for the iSCSI pool after I add in the card, correct?

PCIe SSD M2 Adapter
https://www.amazon.com/gp/product/B01FU9JS94

Intel Optane Memory M10 16 GB
https://www.amazon.com/gp/product/B06XSMTN31

The small size of the test means it was completely captured by your ARC on reads, and if you haven't set sync=always your writes were just merrily going into RAM as well.
This makes sense. The testing piece with TrueNAS/FreeNAS is a bit new to me; I've never made a deep dive into performance specs or setup. So when creating or editing a ZVOL, I change Sync to Always, and that forces the writes to disk as they come in, correct? Standard (set by default) brings the writes into RAM first, then flushes them down to disk? I assume this is a setting I can edit at will on a particular ZVOL for testing purposes?

The other challenge is that performance on a clean pool is basically the maximum you'll ever get - data isn't fragmented, there's lots of free space to write to
I do understand that testing in a clean environment isn't necessarily a fair test since fragmentation and having data on the disks will yield different performance metrics between RAIDZ2 and a MIRROR. I guess I am just trying to establish a general baseline here. But this kinda goes back to the test setup with Iometer and making sure I am setting the right parameters.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Edit: Since I'm apologizing already today, apologies in advance for the long post. :grin:

So you are suggesting that adding something like the following as SLOG for the iSCSI pool would be ideal? And if so, I assume I can just add this at a later date as SLOG for the iSCSI pool after I add in the card, correct?
The adapter is fine, but the 16G Optane is going to severely bottleneck that 10Gbps network card. It's really only good for just over 1Gbps of sync-write throughput.

Take a look at the SLOG benchmark thread in my signature: https://www.truenas.com/community/threads/slog-benchmarking-and-finding-the-best-slog.63521/

You'll really want to get an Optane P4801X M.2 (in an adaptor card, but mind the M.2 22110 - 110mm spec!) or perhaps a DC P3700 or Radian RMS-200 card in order to let that 10Gbps network really shine. Yes, they're expensive - but you could sell some of that RAM and a CPU to fund it. (The RMS-200s used to be cheap, until they got popular.)
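And yes, to the other part of your question: a log vdev can be added to (or removed from) an existing pool at any time, so buying the card and drive later is fine. From the CLI it would look roughly like this (pool and device names are placeholders; the GUI's "Add Vdev" flow does the same thing):

zpool add vmpool log nvd0      # attach the device as a SLOG
zpool remove vmpool nvd0       # detach it again if you upgrade later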

This makes sense. The testing piece with TrueNAS/FreeNAS is a bit new to me; I've never made a deep dive into performance specs or setup. So when creating or editing a ZVOL, I change Sync to Always, and that forces the writes to disk as they come in, correct? Standard (set by default) brings the writes into RAM first, then flushes them down to disk? I assume this is a setting I can edit at will on a particular ZVOL for testing purposes?
Some generalization follows:

For reads, your most frequently/most recently used data will end up cached in RAM. Because you have 384GB of RAM, you'll basically (thanks to compression) be able to have the most active 384GB of reads never even touch your disks. They'll come flying through at full RAM-speed, so 10Gbps. Once you grow beyond that (eg: 500GB) of active data, then you have to hit the spinning disks.

For writes, the "standard" process for iSCSI means the incoming data lands in RAM only. So the last few seconds of data written are volatile - if your TrueNAS system crashes, they're gone for good, and VMware won't know that until it goes to read them and gets garbage or an error back. Setting sync=always forces ZFS to ensure the data is on stable (non-volatile) storage before replying back, which closes the gap for data integrity. However, without a Separate LOG Device (SLOG in common parlance) the only "stable storage" is the pool itself - and hard drives do a very poor job of small, random writes. So you add a fast SLOG to regain some of that lost performance.

The thought process is "sync=standard is unsafe, but fast. If safety isn't negotiable, set sync=always to make it safe. Then add an SLOG to make it fast again."

You can edit the sync status on the fly on a ZVOL by ZVOL basis in the GUI. The primarycache value can be changed on the fly as well but requires editing from the CLI.

Two other things that can only be changed at ZVOL creation time are the volblocksize and the sparse property; I want to note these for the additional benefits they offer.

The former is analogous to recordsize in that it's an upper bound on the biggest chunk of data that ZFS will write to a record at once (after compression) - smaller values tend to improve latency under a read/modify/write operation, but hurt overall throughput as there's more overhead for large transfers. The default is 16K, but I've had good results in terms of improved compression with a 32K size.

The latter setting (sparse) allows the ZVOL to only consume as much space as is physically written, rather than the logical space. Since the CoW fragmentation tends to shred a VMFS ZVOL in a hurry regardless of thick/thin disk, you might as well set sparse to save yourself from the pool deciding it's "full" prematurely, as well as ensure that the VAAI UNMAP primitive works properly. With that said, make sure not to oversubscribe the storage on your pool - eg, if you only have 10TB of pool space, don't make three 5T ZVOLs based on a 1.5:1 starting compression ratio. If it bloats up, you run out of actual pool space, and then you get to experience the VAAI STUN primitive, which is a lot less fun.
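From the CLI, creating a ZVOL with both of those set would look something like this (names and size are placeholders; the TrueNAS ZVOL form exposes the same things as "Sparse" and "Block size"):

zfs create -s -V 4T -o volblocksize=32K -o compression=lz4 vmpool/vmware-zvol
# -s = sparse (no refreservation), -V = zvol size; volblocksize is fixed once created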

I do understand that testing in a clean environment isn't necessarily a fair test since fragmentation and having data on the disks will yield different performance metrics between RAIDZ2 and a MIRROR. I guess I am just trying to establish a general baseline here. But this kinda goes back to the test setup with Iometer and making sure I am setting the right parameters.
A single test client often can't saturate the system, so you'd either have to coordinate several concurrent IOmeter tests from different clients/VMs/hosts, or leverage something like vdbench or HCIbench to automate the creation and testing process. Generally speaking though, the overall test working set will need to be larger than the cache size in order to see the impact of different underlying disk configurations (mirrors vs RAIDZ) so in your case you'd need to use a working set of over 500G, or even larger if the test data turns out to compress well. You can use the primarycache=metadata as a way to shortcut that, but then it's simulating a scenario where the cache is never used which is also unlikely.
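As an alternative to juggling several IOmeter clients, fio run from a test VM against a virtual disk on the ZVOL (or from a jail against a dataset) makes it easy to push a working set bigger than ARC; a rough sketch, with the path and size as placeholders:

fio --name=vmtest --directory=/mnt/testdata --rw=randrw --rwmixread=70 --bs=16k --size=600g --iodepth=32 --ioengine=posixaio --runtime=600 --time_based --group_reporting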

Testing a simple system is challenging; a caching one like ZFS or a tiered one like vSAN is often an exercise in frustration. The closer the test case can mimic your actual usage, the better the results.
 

kjparenteau

Explorer
Joined
Jul 13, 2012
Messages
70
Since I'm apologizing already today, apologies in advance for the long post. :grin:
I really appreciate the fantastic information! I did several rounds of benchmark testing (very time consuming for sure!) and I think I found a sweet spot I can work with and be happy with. I made 2 pools: first a RAIDZ2 with (8) disks, which ended up at 30TB for the SMB shares, Plex, etc., and then a mirrored vdev pool with (4) disks made up of 2 separate mirrors. I also added a 256GB SATA SSD (for now) as cache because it did show a significant performance boost in a VM. Granted, as you and a few others have recommended, I will move away from this once I can get a couple of Optane drives installed, but for now I am pleased with the results.

I did try a full (12) disk mirrored vdev with cache and the VM performance was incredible. But I am not quite ready to throw all the stuff in the SMB shares into a Windows based VM just yet. Maybe version 2.0 will be a full shift in that direction. :)

I absolutely will be keeping this thread handy as I continue my work on this system, but so far things seem to be coming along nicely, so thank you all for the fantastic information and suggestions! MUCH appreciated!
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
I also added a 256GB SATA SSD (for now) as cache because it did show a significant performance boost in a VM. Granted, as you and a few others have recommended, I will move away from this once I can get a couple of Optane drives installed, but for now I am pleased with the results.
You're welcome. But I suspect you called "cache" (= L2ARC) what is actually a SLOG. For the safety of your pool, replace that one with an Optane drive or a Radian RMS-200/300 (there's a handful of these on eBay right now) as soon as possible.
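A quick way to confirm which one you actually added: zpool status <pool> will show the SSD under a "cache" heading if it's L2ARC and under "logs" if it's a SLOG. An L2ARC device is harmless if it dies; a SLOG without power loss protection is the part that puts in-flight sync writes at risk.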
 