Budget hardware recommendation with 10 GBit/s

stefan_o

Dabbler
Joined
Jun 20, 2023
Messages
10
Hello,
I have not found any recent thread like this, so I'm starting a new one.
I want to replace a bunch of external hard drives with a NAS as a proper storage solution. Unfortunately the TrueNAS Mini is a bit out of my price range, so I'm most likely looking at used hardware. The NAS will mostly be used as storage for video editing, so high read speeds are needed, while write speed is a low priority.
My requirements:
-4 drives
-10 GBit/s LAN
-Low noise
-Reasonable size (no massive 19" device)
What (used) hardware is recommended? Is an old Dell workstation with a quad-core Haswell Xeon, upgraded with an NVMe drive as cache and a 10 GBit/s Ethernet card, a good choice? Any better ideas?
Best regards
Stefan
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
I don't have specific hardware to recommend. However, a ZFS cache device (aka L2ARC) is not really something you implement without first looking at the amount of RAM the server has.

For example, adding a 1TB L2ARC when you only have 16GB of RAM is not recommended. Generally you max out your RAM before adding an L2ARC device. And even then, the general guideline is an L2ARC of about 5 times your RAM, going up to 10 times if you understand exactly what you want to cache.
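A quick calculator for that rule of thumb (a minimal sketch; these ratios are guidance, not limits that ZFS enforces):

```python
# Rule-of-thumb L2ARC sizing per the guidance above (illustrative only).
def l2arc_size_range_gb(ram_gb: int) -> tuple[int, int]:
    """Return (conservative 5x, aggressive 10x) L2ARC sizes in GB."""
    return 5 * ram_gb, 10 * ram_gb

for ram in (16, 32, 64):
    lo, hi = l2arc_size_range_gb(ram)
    print(f"{ram} GB RAM -> L2ARC roughly {lo}-{hi} GB")
# 16 GB RAM -> L2ARC roughly 80-160 GB
# 32 GB RAM -> L2ARC roughly 160-320 GB
# 64 GB RAM -> L2ARC roughly 320-640 GB
```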

You don't mention HOW you plan to handle redundancy with those 4 drives. The safer option is RAID-Z2, but that uses 50% of the space for redundancy. Speed-wise, mirrors might be best, except that again 50% of the space goes to redundancy.
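For 4 equal drives, the usable-space math works out as follows (a back-of-envelope sketch that ignores ZFS metadata and slop overhead):

```python
# Usable capacity for the layouts discussed above, with 4 x 8TB drives.
def usable_tb(n_drives: int, drive_tb: float, layout: str) -> float:
    if layout == "raidz1":   # 1 drive's worth of parity
        return (n_drives - 1) * drive_tb
    if layout == "raidz2":   # 2 drives' worth of parity
        return (n_drives - 2) * drive_tb
    if layout == "mirrors":  # striped 2-way mirrors
        return n_drives / 2 * drive_tb
    raise ValueError(layout)

for layout in ("raidz1", "raidz2", "mirrors"):
    print(layout, usable_tb(4, 8.0, layout), "TB usable")
# raidz1 24.0, raidz2 16.0, mirrors 16.0 -- the latter two lose 50%
```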

Next, how much space do you need?

Using massive hard drives, like 22TB models, in RAID-Z1 is not recommended. Even in a 2-way mirror, that might be problematic.

Last, it is important that you don't get SMR-type hard disks. Over the last 4 years they became a problem because a certain vendor introduced them into their NAS line of disks without clearly documenting that change. This caused massive problems, especially with ZFS.
 

stefan_o

Dabbler
Joined
Jun 20, 2023
Messages
10
For example, adding a 1TB L2ARC when you only have 16GB of RAM is not recommended. Generally you max out your RAM before adding an L2ARC device. And even then, the general guideline is an L2ARC of about 5 times your RAM, going up to 10 times if you understand exactly what you want to cache.
OK, but for me it is more desirable that a large amount of data is available at good speeds (let's say 500 MB/s) than that a small amount is available at very high speeds (1200 MB/s). Can this be achieved with a large L2ARC (1TB) even with moderate RAM (32GB)?
Next, how much space do you need?

Using massive hard drives, like 22TB models, in RAID-Z1 is not recommended. Even in a 2-way mirror, that might be problematic.
I would reuse the devices I currently use as external USB drives (I'm on a budget); they are 8TB NAS-grade HDDs. I would use 3 of them in RAID-Z1. In the past I had a failed device in a traditional Linux software RAID-5: I just ordered a new one, replaced the defective drive, and the RAID was rebuilt without any problem. While a device was missing I could still access the data, it was just incredibly slow. I assume RAID-Z1 behaves similarly?
Last, it is important that you don't get SMR-type hard disks. Over the last 4 years they became a problem because a certain vendor introduced them into their NAS line of disks without clearly documenting that change. This caused massive problems, especially with ZFS.
I'm aware of that problem (and that, in the past, the use of SMR wasn't even mentioned for some devices); I double-checked whenever I bought drives to make sure I didn't get any of those.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
OK, but for me it is more desirable that a large amount of data is available at good speeds (let's say 500 MB/s) than that a small amount is available at very high speeds (1200 MB/s). Can this be achieved with a large L2ARC (1TB) even with moderate RAM (32GB)?

No, it cannot. We recommend no more than a 10:1 ratio of L2ARC to ARC, and we don't even recommend L2ARC until you have at least 64GB of ARC. On a 32GB system, a 5:1 L2ARC:ARC ratio may be plausible in some circumstances, but that is only 160GB of L2ARC. The L2ARC pointers that the ARC needs to maintain eat a lot of the ARC capacity and starve the system of the ARC needed to generate proper MRU/MFU statistics for the cached data.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
Is an old Dell workstation with a quad-core Haswell Xeon, upgraded with an NVMe drive as cache and a 10 GBit/s Ethernet card, a good choice?
For a few drives, an old desktop with a cheap 10 GbE NIC (e.g. Solarflare) is a reasonable option. But for ZFS you'll want ECC if possible and, in any case, as much RAM as possible—at least 32 GB for 10 GbE.
Older would be cheaper, but could end up limited in RAM capacity, especially if it's not RDIMM. Hardware that is too old may also not be very energy-efficient.

Forget the idea of using an NVMe cache.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
...
I would reuse the devices I currently use as external USB drives (I'm on a budget),
...
Use of USB-attached drives in a ZFS data pool is HIGHLY discouraged, for multiple reasons.
Of course, if you "shuck" them, that works. (Shucking, in case you or another reader does not know, means removing the SATA drive from the USB enclosure and then using the bare drive.)

...
I would reuse the devices I currently use as external USB drives (I'm on a budget); they are 8TB NAS-grade HDDs. I would use 3 of them in RAID-Z1.
...
Drives larger than about 2TB in a single-parity configuration (like RAID-Z1 or RAID-5) are also discouraged. It may work perfectly for you for the entire life of your NAS. But there are statistical indications that such large drives might hit an unrecoverable read error on one or more of the "working" drives during a drive replacement. Thus: unrecoverable data loss (unless you have backups).
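As a rough illustration of why (a sketch, assuming the commonly quoted spec of 1 unrecoverable read error per 10^14 bits for consumer drives, and treating errors as independent; both are simplifications):

```python
# Odds of hitting at least one unrecoverable read error (URE) while
# resilvering, given a URE rate per bit read.
def p_ure_during_rebuild(read_tb: float, ure_per_bit: float = 1e-14) -> float:
    bits_read = read_tb * 1e12 * 8
    return 1 - (1 - ure_per_bit) ** bits_read

# Replacing one drive in a 3 x 8TB RAID-Z1 re-reads roughly 16 TB:
print(f"{p_ure_during_rebuild(16):.0%}")         # ~72% at 1e-14
print(f"{p_ure_during_rebuild(16, 1e-15):.0%}")  # ~12% at 1e-15 (NAS-grade)
```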

Of course, the nice thing about ZFS is that it will tell you which file is impacted, allowing a more graceful restoration if the failed blocks are limited to data and not multiple metadata blocks (aka directory info). ZFS by default keeps 2 copies of metadata, by preference on different disks, even on RAID-Z1.


...
In the past I had a failed device in a traditional Linux software RAID-5: I just ordered a new one, replaced the defective drive, and the RAID was rebuilt without any problem. While a device was missing I could still access the data, it was just incredibly slow. I assume RAID-Z1 behaves similarly?
...
Yes, this is how ZFS' RAID-Z1 would work.


Many of us in the TrueNAS forums are conservative with our NAS builds. We love our data and want it to have the best chance of surviving. This means RAID-Z2, or at times 3-way mirrors, and not using USB for data pool devices.
 

stefan_o

Dabbler
Joined
Jun 20, 2023
Messages
10
No, it cannot. We recommend no more than a 10:1 ratio of L2ARC to ARC, and we don't even recommend L2ARC until you have at least 64GB of ARC. On a 32GB system, a 5:1 L2ARC:ARC ratio may be plausible in some circumstances, but that is only 160GB of L2ARC. The L2ARC pointers that the ARC needs to maintain eat a lot of the ARC capacity and starve the system of the ARC needed to generate proper MRU/MFU statistics for the cached data.
So a very small ARC and a large L2ARC is not a viable option?
For a few drives, an old desktop with a cheap 10 GbE NIC (e.g. Solarflare) is a reasonable option. But for ZFS you'll want ECC if possible and, in any case, as much RAM as possible—at least 32 GB for 10 GbE.
Older would be cheaper, but could end up limited in RAM capacity, especially if it's not RDIMM. Hardware that is too old may also not be very energy-efficient.
That's why I suggested an old Dell workstation with a Xeon processor, to have ECC (but UDIMM, not RDIMM), not just any old PC. More than 32GB of RAM is probably not possible, though. I read that AMD is not well supported on FreeBSD, so Ryzen is not a good option (they at least support ECC on non-G models)?
Of course, if you "shuck" them, that works. (Shucking, in case you or another reader does not know, means removing the SATA drive from the USB enclosure and then using the bare drive.)
This is what I meant, of course. I bought the drives and the USB enclosures separately; that's why I know that these are NAS-grade drives.
Drives larger than about 2TB in a single-parity configuration (like RAID-Z1 or RAID-5) are also discouraged. It may work perfectly for you for the entire life of your NAS. But there are statistical indications that such large drives might hit an unrecoverable read error on one or more of the "working" drives during a drive replacement. Thus: unrecoverable data loss (unless you have backups).

Of course, the nice thing about ZFS is that it will tell you which file is impacted, allowing a more graceful restoration if the failed blocks are limited to data and not multiple metadata blocks (aka directory info). ZFS by default keeps 2 copies of metadata, by preference on different disks, even on RAID-Z1.
I'm not too sure that 2 parity devices really help prevent data loss: a fire, a PSU failure that causes overvoltage, etc. will most likely kill all drives, no matter how much parity you have. For really important data you should have at least one backup at another location. So for a quick restore a single parity drive is enough; for two drives failing there is the external backup.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
So a very small ARC and a large L2ARC is not a viable option?

No, because as I said, you have to have a certain amount of ARC to sustain the L2ARC. In practice this works out such that you want 64GB minimum.
 

stefan_o

Dabbler
Joined
Jun 20, 2023
Messages
10
For others who may visit this thread, this is what I got:
  • Dell PowerEdge T20 with Xeon E3-1225 v3 (Haswell): Got it for less than 100€ used with a 128GB SSD (and 16GB non-ECC RAM)
  • 32 GB ECC RAM: Got it for 70€ used
  • Intel X550-T2 10G Ethernet card: Got it for less than 30€ used (probably a fake card, as the Intel logo is a sticker rather than printed on the PCB), but it works fine at 10G (tested with iperf3), and my hope is that a used fake card is better than a new fake card, because it has already shown it won't fail immediately
In total I spent less than 200€ on the NAS (without any drives; for those I use 3 existing NAS-grade HDDs (WD Red Pro) that were previously in USB enclosures, now in RAID-Z1). When writing I get constant transfer rates of 350 MB/s.
 

ChrisRJ

Wizard
Joined
Oct 23, 2020
Messages
1,919
Overall my impression is that this is not going to work. Video editing is one of the most demanding use cases for a NAS, so 4 HDDs and 10 Gbit/s are simply a mismatch. Please provide more details (e.g. IOPS) about the requirements.

As to the search: this subject has been discussed in abundance over at least the last 3 years. You will find plenty of recommendations.
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
The real problem here, as @ChrisRJ is pointing out, is the need for IOPS. When you are scrubbing through a timeline you're going to struggle. Are you the only user of the system, i.e. 1 client editing video?
 

stefan_o

Dabbler
Joined
Jun 20, 2023
Messages
10
There is only one client doing video editing; all others have no high-speed requirements and use the NAS just as general storage/backup. I tested scrubbing through a timeline: a lot smoother than with the previous solution (the same single HDD connected via USB3). There is almost no noticeable difference compared to an internal SSD.
The highest-bitrate footage I have is 2160p50 ProRes HQ, which is about 1.5 GBit/s. It plays perfectly smoothly, which was not possible before.
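To put those numbers in context, some quick arithmetic with the measured rates (simple division; it ignores protocol overhead and seek latency):

```python
# How many ~1.5 Gbit/s ProRes HQ streams fit through each bottleneck.
STREAM_GBPS = 1.5
print(10 / STREAM_GBPS)              # 10 GbE line rate: ~6.7 streams
print(350 * 8 / 1000 / STREAM_GBPS)  # 350 MB/s pool writes: ~1.9 streams
print(125 * 8 / 1000 / STREAM_GBPS)  # 1 GbE (125 MB/s): ~0.7 streams
```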

Of course you can always build something better, but that will be a lot more expensive. For less than 200€ I think this is a very good setup; I couldn't find any other thread about a high-speed low-budget solution (the only one I found is 10 years old). Even the NAS devices from the two consumer-oriented companies are a lot more expensive with far less power (and proprietary software with horrible bugs; according to reviews, some people lost all their data with a firmware update because the disks could no longer be read and had to be reinitialized).
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
You want to have your working set completely inside RAM, so do your numbers and consider increasing it... Reaching 64 GB would allow you to use L2ARC.

Having multiple vdevs will help you with IOPS.

Edit:
but it works fine at 10G (tested with iperf3)
You have to check its performance under stress/real conditions.
 
Last edited:

rungekutta

Contributor
Joined
May 11, 2016
Messages
146
No, because as I said, you have to have a certain amount of ARC to sustain the L2ARC. In practice this works out such that you want 64GB minimum.
I think it’s time to refresh this general advice, which is often given on this forum, to match current reality. Unless I’m mistaken, modern ZFS needs 80 bytes of RAM for each block in L2ARC. So in a worst-case scenario of *only* 4k blocks, that’s 80/4096 ≈ 2% overhead. I.e. a 32GB L2ARC would in the *worst* case eat roughly 0.7GB of RAM; in reality, depending on workload, likely significantly less. That seems like a pretty good trade-off to me even in a 16GB or 24GB RAM scenario, considering that NVMe L2ARC, albeit much slower than RAM, could still be 1000x faster than hitting the HDDs in a small RAID-Z1 setup like this.
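Putting that estimate in code (a sketch using the 80-byte figure above; the exact per-block header size varies across ZFS versions):

```python
# Worst-case RAM overhead for L2ARC headers, assuming ~80 bytes per
# cached block and a uniform 4 KiB block size (both pessimistic).
def l2arc_ram_overhead_gb(l2arc_gb: float, block_bytes: int = 4096,
                          header_bytes: int = 80) -> float:
    n_blocks = l2arc_gb * 2**30 / block_bytes
    return n_blocks * header_bytes / 2**30

print(f"{l2arc_ram_overhead_gb(32):.2f} GB")   # ~0.62 GB for 32 GB L2ARC
print(f"{l2arc_ram_overhead_gb(256):.2f} GB")  # ~5 GB worst case; real
                                               # workloads with larger
                                               # blocks need far less
```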

My sense is that this general advice of not even considering L2ARC before 64GB RAM comes from a time when a) L2ARC required much more RAM overhead in ZFS, b) NVMe wasn’t a thing and SSDs were much more expensive (vs RAM), and c) it was written from the PoV of enterprise use cases, where dozens of spinning disks anyway stood up quite well against a couple of SATA2 SSDs for L2ARC. As opposed to a modern but small-scale SOHO/homelab scenario, where a 32GB Optane stick for $20 is slower than RAM but still 1000x the IOPS of your spinning array (and even survives a reboot!).

For a long time I ran a 256GB L2ARC with 32GB RAM, and the memory overhead for the L2ARC was on the order of 1GB. So 24-ish GB of ARC, vs 23-ish GB of ARC + 256GB of L2ARC, on a relatively slow RAID-Z2 array. A no-brainer (and demonstrably much better performance!). I have since also maxed the RAM to 64GB.

Time to nuance the message a bit methinks.

Edit: typo and for clarity
 
Last edited:

NickF

Guru
Joined
Jun 12, 2014
Messages
763
There is only one client doing video editing; all others have no high-speed requirements and use the NAS just as general storage/backup. I tested scrubbing through a timeline: a lot smoother than with the previous solution (the same single HDD connected via USB3). There is almost no noticeable difference compared to an internal SSD.
The highest-bitrate footage I have is 2160p50 ProRes HQ, which is about 1.5 GBit/s. It plays perfectly smoothly, which was not possible before.
The concern here is not that what you are doing is infeasible; the issue is that you should shuck the drives and NOT use USB with ZFS.

You're going to have a bad day of data loss if you leave them on USB. You mentioned something about it above, but can you confirm you have shucked the drives?

Also FWIW - I'm glad your build seems to be performing up to your spec. :)
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
My sense is that this general advice of not even considering L2ARC before 64GB RAM comes from a time when a) L2ARC required much more RAM overhead in ZFS, b) NVMe wasn’t a thing and SSDs were much more expensive (vs RAM), and c) it was written from the PoV of enterprise use cases, where dozens of spinning disks anyway stood up quite well against a couple of SATA2 SSDs for L2ARC.

None of this is true. This advice comes from looking at ARC eviction patterns under common workloads and noticing that it is quite common for users to try to add L2ARC and expect magically better results when they should instead have added RAM. With the advent of "low end" server CPUs, starting with Skylake, that supported 64GB, there was no longer a good reason to try to skimp on RAM and get by on 32GB. It is certainly possible to devise workloads where 32GB of ARC is sufficient for L2ARC, but we still have people coming by thinking that they can add 1TB or even 2TB of L2ARC to make up for a memory shortfall.

L2ARC has never "required much more RAM overhead in ZFS". Your primary concern is the presence of data in ARC long enough to accrue useful MRU/MFU statistics to make intelligent eviction choices, and "presence of data in ARC" means that you have to hold actual full blocks in ARC and that you want them to stick around as long as possible. This is accomplished by having lots of ARC. You can rob Peter to pay Paul by having too much L2ARC, which squeezes the ARC, and that has improved a bit over the years, but the primary issue is still the capacity of the ARC.

If you want to argue about ARC sizing: yes, of course you can devise workloads where you can get by with "less", just as it's possible to devise workloads where 64GB would be laughably small. But we're trying to give users a reasonable starting point to set expectation levels. It works out pretty well that, for a NAS on 1Gbps, 64GB of ARC tends to be a good starting point for L2ARC.

NVMe is generally meaningless for L2ARC unless you are doing 10GbE or faster. The line speed of SATA or SAS3 is sufficiently fast for L2ARC use at 1GbE.

Enterprise use cases don't enter into this at all. This is primarily a hobbyist forum and only a few of us are supporting more serious use cases. The enterprise people are usually professionals who read the docs and size their systems accordingly.
 

stefan_o

Dabbler
Joined
Jun 20, 2023
Messages
10
The concern here is not that what you are doing is infeasible; the issue is that you should shuck the drives and NOT use USB with ZFS.

You're going to have a bad day of data loss if you leave them on USB. You mentioned something about it above, but can you confirm you have shucked the drives?

Also FWIW - I'm glad your build seems to be performing up to your spec. :)
Yes, of course. The drives are connected via SATA. Back then I bought the drives and the USB HDD enclosures separately (so they were very easy to disassemble again) because I knew that these drives might be used in a NAS one day. Also, they weren't ZFS before, but exFAT, for maximum compatibility. The biggest task was to copy all the data somewhere else and, after setting up the NAS, copy it all back.

On the L2ARC discussion: 32 GB is the maximum amount of RAM that Haswell supports, so 64 GB is not an option, but an NVMe SSD is. I get read speeds of about 290 MB/s now, which means the 10G network is not the limiting factor. I tried copying a 10GB file two times; the second time I had a transfer speed of almost 900 MB/s (the file was cached by then). This is of course not a typical use case, but in video editing it can happen that multiple files are accessed repeatedly and simultaneously (overlays/transitions etc.) and their combined size could exceed the ARC, so in that case L2ARC could come in handy. I do not plan to add it now, but it would be nice to know.

@jgreco

What I don't get: on the one hand you say more RAM is better, but on the other hand that NVMe is useless on 1G networks. When you are limited to at best 125 MB/s, how does a big ARC help? The only thing I can imagine is latency. But then again, why wouldn't NVMe help in that case? If the numbers by NickF are correct, and 128GB of L2ARC needs about 1GB of RAM, why not 512GB of L2ARC, needing 5GB of 32GB? Why is 32GB of ARC better than 27GB of ARC plus 512GB of L2ARC? I see that there are not many use cases where a massive ARC or L2ARC will improve performance, but for those specific use cases, what's wrong with L2ARC? When it comes to RAM/NVMe speeds, 10G Ethernet is the limiting factor.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
What I don't get: on the one hand you say more RAM is better, but on the other hand that NVMe is useless on 1G networks.

This is simple math. Modern SATA or SAS3 SSDs are capable of 6Gbps or more, and 6Gbps is larger (by quite a bit) than 1Gbps. Even if we worry that a SATA SSD might not reliably hit SIX Gbps, but maybe only three or four if it is an older one, that is still sufficient to fill the 1Gbps.

When you are limited to at best 125 MB/s, how does a big ARC help?

The problem you're dealing with is the ability of ZFS to gather MRU/MFU statistics to make an intelligent eviction decision.

Let's pretend you have an ARC large enough for four ZFS blocks. This is for illustration only, obviously. We're going to look at MFU here, which is Most Frequently Used.

Client reads a file "/pool/file1". ZFS opens "/pool/" (a directory) and reads the contents. Since the directory is less than 1MB, it fits in one ARC entry. The ARC MFU for this block is 1. ZFS then opens "file1" and reads that. The file fits in 1MB, so this is the second ARC entry. MFU count is also 1.

Now let's say that there's a file "/pool/file2", client reads it. MFU count for the directory goes to 2, MFU for file2 is 1.

Now repeat for "/pool/file3". MFU count for the directory goes to 3, MFU for file3 is 1.

So we have ARC with MFU stats of 3, 1, 1, 1.

Because one of these files is read a bunch, "file2" gets read another ten times. So now we have ARC with MFU stats of 3, 1, 11, 1.

There is very little point in evicting a block with only 1 access to L2ARC; 1 access is not a predictor that the block is popular, just that it was accessed. 2 might be meaningful. But we have no 2's. We do have a 3, but it is metadata and therefore maybe not a good choice for eviction. And we desperately want to keep the file2 block in ARC, because it has 11 accesses.

What you're looking for when you look for L2ARC evictions is to pick the blocks with MFU access greater than 1 but still relatively low. These blocks can be well served by L2ARC. You prefer to have the very frequently accessed blocks served out of ARC directly. This should make sense.

The problem that you typically run into is that the access patterns on a hobbyist NAS tend not to favor repeated access to the same files over and over again, which means that lots of MFU counts remain at 1. This makes it hard for the NAS to differentiate which blocks would be most useful to evict to L2ARC. You need to hold the blocks in ARC longer, until hopefully some of them show up as MFU 2 or 3 or whatever, so that you can evict those blocks to L2ARC knowing that they do get used periodically. Otherwise what ends up happening is that you evict effectively random crap out to L2ARC, which just causes thrashing and burns through your SSD endurance.
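A toy model of that eviction choice (purely illustrative; the real ARC is far more sophisticated than a dictionary of hit counts):

```python
# Blocks seen only once are not predictive; very hot blocks should stay
# in ARC; moderately reused blocks are the useful L2ARC candidates.
arc_hits = {"/pool/ (dir)": 3, "file1": 1, "file2": 11, "file3": 1}

def l2arc_candidates(hits: dict[str, int]) -> list[str]:
    # >1 hit: demonstrably reused; <=5 hits: not so hot it must stay.
    return [blk for blk, n in hits.items() if 1 < n <= 5]

print(l2arc_candidates(arc_hits))
# ['/pool/ (dir)'] -- and that block is metadata, so even it is a
# questionable eviction; everything else is a 1, or too hot to evict.
```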

Therefore a larger ARC is often helpful, because it allows the ARC to make better eviction choices. Does that always happen? No. Just like everything with ZFS, you really need to look at your workloads and your stats. But we are often confronted here in the forums with the task of guiding users towards sane choices, and in general, sight unseen and after a decade of helping users and asking them for their stats, 64GB seems like a much more reasonable basement value for ARC than 32GB if you are going to use L2ARC.

If the numbers by @NickF are correct, and 128GB of L2ARC needs about 1GB of RAM, why not 512GB of L2ARC, needing 5GB of 32GB?

I don't know what numbers you're referring to. If it makes you more comfortable, let me repeat that this is general guidance for general hobbyist-oriented use cases. If you really want to dial in on what you need, you MUST run the stats and do the math with an understanding of how the size of the ARC interacts with your workload. Your ARC basically "cannot" be "too big"; that is always a good situation to be in. But your ARC can definitely be "too small", for the reasons outlined above. There is no fixed ratio between these things, because ZFS uses a variable block size, and your configuration and workload are integral to the mechanics of the mechanism. I can tell you that 1TB of L2ARC, with a 70-byte overhead per L2ARC pointer, works out to roughly 1,000,000 records, or 70,000,000 bytes of ARC, if the blocksize is 1MB; with an 8KB blocksize, on the other hand, it's going to be a lot more. Not sure I did the math right. This is what we mean by "workload dependent".
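Checking that arithmetic (it works out, under the stated 70-byte-per-pointer assumption):

```python
# ARC bytes consumed by L2ARC pointers for 1 TB of L2ARC at different
# block sizes, assuming 70 bytes per pointer as stated above.
def ptr_overhead_bytes(l2arc_bytes: int, block_bytes: int,
                       ptr_bytes: int = 70) -> int:
    return l2arc_bytes // block_bytes * ptr_bytes

TB = 10**12
print(ptr_overhead_bytes(TB, 2**20))  # 1 MB blocks: ~67,000,000 bytes
print(ptr_overhead_bytes(TB, 8192))   # 8 KB blocks: ~8.5 GB of ARC
```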

The TrueNAS guide recommends no less than 32GB of RAM and no more than 5x-10x the amount of ARC as L2ARC. These are sensible "will probably work fine" numbers. We typically recommend no less than 64GB here in the forums because it gives you more intelligent eviction choices.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Nuance ahoy!

Current OpenZFS needs 96 bytes of RAM per record in L2ARC, if you're aiming to do the math. So @rungekutta, you weren't far off with your 80-byte estimate (and may well have been correct in prior iterations or branches, such as ZoL / BSD ZFS / Oracle / OpenZFS before persistent L2ARC was merged). But it's not just about the memory footprint of the headers; it's also about the capability of the L2ARC feed thread to find relevant data, as @jgreco summarized.

To build on those posts, another important note is the general behavior of L2ARC as a ring buffer: there is no concept of "MFU" or "MRU" for blocks residing in L2ARC. When the device becomes full, the oldest blocks simply get overwritten to make room for the newest candidates, even if those "old" blocks were potentially better candidates to stay on the SSD.
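In sketch form (a minimal illustration of the ring-buffer behavior, not actual OpenZFS code):

```python
# When the device is full, the write hand wraps around and the oldest
# blocks are overwritten regardless of how useful they were.
class L2ARCRing:
    def __init__(self, capacity: int):
        self.slots = [None] * capacity
        self.head = 0  # next slot to overwrite

    def feed(self, block: str) -> None:
        self.slots[self.head] = block  # evicts whatever was here
        self.head = (self.head + 1) % len(self.slots)

ring = L2ARCRing(3)
for blk in ["hot-A", "hot-B", "cold-C", "cold-D"]:
    ring.feed(blk)
print(ring.slots)  # ['cold-D', 'hot-B', 'cold-C'] -- 'hot-A' is gone
```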

There's some good information in this SNIA PDF on "L2ARC in the Era of NVMe", although it does need to be updated again; four years is a long time in the technical space, and there are now some other knobs we can adjust, such as having L2ARC feed from the MFU tail only, which helps address the "wasted feeds" and "write churn" issues laid out above.

 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
What numbers are you quoting from me? I didn't post any numbers here...
What I don't get: on the one hand you say more RAM is better, but on the other hand that NVMe is useless on 1G networks. When you are limited to at best 125 MB/s, how does a big ARC help? The only thing I can imagine is latency. But then again, why wouldn't NVMe help in that case? If the numbers by NickF are correct, and 128GB of L2ARC needs about 1GB of RAM, why not 512GB of L2ARC, needing 5GB of 32GB? Why is 32GB of ARC better than 27GB of ARC plus 512GB of L2ARC? I see that there are not many use cases where a massive ARC or L2ARC will improve performance, but for those specific use cases, what's wrong with L2ARC? When it comes to RAM/NVMe speeds, 10G Ethernet is the limiting factor.
 