Core/Clock sweet spot

Joined
Dec 26, 2023
Messages
17
I'm currently looking into buying used EPYC Rome, since there's a local price drop in my area (Germany), so I can get a good deal.
After reading through some forum posts and resources, I understand that beyond a certain point adding cores brings no real value, and higher base clocks should be preferred instead. My use case is storage only, no VMs, no apps. The system will store large media files and day-to-day documents. Working directly off the system for video editing is a plus but not a must-have. I'll be starting with 24x 8TB SAS 12G drives, but will need to extend those with external JBODs soon.

Based on that understanding I would go with a 16c CPU (would 8c also be enough?), but there are some options:

- AMD Epyc 7282, 16C/32T, 2.80-3.20GHz -> only four memory channels' worth of bandwidth, and the slowest base clock (~€100)
- AMD Epyc 7302, 16C/32T, 3.00-3.30GHz -> the all-round option (~€150)
- AMD Epyc 7F52, 16C/32T, 3.50-3.90GHz -> clocks higher than the others, but also costs more (~€225)

I'm now looking for the sweet spot between clock speed and core count. I hope my question makes sense?
 

wdp

Explorer
Joined
Apr 16, 2021
Messages
52
A sweet spot really just depends on how you're going to be hitting the server.

How many clients/users?
What's your vdev plan?
How much memory are you aiming for?

What does your edit workload/workflow look like? Multicam? Any complex codecs?
 
Joined
Dec 26, 2023
Messages
17
How many clients/users?
Max. 10, usually around 2-3
What's your vdev plan?
I'm looking at 6-Wide Z2 vdevs, which looks like a very good compromise between performance, efficiency and safety.
How much memory are you aiming for?
Initially I want to deploy 256GB, with the option to extend to 512GB.
What does your edit workload/workflow look like? Multicam? Any complex codecs?
I don't think it's very complex. Everything is ProRes RAW with proxy media in either ProRes or DNxHD, and we're using DaVinci Resolve. Timeline performance is the main concern. Renders will be delivered in H.265, but that isn't used for "live" editing.

So I guess to find the right sweet spot, two questions need answering: do single-threaded workloads (like SMB) scale indefinitely with higher clocks? And which multi-threaded workloads matter for a storage-only system, and up to how many cores do they still see improvement?
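For what it's worth, the drive plan above already pins down the pool math. A quick back-of-the-envelope sketch (my numbers, ignoring ZFS metadata/padding overhead and the usual fill-level guidance):

```python
# Rough usable-capacity sketch for 24 x 8 TB drives in 6-wide RAIDZ2
# vdevs. Back-of-the-envelope only: ignores ZFS metadata, padding,
# and recommended free-space headroom.

DRIVES = 24
DRIVE_TB = 8
WIDTH = 6    # drives per RAIDZ2 vdev
PARITY = 2   # RAIDZ2

vdevs = DRIVES // WIDTH                           # 4 vdevs
raw_tb = DRIVES * DRIVE_TB                        # 192 TB raw
usable_tb = vdevs * (WIDTH - PARITY) * DRIVE_TB   # 128 TB before overhead

print(f"{vdevs} vdevs, {raw_tb} TB raw, ~{usable_tb} TB usable")
```

So the starting point is 4 vdevs and roughly 128 TB before overhead, which also sets the streaming-read parallelism the editors would share.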
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
Your bottleneck is likely to be your network, not the rest of the system. As for the CPU choice, given what you have said so far, any of them would be fine. Remember, you are not processing anything on the TrueNAS system; it is purely storage. You don't even need huge amounts of RAM. The next bottleneck after the network would be your drives and vdev layout. There are some posts in the forum specifically about building a NAS that supports video editing of files on the NAS. Search for them; you should find a few.
 

wdp

Explorer
Joined
Apr 16, 2021
Messages
52
Video editing and media storage are tricky. Especially in shared environments with more than a few clients. So many variables. Typically the network jams up before 24 rust platters do, though.

Yes, having a higher clock will improve your SMB performance. I don't have experience with an Epyc running TrueNAS. I wish I did.
 
Joined
Dec 26, 2023
Messages
17
As for the CPU choice, given what you have said so far, any of them would be fine
Ok, that means I won't see a big improvement going from 3.00-3.30GHz to 3.50-3.90GHz? As far as I know, only SMB would benefit from it anyway.

There are some posts in the forum specifically about building a NAS that would support video editing of the files on the NAS.
Thank you. I'll look for them.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
Ok, that means I won't see a big improvement going from 3.00-3.30GHz to 3.50-3.90GHz? As far as I know, only SMB would benefit from it anyway.
I don't think so; those are minor speed differences for a device that will be idle most of the time, and when SMB does need it, you still have a NIC bottleneck, and the pool layout is very important as well. If you were crunching a lot of numbers, get the faster CPU, but that is not the case here.

But I do think it is best for you to find out who has built a NAS for similar needs and see what they did. I also think copying video files to a workstation to manipulate them is better than trying to do it live on the NAS. But I'm not a video editor either.

If you do build a system designed for live video editing, creating a thread about it and noting the good and bad points would be helpful for others in the future.

Good luck.
 

wdp

Explorer
Joined
Apr 16, 2021
Messages
52
@joeschmuck Video editing as an industry has mostly been on SAN/NAS appliances since Apple's Xsan products in the early 2000s.
At least anywhere the workflow is shared; feature-length or episodic work is most likely over SMB. ATTO Thunderbolt and network products are pretty much ubiquitous in professional video-over-IP environments.

Although we see far more enterprise solutions at that scale: Avid, Quantum, Facilis, SNS, ScaleLogic for ZFS. Arista and Juniper make up the networking almost everywhere I've ever done work. And as a small editing facility, you're probably working on Synology or QNAP because it was off the shelf at B&H and you don't know what an integrator is yet.

With today's cameras all being 4K+ and supporting raw and editing codecs, the demand for solo editors to work off desktop NAS solutions has skyrocketed. I think most of us learned the hard way that G-Tech direct-attached RAID units were a really bad investment.

But you can also work smarter. Throwing hardware at most video production problems doesn't solve as much as people would hope.

I am curious whether the extra clock speed would have an impact on SMB if we eliminated the networking and platter bottlenecks, and what that limit actually is. But something I was always told is that I wanted more clock speed for serving files over SMB. My assumption would lead to a lower core count and higher clock speed for video-editing-centric applications. But damn, there are a lot of variables.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
And as a small editing facilty, you're probably working on Synology or QNAP cause it was off the shelf at B&H and you don't know what an integrator is yet.
I laughed at that because I can visualize it happening.
I am curious whether the extra clock speed would have an impact on SMB if we eliminated the networking and platter bottlenecks.
I honestly do not know. You could reduce the storage bottleneck by going full NVMe with PCIe Gen 5 storage (that is crazy fast) to eliminate the spinning rust and its interface speed limitations. But you still have a NIC bottleneck, and I'm certain you could do some high-speed fiber thing I have never heard of before. Then you get back to CPU clock, core, and hyper-thread count. As I understand it, SMB is a single-thread process, not multi-threaded. This means that the CPU's single-thread speed (clock) is now a factor. I have heard there is something called SMB Multichannel, but that sounds like having multiple NIC connections so you can use the same number of threads: 2 NICs = 2 threads, practically double the speed. I have no idea whether SMB Multichannel is a working thing, or whether TrueNAS supports it.
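For what it's worth, SMB Multichannel is a real Samba feature, toggled by a single global option in smb.conf. A minimal sketch of what it looks like where you control the config directly; the interface names are placeholders, and whether/how a given TrueNAS version exposes this setting is a separate question:

```ini
[global]
    # Lets one SMB session use multiple NICs (or RSS queues on one NIC)
    # in parallel; client and server negotiate the channels.
    server multi channel support = yes

    # Example NICs to advertise; names here are placeholders.
    interfaces = eth0 eth1
```

Note that multichannel spreads I/O across connections, so it mainly helps the network side; it doesn't change how much a single smbd thread can push per core.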

So yes, CPU clock speed could be a bottleneck. In the OP's use case, I personally do not see clock speed being a limitation; even the slowest CPU listed should not become a bottleneck. That is just my experience and gut feeling talking, not factual proof. We would need to set up a system, test, swap out the CPU, test again, and compare.

Here are three links to some benchmark tests; if you scroll down a bit you will see a chart comparing each CPU to others. This is a good way to get an idea how fast/slow these CPUs are compared to others on the market. But keep scrolling until you find the chart for Single Thread Rating; that is the one that matters for SMB. The numbers are not as impressive sounding as the multi-core tests.

 
Joined
Dec 26, 2023
Messages
17
So yes, CPU clock speed could be a bottleneck. [...] We would need to setup a system, test, swap out the CPU, test again and compare. [...] But keep scrolling down until you find the chart for Single Thread Rating, that is important for SMB. The numbers are not as impressive sounding as the multi-core tests.
You are totally right. Going EPYC is also mostly a decision for more PCIe lanes and higher max RAM.
But as you pointed out, another option would be Threadripper; however, those are not available as cheaply on the used market yet.

As you pointed out, we would need real-world SMB tests; I thought maybe we already have those, given how many active people are here.
For example, if SMB is significantly faster on a Threadripper, it might be worth the additional cost. However, if the performance difference isn't visible day to day and is only synthetic, I agree that it really doesn't matter.
I also think we don't necessarily need NVMe; once storage scales to multiple hundreds of disks, the disks shouldn't be a bottleneck anymore.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
as soon as a Storage scales to multiple hundreds of Disks, Disks shouldn't be a bottleneck anymore.
It is already there. But for your use case, did you want to purchase hundreds of hard drives? My answers were based on your needs, not what Elon Musk can afford.

As you pointed out we would need to have real world SMB tests, I thought maybe we already have those based on many people being here active.
I do not know of a single place this data is compiled but there are people here who I'm sure have tested this out. All I can suggest is Google Searches for "truenas fast smb" or something like that. You may find several of our members who have built systems that you could message and obtain some data.

For example if SMB will be significantly faster on a Threadripper, it might be worth the additional cost. However if those performance difference won't be visible daily and are only synthetic, I also agree that it really doesn't matter.
That speed will likely depend on your network connectivity. Throw a 100Gbit connection at it and see if you can get near those speeds. And then ask: what connectivity speed do you actually need? We all want the fastest, but "need" was the question.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Mmmh... If your only use is going to be storage for a few users (serviced at the same time), why would you pick a CPU that has so many cores?
Something like an E-2134 would be cheaper and more suitable for the high-speed SMB storage of large media your use case seems to require; granted, as others said, you will have to consider networking and HDD I/O limitations, but that's beyond the CPU.

If you are going to edit on it (how many concurrent editors do you want to service??) you will have to either use a crapton of RAM, improve your pool layout, or both.
 
Joined
Dec 26, 2023
Messages
17
why would you pick a CPU that has so many cores?
That's also one of my initial questions: how many cores are really necessary? When (for what use cases) should one go for more cores?
Not enough PCI Express lanes, and only 128GB max RAM: that's why I think going with Epyc is better. Am I wrong?
how many concurrent editors do you want to service?
On average 2 concurrent.
improve your pool layout
Yes, granted, but for now I'm trying to get some good guidelines about core count vs clock speed. There are two directions: very high core count or very high clock speed. If I mapped those two parameters on two axes, I'm looking for the area that makes the most sense for a storage-only TrueNAS.
 

wdp

Explorer
Joined
Apr 16, 2021
Messages
52
Epyc is mostly about lanes. And in some key situations, brute force.

Integrators who spec post-production facilities will sit down and do the math on how many connected clients x stream count are needed to meet the needs and build around that.
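That "clients x stream count" math is simple back-of-the-envelope arithmetic. A sketch using the numbers from this thread; the per-stream bitrate is an assumed ballpark for ProRes 422 HQ at UHD, not a measured figure:

```python
# Back-of-the-envelope clients x streams sizing. STREAM_GBIT is an
# assumed ballpark for ProRes 422 HQ at UHD; real figures depend on
# frame rate and codec flavor.

STREAM_GBIT = 0.75   # assumed per-stream playback bandwidth, Gbit/s
editors = 2          # concurrent editors (from the thread)
streams_each = 3     # e.g. a modest multicam timeline

needed_gbit = editors * streams_each * STREAM_GBIT
print(f"~{needed_gbit:.1f} Gbit/s sustained")

# A single 10GbE link carries roughly 9.4 Gbit/s of payload in
# practice, so that's the headroom figure to check against.
fits_10gbe = needed_gbit < 9.4
```

Two editors with modest timelines fit inside 10GbE; scale either factor up and the network becomes the bottleneck long before the CPU.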
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
That's also on of my initial questions, how many cores are really necessary? When (what are the use-cases) to go for more cores?​
Since SMB uses a single thread per user, when you start servicing more users concurrently you need more cores; with NFS I believe you are not bound to a single thread per user, but the scaling is similar. You also need more cores if you do other things like VMs or apps/jails.
Not enough PCI Express Lanes, and only 128GB max RAM, that's the reason why I think going with Epyc is better. Am I wrong?​
Depends on what you need. I don't think you need a crapton of lanes unless you go M.2 or U.2; in your case I guess you will need enough bandwidth for an HBA and a 10Gbps network card... if your objective is simple storage of media files, 128GB is plenty, and there is also the option of L2ARC (up to 1TB I would say; better to stay around 1:4 or 1:6 ratios of RAM to L2ARC). Do you actually have a lane requirement?
On average 2 concurrent.​
Generally you want your whole working set in RAM; for 2 users I would say your pool layout is fine. I don't think you need many cores for your use case, but rather powerful ones... 4C/8T looks like enough imho, since you are just serving your editors' machines and not actually computing on the NAS.
 

wdp

Explorer
Joined
Apr 16, 2021
Messages
52
A current day post-production build, probably a dual 4th gen xeon with 24 bays and some fast connections, benchmarks at 9 concurrent R3D 8K 7:1 streams and up to 18 4K ProRes HQ streams.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
A current day post-production build, probably a dual 4th gen xeon with 24 bays and some fast connections, benchmarks at 9 concurrent R3D 8K 7:1 streams and up to 18 4K ProRes HQ streams.
I'm confused, what should this be?
 
Joined
Dec 26, 2023
Messages
17
128GB is plenty and there is also the option of L2ARC (up to 1TB I would say, better to stay around 1:4 or 1:6 ratios of RAM to L2ARC)
Interesting. From all the information online I had kind of gathered that it is always better to add more RAM than to add L2ARC.
Do you actually have a lane requirement?
The only lane requirement would come from the storage expanding with additional hard drives and needing more HBAs. I don't want to have to replace the controller once we add more JBODs.

Start servicing more users concurrently you need more cores
Ok, so I can conclude that for a storage-only use case it is "generally" safe to calculate one core per concurrent user. Obviously there is a lower minimum, but beyond that I could map core count to users.


The only topic I still don't fully get is how clock speed scales. Does higher clock speed always mean better performance? Is there an upper boundary where we stop seeing improvements? Or are all "modern" CPUs fast enough not to be the bottleneck?
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Interesting, I kind of gathered with all information online, that it is always better to add more RAM than adding L2ARC.
That's true, but I don't think it's a good idea to go for a more expensive platform that's not really suitable for your needs just because it holds more RAM... the basic advice is to max out the motherboard's RAM first, then add L2ARC; with 128GB of RAM + 500/750/1000GB of L2ARC (depending on the ratio you want; the best for performance is 1:4, but you can go up to 1:8) I don't think you will have an issue... how large is your working set usually? I doubt more than 300GB each, but it's not really my field of expertise... as far as I know you could edit days of 4K video.
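Those ratios translate directly into numbers; a quick sketch using the ratios as stated above:

```python
# L2ARC sizing from the rule of thumb above: keep L2ARC between
# about 4x and 8x RAM (L2ARC headers themselves consume RAM, so
# oversizing it is counterproductive).

ram_gb = 128
l2arc_min_gb = ram_gb * 4   # 1:4 ratio -> 512 GB
l2arc_max_gb = ram_gb * 8   # 1:8 upper bound -> 1024 GB

print(f"L2ARC between ~{l2arc_min_gb} and ~{l2arc_max_gb} GB")
```

So with 128GB of RAM, a single 512GB-1TB NVMe drive covers the whole recommended L2ARC range.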

The only lane requirement would be if the Storage expands with additional hard drives, and the need for more HBA. I don't want to update the controller, once we add more JBODs to it.
I mean, with a X11SSCH-F you would have:
  • 8 SATA ports in the motherboard
  • two M.2 3.0 x4 slots for the L2ARC and something else
  • an internal USB for the boot drive or something else
  • two PCIe 3.0 x8 (one in a x16 slot) for an HBA (either a 9200-16 with 6Gb/s or a 9300-16 with 12Gb/s or a 9305-24i with 12Gb/s) and a 10Gbps card
Which means you wouldn't have space for a SAS expander... you do have the option of the 24i HBA to gain room for one more vdev, but from then on you would have to increase HDD sizes in order to expand. This is the best shot with the E-2100/2200 CPUs; if it's not enough, we can look at other CPU options.

Ok so I can conclude, that for a storage only use-case "generally" it is safe to say you can calculate one core / concurrent user.
With SMB it's one thread per user, I purposefully suggested a CPU with HT in order to leave you with a bit of headroom for future expansion; there is also the 4C/4T version if you are interested.

The only topic I still don't fully get, how does clock speed scales? Does higher clock speed always mean better performance?
In the ideal world, yes; we are limited by network, drives, and clients.
 
Joined
Dec 26, 2023
Messages
17
how large is your working set usually?
If I understand your question correctly, you're asking about the average file size?
That depends on project length, but on average one file is ~500-700GB, and each editing project can include several of those. I don't know exactly how the software loads them; in theory you don't need to load the whole file, only "sections" of it.
This is the best shot with the E-2100/2200 CPUs, if it's not enough we can look at other CPU options.
I don't really think this fits into this topic/thread, but we do generate ~2-4TB of new data per week, so expansion is kinda important.
I think it might be better to open another thread in another category with the current design and get suggestions there.
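As a footnote on that expansion pressure, the growth rate makes for an easy runway estimate. The usable capacity here is assumed to be the ~128 TB the planned 24 x 8TB, 6-wide Z2 layout would give before overhead:

```python
# Rough fill-time estimate: assumed ~128 TB usable from the planned
# 24 x 8 TB / 6-wide RAIDZ2 layout, against 2-4 TB/week growth.
# Ignores existing data and ZFS free-space headroom.

usable_tb = 128
low_tb_week, high_tb_week = 2, 4

weeks_best = usable_tb / low_tb_week    # 64 weeks at 2 TB/week
weeks_worst = usable_tb / high_tb_week  # 32 weeks at 4 TB/week

print(f"Pool fills in roughly {weeks_worst:.0f}-{weeks_best:.0f} weeks")
```

So even at the slow end, the pool fills in just over a year, which supports planning the JBOD/HBA expansion path up front.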
 