Build advice for dealing with large image sequences + 10GbE

Lindsay Horner
Joined
Jun 3, 2016
Messages
7
Hi all. I’m hoping you can help point me in the right direction because I have quite a specific use-case.

I’m a motion graphics designer / 3D generalist, and I work between both Mac and PC. I spend my days dealing with large image sequences (PNG, TIFF and OpenEXR generally - mostly ZIP compression) that can be anywhere from 200KB to 200MB per frame… often totalling tens or hundreds of GB for a whole sequence, which can run to many thousands of frames. This data gets shuffled around between my workstation and 3 render nodes, as well as a render server.

I’ve been using direct-attached storage for working files and Synology/QNAP/OS X network storage for the last few years, but gigabit networking is starting to get me down, and having recently moved to the UK and sold off a lot of my older gear, it’s time for a new build.

I’ve worked on many different SANs and storage systems in different TV/advertising studios over the years and have found that many of them choked on image file sequences. I did recently have quite a good experience* working off an all-SSD 8 bay Qnap with 10GbE and that has encouraged me to go down this route (minus the Qnap). (I’d add that a lot of the issues here have to do with having a lot of users - a problem I won’t have).

So two main questions (with sub questions!):
  • Any ideas on what I can do to speed up the image sequences… I’ll be using SSDs but is this a question of IOPS? Or pure read/write speed? How does compression affect this? Are consumer SSDs OK? Very curious to hear some opinions.
  • I haven’t managed to find any really solid info on the relationship between GHz and 10GbE speeds. Is it all about clock speed? Or is it also the protocol? CPU recommendations most welcome. I’d be happy with 500-700Mbps if that were possible.
I’d note that it will really only be myself and one other user hitting the NAS hard, but there will be constant traffic from the render server as I will have its repository on there too. I will have my working files (the image sequences) on the NAS.

Thanks for reading all of this and thanks in advance for your ideas. Hope I've included all relevant info.
Lindsay
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
I’ll be using SSDs but is this a question of IOPS? Or pure read/write speed?
Both, I would think.
Are consumer SSDs OK?
My first thought would be Samsung 850 Pro, but I'm a Samsung SSD fanboy, so don't pay too much attention. The SSD reviews on anandtech.com include all kinds of benchmarks that might be useful. I bet performance consistency will be important to your happiness, which might lead you to a different brand.

You will most likely want to use striped mirrors, even with SSDs, and more RAM than you ever imagined you would own.
 

AlainD

Contributor
Joined
Apr 7, 2013
Messages
145
Hi

As I understand it, it will be large (>100 GB) sequential writes. The maximum speed over 10GbE will be around 1,000 MB/s (megabytes per second), if both ends can cope with the volume.

In a RAIDZ2 (or RAIDZ3) vdev, the sequential speed is a function of the sum of the speeds of all the data disks. So for a 6-disk RAIDZ2 you get, in theory, 4x the sustainable sequential speed of an individual drive.
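As a rough back-of-envelope sketch of that arithmetic (the per-drive speed is an assumption for illustration, not a measurement):

```python
# Back-of-envelope: theoretical sequential throughput of a 6-disk RAIDZ2
# versus a 10GbE link. Per-drive speed is an illustrative assumption.

disks = 6
parity = 2                      # RAIDZ2: two drives' worth of parity
data_disks = disks - parity     # 4 data disks carry the payload

per_drive_mb_s = 500            # assumed sustained sequential speed of one SATA SSD
ten_gbe_mb_s = 10_000 / 8       # ~1,250 MB/s raw; closer to ~1,000 MB/s after protocol overhead

pool_mb_s = data_disks * per_drive_mb_s   # ~2,000 MB/s in theory

print(f"Theoretical pool sequential speed: {pool_mb_s} MB/s")
print(f"10GbE line rate: {ten_gbe_mb_s:.0f} MB/s")
print("Likely bottleneck:", "network" if pool_mb_s > ten_gbe_mb_s else "pool")
```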
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
it will be large (>100 GB) sequential writes
I figured that with image sequences it wouldn't be pure sequential, because there would be metadata access as each individual image is accessed, so IOPS would be just as important as sequential performance.
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
In a RAIDZ2 (or RAIDZ3) vdev, the sequential speed is a function of the sum of the speeds of all the data disks. So for a 6-disk RAIDZ2 you get, in theory, 4x the sustainable sequential speed of an individual drive.
Are you certain about this? My understanding is that ZFS RAIDZ write performance is limited to the write performance of the slowest disk in the RAIDZ vdev and that the same rule pretty much applies to reads as well, in the real world.

Since mirrors always deliver the most IOPS and fastest performance, @Lindsay Horner will probably be best served using a pool comprised of mirrors, all else being equal.
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
@Lindsay Horner: I know you're in the UK... but something like this 2U 24-bay Supermicro machine, once equipped with 10GbE capability, might be worth considering:

http://www.ebay.com/itm/131833738828
[Attached image: SC216A-R900LPB chassis spec sheet]


Loaded with 24 x 1TB SSDs in mirrored vdevs, it would provide ~12TB of really fast storage. Would that be adequate for your storage needs?

Supermicro has a world-wide presence, so you should be able to obtain their equipment in the UK, but you may not be able to get the good prices we've been seeing on such gear on eBay here in the US.
 
Lindsay Horner
Joined
Jun 3, 2016
Messages
7
Thanks for your replies, everyone.

@Spearfoot Wow, that is quite a setup! I was surprised at the price, less than I might have imagined. It's probably more than I need right now to be honest... I was thinking more along the lines of 8 X 1TB SSDs for a total of 4TB of working storage plus some spinning drives (I already have some 5TB HGST's) for archives etc. In that situation with the SSDs what exactly is the setup you'd recommend with the mirrored vdevs?

@Robert Trevellyan Interesting about the metadata etc, that would explain some of the slow-down I've experienced.
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
@Spearfoot Wow, that is quite a setup! I was surprised at the price, less than I might have imagined. It's probably more than I need right now to be honest... I was thinking more along the lines of 8 X 1TB SSDs for a total of 4TB of working storage plus some spinning drives (I already have some 5TB HGST's) for archives etc. In that situation with the SSDs what exactly is the setup you'd recommend with the mirrored vdevs?
@Lindsay Horner, you've pretty well spec'd your system. It sounds like two pools would best suit your needs: one pool made up of SSDs (4 vdevs of 2 mirrored 1TB drives) would provide ~4TB of really fast storage for rendering and whatnot; another pool of spinning rust for slower archival storage. We can't really give any sensible pool layout advice without knowing what your storage requirements are, but in general, I like a RAIDZ2 pool of 4 or more 'largish' HDDs because it gives you a nice margin of safety; you can lose any 2 drives and still not lose your data. Your 5TB HGST drives would make a nice basis for creating this second pool.
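For a rough feel of the capacity trade-off between the two pools (raw numbers only, ignoring ZFS overhead, padding and the usual free-space headroom; the 6-drive HDD count is just an example, not a requirement):

```python
# Rough usable-capacity comparison, ignoring ZFS overhead and free-space headroom.

def mirrors_usable(drives, drive_tb, way=2):
    """Striped mirrors: one drive's worth of capacity per mirror vdev."""
    return (drives // way) * drive_tb

def raidz_usable(drives, drive_tb, parity):
    """RAIDZ: at best, (drives - parity) drives' worth of capacity."""
    return (drives - parity) * drive_tb

print("8 x 1TB SSD, striped mirrors:", mirrors_usable(8, 1), "TB")    # ~4 TB fast pool
print("6 x 5TB HDD, RAIDZ2:        ", raidz_usable(6, 5, 2), "TB")    # ~20 TB archive pool
```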

You'll also have to decide what kind of case to use to fit all this stuff in. Rackmount? Tower? And you'll need to select an appropriate power supply. Given your prospective use case, you're going to want to provide FreeNAS as much memory as you can. Lots of decisions to make!

You might want to look at some of the 'stickies' here on the site having to do with hardware selection and ZFS:

Cyberjock's Hardware recommendations
Cyberjock's Slideshow explaining VDev, zpool, ZIL, L2ARC and other newbie mistakes!
 

AlainD

Contributor
Joined
Apr 7, 2013
Messages
145
Are you certain about this? My understanding is that ZFS RAIDZ write performance is limited to the write performance of the slowest disk in the RAIDZ vdev and that the same rule pretty much applies to reads as well, in the real world.

Since mirrors always deliver the most IOPS and fastest performance, @Lindsay Horner will probably be best served using a pool comprised of mirrors, all else being equal.

To quote the blog post you linked:

"For write bandwidth, you may get more: Large blocks at the file system level are split into smaller blocks at the disk level that are written in parallel across the vdev's individual data disks and therefore you may get up to n times an individual disk's bandwidth for an n+m type RAID-Z vdev.

Unfortunately, in practice, the cases where performance matters most (mail servers, file servers, iSCSI storage for virtualization, database servers) also happen to be the cases that care a lot about IOPS performance, not much bandwidth performance."

For the given use case (lots of large images, probably a single concurrent user, and 10GbE) and the desire to use SSDs (which may not even be needed, but which have much higher IOPS), I doubt IOPS will be the bottleneck. If I understand a zpool of mirrors correctly, it will only use more than one mirror at a time when servicing concurrent I/O requests. In this use case I expect the limit will be that of a single drive (max speed would then be limited to about 500 MB/s and the IOPS of one drive, approximately 80,000). A RAIDZ2 of 6 SSDs will still max out at the IOPS of one drive, but sequential throughput will hit 4 x 500 MB/s and thus be limited by the 10GbE bandwidth.
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
For the given use case (lots of large images, probably a single concurrent user, and 10GbE) and the desire to use SSDs (which may not even be needed, but which have much higher IOPS), I doubt IOPS will be the bottleneck. If I understand a zpool of mirrors correctly, it will only use more than one mirror at a time when servicing concurrent I/O requests. In this use case I expect the limit will be that of a single drive (max speed would then be limited to about 500 MB/s and the IOPS of one drive, approximately 80,000). A RAIDZ2 of 6 SSDs will still max out at the IOPS of one drive, but sequential throughput will hit 4 x 500 MB/s and thus be limited by the 10GbE bandwidth.
You may very well be right! The best way to be sure would be to test both designs with the OP's normal workload and see which one works best.

That said, it has been my experience that mirrors deliver better performance than any RAIDz topology, and wide reading supports this view. For example, this informative article, with key quote:
It’s easy to think that a gigantic RAIDZ vdev would outperform a pool of mirror vdevs, for the same reason it’s got a greater storage efficiency. “Well when I read or write the data, it comes off of / goes onto more drives at once, so it’s got to be faster!” Sorry, doesn’t work that way. You might see results that look kinda like that if you’re doing a single read or write of a lot of data at once while absolutely no other activity is going on, if the RAIDZ is completely unfragmented… but the moment you start throwing in other simultaneous reads or writes, fragmentation on the vdev, etc then you start looking for random access IOPS. But don’t listen to me, listen to one of the core ZFS developers, Matthew Ahrens: “For best performance on random IOPS, use a small number of disks in each RAID-Z group. E.g, 3-wide RAIDZ1, 6-wide RAIDZ2, or 9-wide RAIDZ3 (all of which use ⅓ of total storage for parity, in the ideal case of using large blocks). This is because RAID-Z spreads each logical block across all the devices (similar to RAID-3, in contrast with RAID-4/5/6). For even better performance, consider using mirroring.”

All of which implies that you may be right... but only briefly, when the pool is new, unfragmented, and empty.

And of course, I could simply be wrong! :smile:
 

AlainD

Contributor
Joined
Apr 7, 2013
Messages
145
You may very well be right! The best way to be sure would be to test both designs with the OP's normal workload and see which one works best.

The test seems rather easy to run while building the system, and could be very interesting.
Full-SSD systems are not that common.

RAIDZ2 of course has the advantage that you can lose 2 drives at once...

For backup and archives you could also consider a second, low-spec system. I like data to be on separate physical systems, in separate rooms, and preferably in separate places.
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
I was thinking more along the lines of 8 X 1TB SSDs for a total of 4TB of working storage
RAIDZ2 of course has the advantage that you can lose 2 drives at once...
RAIDZ2 can lose 2 drives from one vdev without data loss. The interesting thing about striped mirrors is that, while the loss of 2 drives from the same vdev means the pool is lost, the more vdevs you have, the less likely it is that a 2nd drive loss will be from the same vdev. With 8 x 1TB in striped mirrors, you can lose 4 drives without data loss. The catch is, it has to be the right 4 drives.
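To put a rough number on "less likely" (a simplified sketch that assumes the second failure is independent and equally likely to hit any surviving drive):

```python
# After one drive in a pool of 2-way mirrors fails, what is the chance that a
# second, independent failure lands on its mirror partner (losing the pool)?
# Simplification: every surviving drive is equally likely to fail next.

def chance_second_failure_fatal(total_drives):
    surviving = total_drives - 1
    return 1 / surviving          # only the dead drive's single partner is fatal

for drives in (4, 8, 12, 24):
    p = chance_second_failure_fatal(drives)
    print(f"{drives:>2} drives in 2-way mirrors: {p:.1%} chance the 2nd failure is fatal")
```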
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
The interesting thing about striped mirrors is that, while the loss of 2 drives from the same vdev means the pool is lost,

What an intriguing statement. That wouldn't seem to be a reasonable outcome if you've got three way mirrors.

Since mirrors always deliver the most IOPS and fastest performance, @Lindsay Horner will probably be best served using a pool comprised of mirrors, all else being equal.

Mirrors do not always deliver the fastest performance. A vaguely wide RAIDZ2 will be faster than mirrors for large sequential writes. This would be the classic counterexample.

Any ideas on what I can do to speed up the image sequences… I’ll be using SSDs but is this a question of IOPS? Or pure read/write speed? How does compression affect this? Are consumer SSDs OK? Very curious to hear some opinions.

You can always experiment with compression. Consumer SSDs might be okay. This is primarily a workload problem. A consumer SSD is generally rated for a given number of GB written per day (~20-150), and if you are planning to dramatically exceed that limit, you need to consider whether a decreased lifespan of the components is acceptable. For example, I've been tossing Intel 535s and Samsung 850 Evos into gear lately on the theory that if I burn them out in two or three years, the price of SSDs will have fallen substantially and I still save more by avoiding the much pricier data center grade SSDs. All our SSDs are deployed RAID1-style, so this isn't a data loss issue at all.
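If you want to sanity-check endurance against your own workload, the arithmetic is straightforward (the figures below are placeholders; substitute the TBW rating from your drive's datasheet and your own measured daily writes):

```python
# How long a consumer SSD's rated endurance lasts under a given write workload.
# Both figures are placeholders, not the numbers for any specific drive.

rated_tbw = 150              # rated endurance in TB written (typical-ish 1TB consumer drive)
gb_written_per_day = 300     # heavy render-output days

years = rated_tbw * 1000 / gb_written_per_day / 365
print(f"Rated endurance exhausted in roughly {years:.1f} years")
```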

I haven’t managed to find any really solid info on the relationship between GHz and 10GbE speeds. Is it all about clock speed? Or is it also the protocol? CPU recommendations most welcome. I’d be happy with 500-700Mbps if that were possible.

700Mbps? 700 megabits per second?

Okay, so anyways, higher clock speeds tend to produce better results than lower clock speeds plus lots of cores. That means that something like an E5-1650 v3 is a fairly hot NAS platform that can easily hit 256GB of RAM if needed, and is both cheaper than and better than a low end dual E5-2609 v3 which seems to be what people end up selecting.
 

Mirfster

Doesn't know what he's talking about
Joined
Oct 2, 2015
Messages
3,215
In the context of a quote that clearly applies to 2-way mirrors, but yes, for precision I should have stated that.
What about two-way mirrors and then a pair of hot spares? That may be an option that sits between RAIDZ2 and three-way mirrors.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
What about two-way mirrors and then a pair of hot spares? That may be an option that sits between RAIDZ2 and three-way mirrors.

It's a possibility. Part of this involves needing to know just how much trouble there would be if there was a catastrophe and two drives in a two-way mirror vdev failed, taking down the pool. This is an *unlikely* scenario to begin with, it's just something that needs to be considered.

The other thing is that if this is long runs of sequential data (image files), it's totally possible that mirrors are overkill and RAIDZ2 could be just dandy. In that case, RAIDZ2 or even RAIDZ3 would be a better choice.
 

AlainD

Contributor
Joined
Apr 7, 2013
Messages
145
A remark: SSDs have high IOPS; maybe the SSD is not the limiting factor for IOPS, but other parts of the server (or client) are.

I looked at http://www.anandtech.com/show/10296/the-sandisk-x400-1tb-ssd-review/6
(not only numbers for the X400, but also for other drives)

The SanDisk X400 1TB does about 7,500 4KB read IOPS at queue depth 1 and about 2,700 steady-state 4KB write IOPS (worst case). (Normal HDDs manage roughly 100 IOPS, as a rough guide.)

The 128K sequential read is "only" 500 MB/s, or about half of 10GbE. The 128K sequential write is "only" about 300 MB/s, or less than a third of 10GbE.

As I understand the use case, there will mostly be only one concurrent user taxing the server. I believe a 6x SSD RAIDZ2 will be able to saturate 10GbE for sequential reads and writes (if the server and client can handle it). I doubt that 7,500 QD1 read IOPS will be the bottleneck for the client application.
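A rough way to sanity-check that (the per-frame operation count is a guess for illustration, not a measurement):

```python
# Are 7,500 QD1 4KB read IOPS a plausible bottleneck for this workflow?
# The per-frame small-operation count below is a guess for illustration.

qd1_read_iops = 7_500
frames_per_second = 25        # e.g. real-time playback of a sequence
small_ops_per_frame = 10      # generous guess: open/stat/metadata reads per frame

small_ops_needed = frames_per_second * small_ops_per_frame
print(f"~{small_ops_needed} small ops/s needed vs {qd1_read_iops} available at QD1")
# The bulk of each frame is read in large, mostly sequential chunks, so sequential
# throughput (and the 10GbE link) looks like the more likely limit than small-block IOPS.
```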
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
For ZFS, the likelihood that you have QD1 accesses is ... hahahahahahahahah

Sorry, you implied a funny. :smile:

I would definitely be concerned more about other aspects of the system. The CPU can be a limiting factor, in particular.
 