Is 24 drives in RaidZ3 with Toshiba N300s safe?

Status
Not open for further replies.
Joined
Aug 10, 2018
Messages
46
Is 24 drives in RaidZ3 with Toshiba N300s safe?
The listing below says that the N300 is only rated for up to 8 drive bays - I don't really understand what that recommendation is about?
Is filling our 24-bay chassis with 8 TB drives, with 3-drive redundancy and spares on hand, a bad idea?
https://www.ebuyer.com/788983-toshiba-8tb-n300-nas-internal-hdd-bulk-at-ebuyer-com-hdwn180uzsva
(I like the idea of high usable capacity and high read speed, and I'm not worried about low write speed because I'm planning to use a mirrored M.2 SSD write cache.)
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I wouldn't do it. If you put that many drives together in a single vdev, it's not going to perform well. It'd be better to have two vdevs of 12 disks each at RAID-Z2.

 
Joined
Aug 10, 2018
Messages
46
Why would it not perform well? Would you not get a peak theoretical read speed of 21 x 1-drive read speed and a write-speed determined by your SSD write-cache?
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Why would it not perform well? Would you not get a peak theoretical read speed of 21 x 1-drive read speed and a write-speed determined by your SSD write-cache?
I will give you some links to some reading that you should definitely do, but to make a very long story short: each vdev gives about the performance of a single drive. So, comparatively, it would be slow to have all those drives in a single vdev. If you don't mind it being slow, I guess it isn't that big of a problem; it will work.

The recommendation is that RAIDZ2 vdevs be no larger than 12 drives and RAIDZ3 vdevs no larger than 13 drives, but I know of at least one situation where a pool was configured with 45 drives in a single vdev and it worked - slowly.

It really depends on the use the system will be put to. If you have a large number of clients that will be accessing the system simultaneously, you will want more IOPS, and IOPS increase with the number of vdevs: more vdevs give more IOPS. So, for example, one of our storage systems at work has around 300 drives - not for storage capacity, since each drive is only around 1 TB, but for the IOPS.
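To make the two-vdev suggestion concrete, here is a minimal sketch of how a 2 x 12 RAIDZ2 layout could be created from the shell - the pool name "tank" and the device names da0-da23 are assumptions for illustration, and in practice FreeNAS would build this through the GUI:

    # Hypothetical sketch: one pool made of two 12-disk RAIDZ2 vdevs (device names assumed)
    zpool create tank \
      raidz2 da0 da1 da2 da3 da4 da5 da6 da7 da8 da9 da10 da11 \
      raidz2 da12 da13 da14 da15 da16 da17 da18 da19 da20 da21 da22 da23

    # Confirm the layout: two top-level raidz2 vdevs should be listed
    zpool status tank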

Did you read the manual? Really a lot of good info in there.
http://doc.freenas.org/11/freenas.html

Updated Forum Rules 4/11/17
https://forums.freenas.org/index.php?threads/updated-forum-rules-4-11-17.45124/

Slideshow explaining VDev, zpool, ZIL and L2ARC
https://forums.freenas.org/index.ph...ning-vdev-zpool-zil-and-l2arc-for-noobs.7775/

Terminology and Abbreviations Primer
https://forums.freenas.org/index.php?threads/terminology-and-abbreviations-primer.28174/

Why not to use RAID-5 or RAIDz1
https://www.zdnet.com/article/why-raid-5-stops-working-in-2009/
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
a write-speed determined by your SSD write-cache?
PS. That isn't how ZFS works. That is a really complex thing.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
What kind of work are you planning for this system?
 
Joined
Aug 10, 2018
Messages
46
Thank you for your links, I'll go through them ASAP. This is a storage server for large amounts of sensor data for R&D, which will make that data available to probably just a couple of clients over NFS. I've been tasked with designing and building it. I've set up a FreeNAS box with a simple mirror before, but I've never designed a massive storage array.

I'm quite surprised that a single vdev will give roughly the read/write of a single disk - perhaps traditional RAID is the better option for us, if a RAID 6 of 10 drives gives up to an 8-drive read speed-up and a single vdev won't give us any improvement at all (if I've understood you correctly)?
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Thank you for your links, I'll go through them ASAP. This is a storage server for large amounts of sensor data for R&D, which will make that data available to probably just a couple of clients over NFS. I've been tasked with designing and building it. I've set up a FreeNAS box with a simple mirror before, but I've never designed a massive storage array.

Wait a minute, are you also the OP ( @rag ) or did I miss something here? :confused:

Either way, you need to know the nature of the workload for both reads and writes.
  • will it be random or sequential?
  • what's the average file size?
  • does the client-side require low latency, or just huge bandwidth?
  • is data modified in-place, or manually copied remotely and back?
  • how much concurrent access is expected?
 
Joined
Aug 10, 2018
Messages
46
I have read the manual cover to cover in the past, and I see that this presentation recommends that RAIDZ2 vdevs "should have the total number of drives equal to 2^n + 2 (i.e. 4, 6, 10, etc.)" in order to ensure alignment with 4K sectors, and that the total number of drives in any one vdev should never exceed 11. I'm certainly going to take this advice about not constructing massive vdevs. Are there any docs you are aware of that would help me design our zpool by giving me some notion of what kind of performance I might expect from different configurations? Apparently my assumption that it would be similar to the analogous RAID configs is way off.
Update:
Excellent resources concerning IOPS and read/write speeds both theoretical and measured:
https://www.ixsystems.com/blog/zfs-pool-performance-1/
https://calomel.org/zfs_raid_speed_capacity.html
 
Joined
Aug 10, 2018
Messages
46
Wait a minute, are you also the OP ( @rag ) or did I miss something here? :confused:
Hah, yes I am - I created an account here at home and at work and forgot about it until you pointed this out - I'll delete one of them soon, so that it's less confusing in the future.
The workload is likely to be largely sequential for both reads and writes - we will dump one big file at a time (maybe 10 GB, or several tens of GB).
Most of the data won't be modified after creation. It will be dumped there by the experiment apparatus in one big blob and then read by a different client later on, over NFS, for analysis. I can't imagine latency being an issue as it will all be on the LAN (perhaps I've misunderstood you?). High bandwidth is highly desirable - 1 Gbit/s or more would be the aim. Read speed is more important than write speed. Concurrent access probably won't happen much - most of the time it will be a single client, probably occasionally two.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Would you not get a peak theoretical read speed of 21 x 1-drive read speed and a write-speed determined by your SSD write-cache?
Nope, you get the IOPS of a single disk. I don't know if that's bearable with SSDs in a 24-wide vdev, but I'd guess you'd want SSD performance, which will probably not be the case.

Hah, yes I am - I created an account here at home and at work and forgot about it until you pointed this out - I'll delete one of them soon, so that it's less confusing in the future.
Ugh, that's going to be a pain in the ass for the admins to sort out.
@JoshDW19 - can you reassign the old account's posts to the new one and then delete the old account? That would seem to be the simplest solution short of leaving things as they are.
 

Jessep

Patron
Joined
Aug 19, 2018
Messages
379
Is 24 drives in RaidZ3 with Toshiba N300s safe?
The listing below says that the N300 is only rated for up to 8 drive bays - I don't really understand what that recommendation is about?
Is filling our 24-bay chassis with 8 TB drives, with 3-drive redundancy and spares on hand, a bad idea?
https://www.ebuyer.com/788983-toshiba-8tb-n300-nas-internal-hdd-bulk-at-ebuyer-com-hdwn180uzsva
(I like the idea of high usable capacity and high read speed, and I'm not worried about low write speed because I'm planning to use a mirrored M.2 SSD write cache.)

This refers to the drive not being certified for use in a chassis with more than 8 bays, most likely due to vibration tolerance.

For critical work data, I would suggest WD Gold - that's our go-to at work.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Hah, yes I am - I created an account here at home and at work and forgot about it until you pointed this out - I'll delete one of them soon, so that it's less confusing in the future.
The workload is likely to be largely sequential for both reads and writes - we will dump one big file at a time (maybe 10 GB, or several tens of GB).
Most of the data won't be modified after creation. It will be dumped there by the experiment apparatus in one big blob and then read by a different client later on, over NFS, for analysis. I can't imagine latency being an issue as it will all be on the LAN (perhaps I've misunderstood you?). High bandwidth is highly desirable - 1 Gbit/s or more would be the aim. Read speed is more important than write speed. Concurrent access probably won't happen much - most of the time it will be a single client, probably occasionally two.

Sounds like this is a "repository" style workload, where things won't ever really be interactively accessed on the array, but rather read and written in huge bulk quantities.

For this, I would go with the two 12-disk Z2s as suggested by @Chris Moore - I would also consider recordsize=1M on the dataset in question if you're going to be writing files in the "tens of GB" size.

Latency is absolutely an issue if you're working interactively with the data (eg: reading and writing directly to the array, using it to host virtual machines, etc) but for your workload it really sounds like the data will just be dumped there, then copied off to another worker machine, and processed from its local storage.
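For what it's worth, the recordsize suggestion above is a one-line dataset property change; a minimal sketch, with "tank/sensordata" as a placeholder pool/dataset name:

    # Hypothetical dataset name; 1M records suit large, sequentially written files
    zfs create tank/sensordata
    zfs set recordsize=1M tank/sensordata

    # Verify the property took effect
    zfs get recordsize tank/sensordata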
 

JoshDW19

Community Hall of Fame
Joined
May 16, 2016
Messages
1,077
Ugh, that's going to be a pain in the ass for the admins to sort out.
@JoshDW19 - can you reassign the old account's posts to the new one and then delete the old account? That would seem to be the simplest solution short of leaving things as they are.

Done. I did the reverse because the majority of the posts were associated with the older account. It's a tedious and slow process.
 
Joined
Aug 10, 2018
Messages
46
Would a SLOG device improve write speed at all for very large (e.g. > 5 GB) file writes? If not, would it be a terrible idea to hack together some kind of SSD write buffer: an NFS-shared, mirrored M.2 SSD pair that users who just want to dump some data quickly can drop files into, which then rsyncs to the main pool and deletes what was deposited once the rsync is complete?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Would a SLOG device improve write speed at all for very large (e.g. > 5 GB) file writes?
Highly unlikely. Unless you're doing sync writes, an SLOG is an expensive paperweight, and I doubt you are.

If not, would it be a terrible idea to hack together some kind of SSD write buffer: an NFS-shared, mirrored M.2 SSD pair that users who just want to dump some data quickly can drop files into, which then rsyncs to the main pool and deletes what was deposited once the rsync is complete?
Look, a 24-wide vdev might be dangerously slow. You run the risk of it being slow to the point of uselessness, i.e. data loss by philosophical discussion. If you really want to try this out, you'll need to test it beforehand with data you don't need safely stored, including filling it up and fragmenting stuff.
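As an aside on the sync-write point above: whether a SLOG can ever help depends on whether the NFS clients actually request synchronous writes, and that behaviour can be checked and controlled per dataset. A minimal sketch, with "tank/sensordata" again a placeholder name:

    # Show whether sync writes are standard (honour client requests), always, or disabled
    zfs get sync tank/sensordata

    # Default behaviour - only honour sync requests the clients actually make
    zfs set sync=standard tank/sensordata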
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
I'm quite surprised that a single vdev will give roughly the read/write of a single disk

It gives the random I/O performance of a single disk, not the sequential I/O performance.

But imagine a 24-bay array with the random I/O capacity of a single hard disk. Not pretty... though perhaps with SSDs it's not so much of an issue. Feel free to test it.

Doubling your vdevs will double your IOPS.

If they are all SSDs, the HBA cards can end up being bottlenecks too.
 
Joined
Aug 10, 2018
Messages
46
Look, a 24-wide vdev might be dangerously slow. You run the risk of it being slow to the point of uselessness, i.e. data loss by philosophical discussion. If you really want to try this out, you'll need to test it beforehand with data you don't need safely stored, including filling it up and fragmenting stuff.

Ok, I'm quite convinced that a 24-wide is a bad idea! I'll do 3 x 8 or 2 x 12 - do you have any ideas on my hacky write-cache idea? I like the idea of giving the users very fast write speeds for cheap but I'm quite prepared to be told that this is a bad idea or hear alternative suggestions.

Also... I notice that all of the suggestions in this thread ignore the advice that RAIDZ2 vdevs "should have the total number of drives equal to 2^n + 2 (i.e. 4, 6, 10, etc.)" in order to ensure alignment with 4K sectors, and that the total number of drives in any one vdev should never exceed 11 (see https://forums.freenas.org/index.ph...ning-vdev-zpool-zil-and-l2arc-for-noobs.7775/).
Is this advice then considered to be outdated or contentious?
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Ok, I'm quite convinced that a 24-wide is a bad idea! I'll do 3 x 8 or 2 x 12 - do you have any ideas on my hacky write-cache idea? I like the idea of giving the users very fast write speeds for cheap but I'm quite prepared to be told that this is a bad idea or hear alternative suggestions.
That idea isn't horrible, but I think it would be of limited utility. What is the speed of the network? I'm finding that I can come close to the performance of SSDs with my disk-based pool over the 10Gb network I have at home.

If you have 24 drives to work with and you want good performance, I would say that six drives per vdev would be a good idea. That's going to give you four vdevs and (depending on the exact drive performance) about 800 MB/s of goodness to the pool. That's just off the top of my head while I am on the way to work.

NFS is synchronous by default, if I recall correctly, so I think it would be useful to have the SLOG. I also set "sync always" on one of the servers at work where I have a very fast PCIe NVMe drive for SLOG, and it has improved the pool's response to small, random file writes. Large sequential access was fine before and became slower, but the important thing in that case was the fast response to small-file writes. It is a bit of a trade-off.

You can get enough performance from disk to satisfy a 10Gb network, but it will take more vdevs; you can add more drives to get that, or use smaller vdevs. I actually talked with someone who used four drives per vdev (at RAIDZ2), which costs as much capacity as using mirror vdevs but gives two drives of redundancy. That gave them six vdevs, but they never reported back on how it performed after the build.
Also... I notice that all of the suggestions in this thread ignore the advice that RAIDZ2 vdevs "should have the total number of drives equal to 2^n + 2 (i.e. 4, 6, 10, etc.)" in order to ensure alignment with 4K sectors, and that the total number of drives in any one vdev should never exceed 11 (see https://forums.freenas.org/index.ph...ning-vdev-zpool-zil-and-l2arc-for-noobs.7775/).
Is this advice then considered to be outdated or contentious?
I didn't get the impression that we are ignoring the vdev size information you linked, and it isn't that it is wrong. It is only a recommendation; other options are available, and it depends on how you need the system to perform and on the budget for this project. ZFS is very flexible, but the way it is configured will have an impact on your experience with the system. I have a cold-storage system at work, for example, that uses 15 drives per vdev (four vdevs) to get more capacity; it is a certainty that the system would be faster if it were configured differently. That system has around 290 TB of data on it and it works like a champ, just not at blinding speed.
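To illustrate the 4 x 6 RAIDZ2 layout and the SLOG/sync idea described above, here is a minimal sketch with hypothetical device names (da0-da23 for the spinning disks, nvd0/nvd1 for the NVMe mirror); treat it as an outline, not a recommendation for your exact hardware:

    # Hypothetical layout: four 6-disk RAIDZ2 vdevs plus a mirrored NVMe SLOG
    zpool create tank \
      raidz2 da0 da1 da2 da3 da4 da5 \
      raidz2 da6 da7 da8 da9 da10 da11 \
      raidz2 da12 da13 da14 da15 da16 da17 \
      raidz2 da18 da19 da20 da21 da22 da23 \
      log mirror nvd0 nvd1

    # Optionally force all writes through the SLOG, as described above
    zfs set sync=always tank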

 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
do you have any ideas on my hacky write-cache idea? I like the idea of giving the users very fast write speeds for cheap but I'm quite prepared to be told that this is a bad idea or hear alternative suggestions.

If my earlier post was correct, and users won't be working from this array interactively, you technically won't even need synchronous writes - therefore, no SLOG is required. Async writes are considered "complete" when they're in RAM.

You can adjust the size of the "buffer" that ZFS will allow to be full of dirty (uncommitted-to-pool) data, but unless that's on a stable storage device (eg: NAND flash) you also have to accept the risk of losing that much data in case of a crash.

There's also the issue of how quickly you're going to "fill" and "empty" that buffer. Assuming an infinitely large "buffer", if you have a 10Gbps connection and shuttle in a 10GB file, that will take roughly 10 seconds. But you'll only be able to "empty" it at the pool's write speed, whether that's 800MB/s as proposed above, or 400MB/s if things get full and fragmented over time. If the writes are spaced out far enough to let the buffer empty, cool. Nothing bad happens. But keep leaning on it too hard, and you'll fall back to disk speed - and you'll also be impacting the reads while it's doing the writes.
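For reference, the dirty-data cap mentioned above is exposed as a ZFS tunable on FreeBSD/FreeNAS; a minimal sketch of inspecting and (cautiously) raising it, where the 4 GiB figure is purely illustrative and this assumes the tunable is adjustable at runtime on your FreeNAS version:

    # Show the current cap on outstanding dirty (uncommitted) data, in bytes
    sysctl vfs.zfs.dirty_data_max

    # Illustrative only: raise the cap to 4 GiB, accepting a larger loss window on a crash
    sysctl vfs.zfs.dirty_data_max=4294967296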
 