A build to run ZFS storage for a mirror server

Status
Not open for further replies.

Rabin

Dabbler
Joined
Aug 22, 2014
Messages
21
Hello all,

I'm looking to build a new setup to act as a mirror server, and I'm thinking of using ZFS for it.
We currently have a machine with ~15TB of storage which acts as a mirror for most Linux distros, some *BSDs, and some other FOSS projects.
My current bottleneck is I/O, as we need to rsync all the content from several places 1-2 times a day.
rsync needs to walk over all this space again and again, which creates a lot of I/O wait on the server and keeps the server load at ~50-70% all day.

For the new build I'm thinking of going with ~15TB to start, with the option to expand to 20TB and then 30TB as needed.
I'm not sure whether to use RAIDZ2 or go with disk pairs. What do you think?

For the I/O problem I'm thinking of using 2x (500/750/1000)GB SSDs in a mirror as an L2ARC read cache,
and another pair for SLOG/ZIL.

My current bandwidth usage peaks at ~300Mb/s, but I'm expecting more traffic in the near future.
Max concurrent connections are ~1000, and I'm planning for growth to 3000 connections.

I'm thinking of getting an HP DL80/DL180 G9 2U rack server (with 64GB of RAM), which can hold all the capacity we need.

What do you think? Will this setup work for my needs?
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I'm looking to build a new setup to act as mirror server ... will this setup will work for my needs ?
I would have to look at the hardware the system is using. Is that a server you already have, or one that is to be purchased?
If it is to be purchased, it might be possible to get it configured to support BSD/FreeNAS.
The IOPS of this task probably call for mirrors, unless you want to throw a lot of disks at the solution.
Is the budget limited?

 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
If you're using rsync, you are almost certainly CPU limited.
 

Rabin

Dabbler
Joined
Aug 22, 2014
Messages
21
I would have to look at the hardware the system is using. Is that a server you already have, or one that is to be purchased?
If it is to be purchased, it might be possible to get it configured to support BSD/FreeNAS.
The IOPS of this task probably call for mirrors, unless you want to throw a lot of disks at the solution.
Is the budget limited?

Thanks Chris,

This is a new purchase; we would like to replace our current mirror server.
The budget constraint is ~2500-3000 US$.

As for the OS, I'm still not sure if I will run FreeBSD/FreeNAS on it
or just go with something I'm more familiar with (like Linux and ZoL).
 

Rabin

Dabbler
Joined
Aug 22, 2014
Messages
21
If you're using rsync, you are almost certainly CPU limited.

Hi,

As far as I can see, my current problem is I/O.
My current setup is 2x6 disks in RAID5 (12 disks in total), grouped via LVM into a single volume group.

 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
rsync is a CPU-limited task. Reading your content is usually network limited until you hit the disk pool's limit.

You will not need a SLOG; it will not get used. An L2ARC might be useful, but start without it, figure out how to measure your read cache hit ratio, and see whether it's good or not.
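To put a number on that hit ratio: on FreeBSD the ARC counters are exposed as `sysctl kstat.zfs.misc.arcstats.hits` / `.misses` (on ZFS on Linux they live in `/proc/spl/kstat/zfs/arcstats`). A minimal sketch of the calculation, with hypothetical counter values:

```python
def arc_hit_ratio(hits: int, misses: int) -> float:
    """Return the ARC read-cache hit ratio as a percentage.

    `hits` and `misses` are the cumulative counters from arcstats
    (kstat.zfs.misc.arcstats on FreeBSD, /proc/spl/kstat/zfs/arcstats on ZoL).
    """
    total = hits + misses
    if total == 0:
        return 0.0
    return 100.0 * hits / total

# Hypothetical counter values, for illustration only:
print(arc_hit_ratio(950_000, 50_000))  # → 95.0
```

As a rough rule of thumb, if the ratio stays well above 90% for a read-mostly workload like this, the RAM-based ARC is already doing the job and an L2ARC would add little.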
 

Rabin

Dabbler
Joined
Aug 22, 2014
Messages
21
OK, I understand, so I'll remove the SLOG from the plan and leave room to add an L2ARC in the future if needed.
What do you think: should a single disk be enough, or should I use 2 in a mirror?
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
As far as I can see, my current problem is I/O.
My current setup is 2x6 disks in RAID5 (12 disks in total), grouped via LVM into a single volume group.
This isn't ZFS terminology. Is that how the old server is set up, and are you trying to work out from that how the new server can be improved?
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
As for the OS, I'm still not sure if I will run FreeBSD/FreeNAS on it
or just go with something I'm more familiar with (like Linux and ZoL).
If you are more familiar with Linux, the setup might not be a problem, but there are a lot of 'ease of use' features in the FreeNAS web GUI, including some disk monitoring features that I have not been able to duplicate in my ZFS on Linux servers at work.
I am working on my management to get permission to implement some FreeNAS systems in place of the Linux ones.
I actually got some quotes from iXsystems for TrueNAS, and we may go that route if I can convince management.
 

Rabin

Dabbler
Joined
Aug 22, 2014
Messages
21
This isn't ZFS terminology. Is that how the old server is set up, and are you trying to work out from that how the new server can be improved?

Yes, that is the old server, which we are looking to retire.
Basically, the requirement is to be able to serve the content we have faster and better.

The current setup suffers from an I/O bottleneck (even for nginx), and I was thinking to solve this with ZFS, using a pool created from multiple mirrored disks to spread the I/O requests over several vdevs, plus an SSD to cache reads.
 

Rabin

Dabbler
Joined
Aug 22, 2014
Messages
21
If you are more familiar with Linux, the setup might not be a problem, but there are a lot of 'ease of use' features in the FreeNAS web GUI, including some disk monitoring features that I have not been able to duplicate in my ZFS on Linux servers at work.

Which stats, for example?

I am working on my management to get permission to implement some FreeNAS systems in place of the Linux ones.
I actually got some quotes from iXsystems for TrueNAS, and we may go that route if I can convince management.

I checked with iXsystems, but the problem is they can't provide NBD (next business day onsite) warranty service for the hardware where I need this server. :(
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Yes, that is the old server, which we are looking to retire.
Basically, the requirement is to be able to serve the content we have faster and better.

The current setup suffers from an I/O bottleneck (even for nginx), and I was thinking to solve this with ZFS, using a pool created from multiple mirrored disks to spread the I/O requests over several vdevs, plus an SSD to cache reads.
What is your NIC? The network will almost always be the limit with streaming workloads. The disks can easily do 400MB/s+.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I checked with iXsystems, but the problem is, they can't provide NBD (next business day onsite) warranty service for the hardware where I need this server.
Where is the new system going to be?

When you have a lot of random I/O to small files, you are often served better by a bunch of mirrored disks pooled together. I didn't exhaustively study the options available for the server you mentioned, but I don't think it will be a good choice because it doesn't support enough disks. I would suggest a chassis with 24 drive bays to serve your needs now, with room to grow as your need for storage grows, especially since you mentioned needing to grow the storage to 30 TB or more.
To have both satisfactory IOPS and the starting capacity you mentioned, with a little overhead, you would want to start with 5 mirrored pairs in the storage pool (10 disks total). I did my calculations based on the Toshiba N300 NAS drive at 6 TB, which is rated at 210MB/s transfers. Using 10 of them in mirror sets should give you roughly 26 TB of total space with 20 TB usable (ZFS needs 20% spare space for good performance), and the pool should be able to provide about 1400MB/s of I/O.
If you use a 24-bay chassis as I suggest, you could simply add another pair of drives to the pool when you need to grow the capacity; each additional pair also increases the total potential I/O capacity, and in a 24-bay chassis with 6 TB drives you could expand to a usable capacity of 50 TB.
You will need a 10Gb network interface to take advantage of this speed.
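The arithmetic behind those figures can be sketched roughly like this. The drive size, count, and per-drive transfer rate are taken from the post above; the exact ZFS-reported numbers will come out a bit lower once metadata and checksum overhead are counted, which is roughly where the 26/20 TB figures land:

```python
DRIVE_TB = 6          # marketing (decimal) terabytes per drive
DRIVES = 10           # 5 mirrored pairs
STREAM_MB_S = 210     # rated sequential transfer per drive (Toshiba N300 6TB)

raw_bytes = DRIVES * DRIVE_TB * 10**12
mirrored_tib = raw_bytes / 2 / 2**40   # mirrors halve capacity; report in binary TiB
usable_tib = mirrored_tib * 0.8        # keep ~20% free for ZFS performance

# Sequential reads can be served from both sides of each mirror; writes hit only one.
read_mb_s = DRIVES * STREAM_MB_S
write_mb_s = (DRIVES // 2) * STREAM_MB_S

print(round(mirrored_tib, 1), round(usable_tib, 1), read_mb_s, write_mb_s)
# → 27.3 21.8 2100 1050
```

Most of the gap between the "30 TB" of raw drives and the ~26 TB ZFS reports is simply the decimal-TB vs binary-TiB conversion; the rest is filesystem overhead.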
 

Rabin

Dabbler
Joined
Aug 22, 2014
Messages
21
What is your NIC? The network will almost always be the limit with streaming workloads. The disks can easily do 400MB/s+.

That only applies to sequential reads. This server acts as a repo mirror for 15+ Linux distros and other FOSS projects, and there is a lot of "random"* access across many repos.

Plus we have the rsync crons, which need to read each repo and stat all the files to compare them against the remote hosts we mirror from.

The only sequential reads we get are when people download distro ISO images.

*"Random" is not exactly right, as some files are accessed far more frequently than others;
for example, the Ubuntu and CentOS repos are very active compared to SUSE or Mint in our case.
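The rsync cron jobs described above are dominated by the file-list phase: a full walk of the tree with one stat() call per file. A minimal Python sketch of that kind of scan (illustrative only, not rsync's actual implementation) shows why the workload is metadata-bound rather than bandwidth-bound:

```python
import os

def scan(path: str) -> tuple[int, int]:
    """Walk a tree the way rsync's file-list phase does:
    stat every file, returning (file_count, total_bytes)."""
    files, size = 0, 0
    for root, _dirs, names in os.walk(path):
        for name in names:
            st = os.stat(os.path.join(root, name))  # one stat() per file
            files += 1
            size += st.st_size
    return files, size
```

On a ~15TB mirror holding millions of small packages, those stat() calls are what produce the I/O wait; keeping the metadata hot in ARC is what makes repeated rescans cheap.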
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
PS: The reason I suggest the 6 TB drive size is that it is generally the best value per TB at this time; the HGST Deskstar NAS drive at 6 TB is an even better value than the Toshiba I mentioned above.
It won't matter too much if you are buying the drives from a server vendor at the same time as the server, because they will cost more (almost double) than what you could buy them for on the open market.
 

Rabin

Dabbler
Joined
Aug 22, 2014
Messages
21
Where is the new system going to be?

When you have a lot of random I/O to small files, you are often served better by a bunch of mirrored disks pooled together. I didn't exhaustively study the options available for the server you mentioned, but I don't think it will be a good choice because it doesn't support enough disks. I would suggest a chassis with 24 drive bays to serve your needs now, with room to grow as your need for storage grows, especially since you mentioned needing to grow the storage to 30 TB or more.
To have both satisfactory IOPS and the starting capacity you mentioned, with a little overhead, you would want to start with 5 mirrored pairs in the storage pool (10 disks total). I did my calculations based on the Toshiba N300 NAS drive at 6 TB, which is rated at 210MB/s transfers. Using 10 of them in mirror sets should give you roughly 26 TB of total space with 20 TB usable (ZFS needs 20% spare space for good performance), and the pool should be able to provide about 1400MB/s of I/O.
If you use a 24-bay chassis as I suggest, you could simply add another pair of drives to the pool when you need to grow the capacity; each additional pair also increases the total potential I/O capacity, and in a 24-bay chassis with 6 TB drives you could expand to a usable capacity of 50 TB.
You will need a 10Gb network interface to take advantage of this speed.

So basically yes, a lot of small files (rpm/deb/tar.gz/...).
That's why I was not sure whether to go with RAIDZ{1,2} or a bunch of mirrors,
but later on I read:

With ZFS, the rule of thumb is this: regardless of the number of drives in a RAIDZ(2/3) VDEV, you always get roughly the random I/O performance of a single drive in the VDEV.
-- source: https://blogs.oracle.com/roch/when-to-and-not-to-use-raid-z

So now I'm sure I need mirrors for the pool.

Currently I'm looking at the HPE ProLiant DL180 (Gen9) and the Dell PowerEdge R540, as they both support ~12-16 disks,
which in a setup of 10x6TB in mirrors should give us ~30TB of storage.

Based on the above, I'm not sure if I also need to add mirrored SSDs to be used as L2ARC.
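The rule of thumb from the quoted article can be turned into a back-of-the-envelope estimator. The per-disk IOPS figure below is an assumption (a 7,200 rpm SATA drive manages very roughly 100 random IOPS); the point is the scaling with vdev count, not the absolute numbers:

```python
DISK_IOPS = 100  # rough assumption for one 7,200 rpm SATA drive

def pool_random_iops(vdevs: int, kind: str) -> tuple[int, int]:
    """Rule-of-thumb (read_iops, write_iops) for a pool of `vdevs` vdevs.

    Each vdev, whether RAIDZ or mirror, delivers roughly one disk's worth
    of random IOPS; a 2-way mirror can additionally serve reads from
    both sides of the pair.
    """
    reads = vdevs * DISK_IOPS * (2 if kind == "mirror" else 1)
    writes = vdevs * DISK_IOPS
    return reads, writes

print(pool_random_iops(5, "mirror"))   # 5 mirrored pairs  → (1000, 500)
print(pool_random_iops(1, "raidz2"))   # one 10-disk RAIDZ2 vdev → (100, 100)
```

With the same ten disks, five mirror vdevs give roughly five to ten times the random IOPS of a single RAIDZ2 vdev, which is why mirrors win for this workload.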
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
which in a setup of 10x6TB in mirrors should give us ~30TB of storage.
Less space than that. There is some ZFS overhead for checksums and metadata, plus the gap between the decimal terabytes drive manufacturers advertise and the binary units the system reports. ZFS-reported space should be about 26TB, but the usable space would only be about 20TB, because ZFS needs roughly 20% free space due to the copy-on-write nature of the filesystem. Performance really tanks if you go over 90% of capacity, so keep an eye on that and expand the pool early if you see usage going beyond 80%.
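That 80%/90% guidance is easy to encode as a monitoring check. A minimal sketch, with thresholds per the post above:

```python
def capacity_status(used_tb: float, total_tb: float) -> str:
    """Classify pool fill level per the 80%/90% guidance."""
    pct = 100.0 * used_tb / total_tb
    if pct > 90:
        return "critical: performance will tank"
    if pct > 80:
        return "warning: plan to expand the pool"
    return "ok"

print(capacity_status(21, 26))  # ~81% full → warning
```

In practice you would feed it the numbers reported by `zpool list` (FreeNAS's built-in alerting does something similar out of the box).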
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
based on the above I'm not sure if i also need to add a mirror ssd's to be used as L2ARC.
The pool I suggested would have a lot of I/O capability. If you were going to use an SSD, it would need to be very fast and high-endurance to be of any use. You might want to look at the review @Stux did when he was deciding on an SSD for his NAS.
https://forums.freenas.org/index.ph...node-304-x10sdv-tln4f-esxi-freenas-aio.57116/
He had another write-up about the SSD performance but I can't find it right now.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
...
based on the above, I'm not sure if I also need to add mirrored SSDs to be used as L2ARC.
Note that you don't mirror L2ARC disks/SSDs. That is because the data is a read-only cache, and any L2ARC device failure simply causes ZFS to read the data from the pool instead (slower, but 100% functional). I'm not even certain you can mirror L2ARC devices.

That said, there are situations where it's advisable to stripe L2ARC devices. Newer ZFS uses compressed ARC and has a reduced memory footprint for L2ARC entries. L2ARC data has pointers in RAM, hence the advice to max out RAM before adding L2ARC.

Having 2 smaller L2ARC devices in a stripe increases the IOPS of the L2ARC, since both can be read at the same time (if the needed data ends up on separate L2ARC devices). Plus, unlike a single-device L2ARC failure, any remaining L2ARC devices can continue to work. Over time, ZFS will re-balance the remaining L2ARC devices' data based on what's most needed.

I do agree that you should skip L2ARC initially, but allow for adding one (or more) later.
 

Rabin

Dabbler
Joined
Aug 22, 2014
Messages
21
The pool I suggested would have a lot of I/O capability. If you were going to use an SSD, it would need to be very fast and high-endurance to be of any use. You might want to look at the review @Stux did when he was deciding on an SSD for his NAS.
https://forums.freenas.org/index.ph...node-304-x10sdv-tln4f-esxi-freenas-aio.57116/
He had another write-up about the SSD performance but I can't find it right now.

Thank you very much, it was a very interesting and instructive thread. Waiting for more posts from @Stux.

Note that you don't mirror L2ARC disks/SSDs. That is because the data is a read-only cache, and any L2ARC device failure simply causes ZFS to read the data from the pool instead (slower, but 100% functional). I'm not even certain you can mirror L2ARC devices.

That said, there are situations where it's advisable to stripe L2ARC devices. Newer ZFS uses compressed ARC and has a reduced memory footprint for L2ARC entries. L2ARC data has pointers in RAM, hence the advice to max out RAM before adding L2ARC.

Having 2 smaller L2ARC devices in a stripe increases the IOPS of the L2ARC, since both can be read at the same time (if the needed data ends up on separate L2ARC devices). Plus, unlike a single-device L2ARC failure, any remaining L2ARC devices can continue to work. Over time, ZFS will re-balance the remaining L2ARC devices' data based on what's most needed.

I do agree that you should skip L2ARC initially, but allow for adding one (or more) later.

OK, now I need to figure out whether I want to go with 5x2 disks in mirrors
or some other combination with RAIDZ(2).

I need to figure out how many IOPS I need to support 3000 clients at peak time,
and based on that, whether I need that many mirrors.

Will 64GB of RAM be enough for 30TB of storage?
 