Storage optimisation


beezone

Cadet
Joined
Feb 7, 2018
Messages
5
Hi everyone,
I have a problem but don't even know how to start solving it.

I have a Time Machine backup server running FreeNAS-11.1-U1: CPU E5-2609 v3 @ 1.90GHz, 96 GB RAM, 36x 6 TB 7200 rpm Toshiba Enterprise SATA drives. The disks are split into six RAIDZ1 volumes of six drives each. Each backup user has a personal dataset to isolate them from one another. Every day I hit the same problem: the disk busy counters sit at almost 100% during working hours, and as a result backup performance is terrible. The Macs use AFP shares. Some backups consist of 40,000+ 8 MB files.
[Attached graphs: 24h disk busy time by volume, plus da2 busy %, pending I/O requests, disk operations, and latency]


Is it possible to increase the performance of the disk subsystem? Which of these might help (or not): adding a ZIL/SLOG device, tuning ZFS settings, or maybe changing the block size? Which advanced benchmarks or stats can I collect to understand what to do?
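
So far I've only looked at the reporting graphs. I guess I could collect something like this during the busy window, if these are the right tools (both ship with FreeBSD, as far as I know):
Code:
# per-vdev I/O statistics, refreshed every 5 seconds
zpool iostat -v vol1 5

# per-disk busy %, queue length and latency
gstat -p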

Or can nothing help me, and do I just need to find a cozy corner and cry?
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
Let's get some more details about your system and use case.

How many users/clients are we talking about here? Are you doing full backups, or incremental? Are you using dedup? Compression? Are you configuring all the backups to run simultaneously, or are they staggered?

How full is your pool? Can you provide the output from zpool status?
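
If it helps, something like this should cover most of those questions (standard ZFS commands; substitute your actual pool and dataset names):
Code:
zpool list                                  # size, allocated, free and capacity per pool
zpool status                                # vdev layout and health
zfs get compression,dedup,recordsize vol1   # relevant properties for a pool/dataset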
 

tvsjr

Guru
Joined
Aug 29, 2015
Messages
959
^^ All of the above will help. That said, 7200RPM drives are good for about 100 IOPS... and that's about what you're doing. I'm guessing you have a fair number of users - you may have no choice but to move to a different RAID configuration (striped mirrors) to get the performance you want.

In all honesty, your RAID configuration needs improvement regardless. RAIDZ1 is dead for drives of that size - you should be running RAIDZ2. Do you really have separate pools for each vdev? Or are all of the vdevs in one pool?
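
Just to illustrate, a striped-mirror layout built from the shell would look roughly like this (hypothetical device names, and on FreeNAS you'd normally build it through the Volume Manager GUI instead):
Code:
# one pool of 2-way mirrors; each mirror vdev adds roughly one disk's worth of write IOPS
zpool create tank \
  mirror da0 da1 \
  mirror da2 da3 \
  mirror da4 da5
# ...and so on for the remaining pairs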
 

beezone

Cadet
Joined
Feb 7, 2018
Messages
5
Let's get some more details about your system and use case.

How many users/clients are we talking about here? Are you doing full backups, or incremental? Are you using dedup? Compression? Are you configuring all the backups to run simultaneously, or are they staggered?

How full is your pool? Can you provide the output from zpool status?
There are about 50 clients. Most backups are incremental. Dedup is off, compression is the default lz4, CPU usage is about 20% on the reporting graph, load average 6.34, 4.97, 4.31. Apple does its own magic with Time Machine and doesn't give you the ability to change the backup frequency (System Integrity Protection); by default it starts every hour. So most of the time the backups run simultaneously.
Code:
zpool status
  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:01:18 with 0 errors on Mon Feb  5 03:46:18 2018
config:

	NAME		STATE	 READ WRITE CKSUM
	freenas-boot  ONLINE	   0	 0	 0
	  mirror-0  ONLINE	   0	 0	 0
		ada1p2  ONLINE	   0	 0	 0
		ada0p2  ONLINE	   0	 0	 0

errors: No known data errors

  pool: vol1
 state: ONLINE
  scan: scrub repaired 0 in 0 days 16:04:01 with 0 errors on Sun Jan 28 16:04:04 2018
config:

	NAME											STATE	 READ WRITE CKSUM
	vol1											ONLINE	   0	 0	 0
	  raidz1-0									  ONLINE	   0	 0	 0
		gptid/e6abddf8-3f3a-11e7-a9ad-0cc47a820950  ONLINE	   0	 0	 0
		gptid/e77675a2-3f3a-11e7-a9ad-0cc47a820950  ONLINE	   0	 0	 0
		gptid/e841a9ad-3f3a-11e7-a9ad-0cc47a820950  ONLINE	   0	 0	 0
		gptid/e913b3c3-3f3a-11e7-a9ad-0cc47a820950  ONLINE	   0	 0	 0
		gptid/e9e67228-3f3a-11e7-a9ad-0cc47a820950  ONLINE	   0	 0	 0
		gptid/eab07c94-3f3a-11e7-a9ad-0cc47a820950  ONLINE	   0	 0	 0

errors: No known data errors

The other volumes are also online and healthy.

^^ All of the above will help. That said, 7200RPM drives are good for about 100 IOPS... and that's about what you're doing. I'm guessing you have a fair number of users - you may have no choice but to move to a different RAID configuration (striped mirrors) to get the performance you want.
I assume that RAIDZ2 will not improve performance. But in case of a failed disk there will be less to worry about while it slowly rebuilds.

In all honesty, your RAID configuration needs improvement regardless. RAIDZ1 is dead for drives of that size - you should be running RAIDZ2. Do you really have separate pools for each vdev? Or are all of the vdevs in one pool?
All six pools are separate. Each consists of six disks connected to the HBA.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
All six pools are separate.
That really seems like a bad idea, and probably is causing a good portion of your problem. To increase IOPS, you add vdevs. With only one vdev per pool, each pool has only the IOPS capability of a single disk. You'd likely improve performance quite a bit by putting all the disks into the same pool.
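
As a rough sketch of what I mean (using the RAIDZ2 width tvsjr suggested; device names are hypothetical, on FreeNAS you'd build this through the Volume Manager rather than the shell, and it means destroying the existing pools first):
Code:
# one pool made of multiple RAIDZ2 vdevs - IOPS scale roughly with the vdev count
zpool create tank \
  raidz2 da0 da1 da2 da3 da4 da5 \
  raidz2 da6 da7 da8 da9 da10 da11
# ...and so on for the remaining vdevs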
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Those charts look like high numbers, but not bottlenecks (should be straight horizontal lines if we're hitting a technical limit).

It's clear that a non-zero pending I/O reading is not great when sustained as in your chart, but danb35's suggestion may help to reduce that.

Have you looked at the network? (perhaps that's where you're bottlenecking)

If you have large numbers of reasonably large sized files (and not too many small files), you may benefit slightly by having a larger block size.
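
If you want to experiment with that, recordsize is a per-dataset property (1M needs the large_blocks pool feature, which FreeNAS 11 supports); the dataset name here is just an example:
Code:
# only affects blocks written after the change; existing data keeps its old record size
zfs set recordsize=1M vol1/someuser
zfs get recordsize vol1/someuser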
 

beezone

Cadet
Joined
Feb 7, 2018
Messages
5
I'm quite confused.

You'd likely improve performance quite a bit by putting all the disks into the same pool.
[Screenshot of the Storage view showing the top-level vol1 and a second-level vol1 beneath it]

If I understood correctly, everything is already configured as you said. The top-level vol1 is a vdev of 6 disks, and the second-level vol1 is a single pool on that vdev. Do you suggest increasing the vdev count from the current 6 to 12 (or more)? Do I have any other options?
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
If I understood correctly, everything is already configured as you said.
I don't think you do understand correctly.
The top-level vol1 is a vdev of 6 disks, and the second-level vol1 is a single pool on that vdev.
No, top vol1 is a pool, and second level vol1 is a dataset--the implicit or root dataset that exists on every ZFS pool (it's always existed with ZFS, but wasn't shown in the GUI before FN 9.3). The screen shot you posted shows nothing about the vdev layout--to see that, click on the top vol1, and then on the Volume Status button below (it looks like a blank sheet of notebook paper). Edit: Never mind--the output of zpool status you posted above shows the vdev layout.

What I'm proposing is that you have a single pool (what FreeNAS calls a Volume) consisting of all your disks, in multiple vdevs. This will increase performance--IOPS will increase roughly linearly with the number of vdevs. It will also simplify storage administration--you'll have a single volume with all your space there. You won't need to manage free space across multiple volumes (pooled storage is one of the big selling points of ZFS). It will also increase risk--as @sretalla notes below, when any single vdev fails, you lose your pool. That's why we'd recommend RAIDZ2 over RAIDZ1. Making this change would require that you destroy and rebuild your pool.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
You could probably also benefit from more RAM (before thinking about ZIL).

To clarify on your confusion...

A pool with multiple vdevs will perform better than a pool with only one vdev, so if you had a single pool containing all of your vdevs, the pool's potential performance would multiply, since writes can be striped across all of the vdevs.

Careful! Losing one vdev in this configuration loses the whole pool (all vdevs).

You would then use that single pool (you could call it Pool1 or tank or whatever) and add datasets (vol1, vol2, and so on) to get back to the logical separation of your current setup.
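
For example (pool and dataset names are just placeholders):
Code:
# one pool, per-user datasets for logical separation
zfs create tank/vol1
zfs create tank/vol2
zfs set quota=2T tank/vol1   # optional: cap how much space each dataset can use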
 

beezone

Cadet
Joined
Feb 7, 2018
Messages
5
Those charts look like high numbers, but not bottlenecks (should be straight horizontal lines if we're hitting a technical limit).

It's clear that a non-zero pending I/O reading is not great when sustained as in your chart, but danb35's suggestion may help to reduce that.

Have you looked at the network? (perhaps that's where you're bottlenecking)
The network doesn't seem to be the bottleneck. lagg0 peaks at 500-600 Mbit/s, but it can handle up to 2 Gbit/s in theory. Our network infrastructure can handle this easily.
[Attached graph: lagg0 throughput]

If you have large numbers of reasonably large sized files (and not too many small files), you may benefit slightly by having a larger block size.
The backup file structure is 40,000-60,000+ 8 MB data files plus about 10 small config files.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
If the vast majority of your files are large (seems to be what you are saying), then a much larger block size could help (a bit).

I agree with danb35 that a multiple (more than 1) VDEV per pool strategy is what you need to really improve performance.

Make sure you understand and assess your need for redundancy as you would effectively increase the chances of losing everything by reducing to a single pool.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Also remember that with LAGG, 1 Gbit + 1 Gbit = 1 Gbit per client as the maximum, not 2; but overall you should be able to see 2 Gbit/s of throughput on the server end if enough clients are running at full speed.
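
If you want to confirm how the traffic actually spreads across the physical ports, you can watch them individually (the interface name here is just an example):
Code:
# per-interface throughput, one-second samples
systat -ifstat 1
netstat -I igb0 -w 1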
 

tvsjr

Guru
Joined
Aug 29, 2015
Messages
959
You're also reaching the limits of that CPU. The E5-2609 v3 is a 6-core part... load averages north of 6 are reasons for concern.
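
Worth checking per-core usage too, since a load average across 12 threads can hide a few pegged cores:
Code:
# per-CPU usage breakdown in FreeBSD top
top -P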
 

beezone

Cadet
Joined
Feb 7, 2018
Messages
5
Also remember that with LAGG, 1 Gbit + 1 Gbit = 1 Gbit per client as the maximum, not 2; but overall you should be able to see 2 Gbit/s of throughput on the server end if enough clients are running at full speed.
I know; at the moment both physical interfaces are loaded equally.

You're also reaching the limits of that CPU. The E5-2609 v3 is a 6-core part... load averages north of 6 are reasons for concern.
My mistake, it's a dual-CPU config = 12 cores. So about half of the CPU power is still available.
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
An outside thought is that you are hitting the limits of your memory. You have over 200 TB of raw storage with only 96 GB of memory. However, I'm not very experienced in tuning systems at this scale, so my intuitions about how much memory you need could very well be wrong.
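
One way to sanity-check that would be to look at the ARC counters during the backup window; the raw numbers are exposed via sysctl (hit ratio = hits / (hits + misses)):
Code:
sysctl kstat.zfs.misc.arcstats.size     # current ARC size in bytes
sysctl kstat.zfs.misc.arcstats.hits
sysctl kstat.zfs.misc.arcstats.misses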
 