FreeNAS write speed dropping by 50% during performance test


Chris_Zon

Dabbler
Joined
Mar 3, 2014
Messages
21
Hey,

I have a question about testing my FreeNAS setup. The setup is as follows:

Build: FreeNAS-9.2.1.1-RELEASE-x64 (0da7233)
Platform: Intel(R) Xeon(R) CPU E5506 @ 2.13GHz
Memory: 49122MB


I've got 23TB of RAID 10 storage consisting of 10 3TB disks, 22 1TB disks, and 2 200GB SSDs.

When I'm testing this setup using a sequential write with IOmeter, my results are roughly as follows in the first minutes of testing:
I/Os per second: 170-220ish
Total MB per second: 90-100MB (this is good performance for my setup, I think, no?)
I/O response time: 50-70ish

But after about 2 minutes, the performance suddenly drops to:
I/Os per second: 100ish
Total MB per second: 40-50MB
I/O response time: 100-210ish

If I check the FreeNAS server with gstat, everything initially seems to work fine, but when the performance drops I notice a lot of reading going on from all the disks, which is what seems to cut the performance in half.

Does anyone have any idea what causes this and how I can fix it?
I tried messing with the tunables and sysctl settings, but I only achieve small performance increases and decreases. If I need to post more info, please tell me and I'll post it ASAP.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
I'm guessing the SSDs are L2ARCs? What configuration are the SSDs in?
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
I've got 23TB of RAID 10 storage consisting of 10 3TB disks, 22 1TB disks, and 2 200GB SSDs.

As Cyberjock mentioned, what's the drive configuration? I'm having a hard time visualizing how you're getting 23TB usable out of this. Post the results of zpool status on pastebin or in a code block here if you can.

Edit - I see that you've got them set up as L2ARC. If you're using 2x200GB in an L2ARC stripe, you're probably robbing memory from the primary ARC to hold the mapping table. What brand are they?

Also, what protocol are you using to access the server - CIFS or NFS mount, iSCSI raw mapping, VMFS on iSCSI?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
L2ARC can interact in unexpected ways (like: almost not at all) for benchmarks.

The real question here is what the pool layout and host attachment is like, because this strikes me as slowish.
 

Chris_Zon

Dabbler
Joined
Mar 3, 2014
Messages
21
It's my favorite consumer grade SSD! :D

I'm still trying to wrap my head around how that drive configuration works out ... 10x3TB in a mirror I understand well enough, giving 15TB usable, same with 22x1TB giving 11TB ... but that becomes 15+11 = 26TB. So I'm puzzled. Sparing?


I just checked and I think I missed one of the 3TB drives in making the volume, that should account for the difference.

L2ARC can interact in unexpected ways (like: almost not at all) for benchmarks.

The real question here is what the pool layout and host attachment is like, because this strikes me as slowish.


Do you happen to know of a better way to test my setup other than using IOmeter? :)
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
I just checked and I think I missed one of the 3TB drives in making the volume, that should account for the difference.

... Wait, you're striping 9x3TB and 11x1TB together? I'd really like to see a zpool status on this one as I'm still puzzled.

Do you happen to know of a better way to test my setup other than using IOmeter? :)

It's not that IOmeter is bad; it's that testing a system with caching involved is complex, since you have to know exactly what is driving those results based on how much data you're moving around and where it's coming from and going to.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Since you can't run IOmeter on your NAS, it is a poor first choice for testing. We usually try local tests (i.e. with dd) first and then move on a bit at a time. It is much easier to look at smaller, simpler bits than the extremely complex totality of NAS, network, client PC, and testing tool.
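
For example, something along these lines from the FreeNAS shell gives a rough baseline for raw pool write and read speed (the local/testds dataset name and /mnt/local path are just placeholders based on your pool name; make the test file a good bit larger than your 48GB of RAM, and leave compression off on the test dataset so ZFS doesn't simply compress the zeroes away):

# throwaway dataset with compression disabled (dataset name is a placeholder)
zfs create -o compression=off local/testds

# sequential write: ~100GB of zeroes in 1MB blocks
dd if=/dev/zero of=/mnt/local/testds/ddtest bs=1m count=100000

# sequential read of the same file once the write has finished
dd if=/mnt/local/testds/ddtest of=/dev/null bs=1m

# clean up afterwards
zfs destroy local/testds

If the local numbers look healthy and the drop only shows up over the wire, then you start looking at the network, the protocol, and the client instead of the pool.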
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
... Wait, you're striping 9x3TB and 11x1TB together? I'd really like to see a zpool status on this one as I'm still puzzled.

Yes that'd be nice to see, but why is that puzzling? ZFS allows you to add a mix of vdevs, and will distribute load among them.
 

Chris_Zon

Dabbler
Joined
Mar 3, 2014
Messages
21
zpool status
pool: local
state: ONLINE
scan: none requested
config:

NAME STATE READ WRITE CKSUM
local ONLINE 0 0 0
  mirror-0 ONLINE 0 0 0
    gptid/fca92a05-9ee5-11e3-82c8-0025900ecf46 ONLINE 0 0 0
    gptid/00a9dd8d-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
  mirror-1 ONLINE 0 0 0
    gptid/04cf77f9-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
    gptid/08cb87c5-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
  mirror-2 ONLINE 0 0 0
    gptid/0cf0f6cf-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
    gptid/10edd065-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
  mirror-3 ONLINE 0 0 0
    gptid/1519d206-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
    gptid/19350cbe-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
  mirror-4 ONLINE 0 0 0
    gptid/1d50b11b-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
    gptid/21668091-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
  mirror-5 ONLINE 0 0 0
    gptid/2443e5a9-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
    gptid/2642c1d6-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
  mirror-6 ONLINE 0 0 0
    gptid/28505e5e-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
    gptid/2a4df03d-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
  mirror-7 ONLINE 0 0 0
    gptid/2c5c677e-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
    gptid/2e4a34ed-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
  mirror-8 ONLINE 0 0 0
    gptid/30883927-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
    gptid/329869bf-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
  mirror-9 ONLINE 0 0 0
    gptid/34a4774b-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
    gptid/36a390ee-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
  mirror-10 ONLINE 0 0 0
    gptid/38cf701a-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
    gptid/3b3889ed-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
  mirror-11 ONLINE 0 0 0
    gptid/3d4fcb7c-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
    gptid/3f4db258-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
  mirror-12 ONLINE 0 0 0
    gptid/415ee534-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
    gptid/434452c2-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
  mirror-13 ONLINE 0 0 0
    gptid/45832fc0-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
    gptid/47879df8-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
  mirror-14 ONLINE 0 0 0
    gptid/4997cecb-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
    gptid/4b999249-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
  mirror-15 ONLINE 0 0 0
    gptid/4dd77b79-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
    gptid/4fd8c3dd-9ee6-11e3-82c8-0025900ecf46 ONLINE 0 0 0
logs
  da34s1 ONLINE 0 0 0
  da35s1 ONLINE 0 0 0
cache
  da34s2 ONLINE 0 0 0
  da35s2 ONLINE 0 0 0
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
That should be pretty damn fast, I would think. Hm.
 

Chris_Zon

Dabbler
Joined
Mar 3, 2014
Messages
21
Hmm, my previous post ate my text reply. I also said that I'm just testing this setup, so I'm not 100% sure on some details of it; if there's anything you'd like to know about the configuration, let me know.

Are the base IOmeter speeds okay, though, or should those be higher too? Also, I forgot to mention that the IOmeter test was with a 512KB transfer size and 10 outstanding I/Os per target.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Yes that'd be nice to see, but why is that puzzling? ZFS allows you to add a mix of vdevs, and will distribute load among them.

I see it now with the zpool status. I misunderstood when he said he "missed a volume" - it should be 5x (2x 3TB) + 11x (2x 1TB) for a total of 26TB usable instead of 4x (2x 3TB) + 11x (2x 1TB) = 23TB.

Looks like you've also got those S3700s partitioned up and striped as both L2ARC and SLOG. How big are the partitions of each?
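
If you're not sure, something like the following from the FreeNAS shell should show the partition layout (da34 and da35 are the device names taken from your zpool status, so adjust if they differ):

# list the partitions and their sizes on each SSD
gpart show da34 da35

zpool iostat -v local will also show the allocated and free space on each slice.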

This should be a damned fast pool even if you're forcing sync writes.
 

Chris_Zon

Dabbler
Joined
Mar 3, 2014
Messages
21
I see it now with the zpool status. I misunderstood when he said he "missed a volume" - it should be 5x (2x 3TB) + 11x (2x 1TB) for a total of 26TB usable instead of 4x (2x 3TB) + 11x (2x 1TB) = 23TB.

Looks like you've also got those S3700s partitioned up and striped as both L2ARC and SLOG. How big are the partitions of each?

This should be a damned fast pool even if you're forcing sync writes.


I have to admit I'm not sure how those are partitioned; how can I check?
 

Chris_Zon

Dabbler
Joined
Mar 3, 2014
Messages
21
I think this might be what you were asking for!

                                        capacity      operations     bandwidth
                                        alloc  free   read  write   read  write
logs                                        -      -      -      -      -      -
  da34s1                                 136K  3.97G      0      0      0  3.99K
  da35s1                                 136K  3.97G      0      0      0      0
cache                                       -      -      -      -      -      -
  da34s2                                44.0G   138G      0      0      0      0
  da35s2                                44.2G   138G      0      0      0      0
--------------------------------------  -----  -----  -----  -----  -----  -----

I did notice that the speed decrease starts as soon as the cache starts working. When I start the test it's just the log writing everything; then after a while the cache starts reading AND writing, and the speed drops by 50%.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
You are a bad boy. Rule #1 of L2ARCs and ZILs: don't put them both on the same drive. It sounds great in theory, but in practice it's just a horrible idea. It makes the drives suck at doing both!

But as others have already pointed out, benchmarks aren't the best way to find the limitation. The best way is to use the server how it's intended to be used, then check out various parameters to see where the system is being stressed.

I could be wrong, but those log and cache slices don't appear to be mirrored. If they were mirrored, it would say so below "logs" and "cache".

Also, based on some hasty math in my head, you can't even use all 138GB of disk space for your L2ARC because of how little RAM you have. There's a reason why I tell people not to even consider L2ARCs until they hit 64GB of RAM. Also, the manual says not to consider an L2ARC until you've maxed out your RAM (within reason).
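
As a rough back-of-the-envelope, using the ~180 bytes of ARC header per cached L2ARC record that tends to get quoted for this generation of ZFS (treat that exact figure as an assumption):

2x 138GB of L2ARC at 128KB records: ~276GB / 128KB ≈ 2.2 million records x 180 bytes ≈ 0.4GB of RAM
2x 138GB of L2ARC at 8KB records (e.g. an iSCSI zvol): ~276GB / 8KB ≈ 34 million records x 180 bytes ≈ 6GB of RAM

So how hard the L2ARC eats into your ARC depends heavily on record size; small blocks are where it really hurts.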

So I think you've got a few things to work on before we continue further. ;)
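
If you want to back the striped L2ARC out and turn the log into a mirror, from the CLI it would be something along these lines (pool and device names taken from your zpool status; the FreeNAS GUI is the supported way to manage the pool, so treat this as a sketch of what happens underneath and save your config first):

# remove the two striped cache (L2ARC) slices from the pool
zpool remove local da34s2 da35s2

# remove one of the striped log slices...
zpool remove local da35s1

# ...and attach it to the remaining one so the SLOG becomes a mirror
zpool attach local da34s1 da35s1

That gets you a mirrored SLOG and no L2ARC, which also takes care of Rule #1, since the SSDs are no longer doing both jobs at once.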
 

Chris_Zon

Dabbler
Joined
Mar 3, 2014
Messages
21
You are a bad boy. Rule #1 of L2ARCs and ZILs: don't put them both on the same drive. It sounds great in theory, but in practice it's just a horrible idea. It makes the drives suck at doing both!

But as others have already pointed out, benchmarks aren't the best way to find the limitation. The best way is to use the server how it's intended to be used, then check out various parameters to see where the system is being stressed.

I could be wrong, but those log and cache slices don't appear to be mirrored. If they were mirrored, it would say so below "logs" and "cache".

Also, based on some hasty math in my head, you can't even use all 138GB of disk space for your L2ARC because of how little RAM you have. There's a reason why I tell people not to even consider L2ARCs until they hit 64GB of RAM. Also, the manual says not to consider an L2ARC until you've maxed out your RAM (within reason).

So I think you've got a few things to work on before we continue further. ;)


Thanks for the reply! I appreciate the answers; I'll see what I can do to fix this and I'll keep you posted. Also, one more question: do you recommend those drives be mirrored?
 