My data is evil, and will be punished. Request for tuning tips.

Status
Not open for further replies.

dagrichards

Dabbler
Joined
Jun 24, 2015
Messages
12
I am still putting together a NAS box of floor scrapings and eBay parts. The intent is to use it as an ESXi datastore for throwaway VMs in a training environment.

Specs:

SuperMicro X8DTI-F with a single 5620
24 GB of ECC
3Ware 16-port RAID controller ( JBODing the disks; won't ask for help when it eats my data )
8x 2TB HGST SATA III 7200 RPM disks
8x 250GB assorted mutt SATA III 7200 RPM disks
1 Mellanox 10G NIC
1 consumer 120GB SSD on a PCIe card ( log device )

With that hardware the next question is usually something like "I made a raidz1 and it's slow .... why?"

I finally got my crap together and did this.
It gives me nearly 8 TB of storage; I will keep usage under 4 TB.

  pool: tank

    tank                                            ONLINE 0 0 0
      mirror-0                                      ONLINE 0 0 0
        gptid/496d9b23-4d41-11e6-8363-0025904a45ae  ONLINE 0 0 0
        gptid/4b42b99a-4d41-11e6-8363-0025904a45ae  ONLINE 0 0 0
      mirror-2                                      ONLINE 0 0 0
        gptid/dca11c7a-4d41-11e6-8363-0025904a45ae  ONLINE 0 0 0
        gptid/dd720e26-4d41-11e6-8363-0025904a45ae  ONLINE 0 0 0

      .... etc for 8 mirror vdevs

    logs
      gptid/4b890fb1-4d41-11e6-8363-0025904a45ae    ONLINE 0 0 0


I have started to do some bench testing, running Iometer on 2 guests so far, on just one XenServer,
set up like this: http://community.atlantiscomputing....e-Iometer-to-Simulate-a-Desktop-Workload.aspx.

Starting tomorrow I will be able to connect 5 ESXi servers and run a pair of guests off each.

I'm doing ill-advised things like turning off checksums ( won't ask for help when it corrupts my data )
and seeing how that changes the Iometer score.
I have atime off.
I have vfs.zfs.prefetch_disable=1 set.

I will compare NFS sync with the SSD ZIL vs. async and see if the difference is worth the additional risk.
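
For anyone following along, these are roughly the commands involved ( pool is just "tank" here, adjust names to taste, and obviously none of this belongs on a box holding data you care about ):

zfs set checksum=off tank       # benchmark-only experiment, never for real data
zfs set atime=off tank
zfs set sync=disabled tank      # the async test case
zfs set sync=standard tank      # back to honoring NFS sync writes via the SSD ZIL

The prefetch tunable went in through System -> Tunables as vfs.zfs.prefetch_disable=1 ( as far as I know it can also be poked at runtime with sysctl vfs.zfs.prefetch_disable=1 ).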

So far average IO response is 28 ms with checksums off, and 62 ms with checksums on, using sync NFS and the SSD: 580 IOPS vs. 275 IOPS respectively.

More numbers as they are gathered


The intended use case is to attach 20 ESXi hosts, each hosting 10 MS 2008 R2 guests.
Students won't do anything more than install the servers and vMotion them back and forth a couple of times.
The ESXi hosts will be booting off internal storage.


Right now I am getting about 93% hits on my ARC; how do I know if I need more RAM?
It seems like as long as that number stays above, say, 90% I have enough....
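
In case it helps anyone answer: I've just been eyeballing the raw counters from the shell, something like

sysctl kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses kstat.zfs.misc.arcstats.size

and comparing that with the Reporting graphs. ( Those sysctl names are what I see on this 9.x box; I'm assuming hits / ( hits + misses ) is the same ratio the GUI plots. )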
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
won't ask for help when it eats my data

I like that you've got that disclaimer there, since this is clearly a playground machine.

Now with that said:

28ms/62ms latency, 580/275 IOPS

Assuming you're using the config from the Atlantis page (80% write, 80% random, 4K QD16), those feel like pretty poor results. Check whether your RAID controller is disabling the disk cache or something, and consider dropping ~USD$50 on a proper HBA.
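
Quick way to sanity-check the on-disk write cache from the FreeNAS shell, assuming the 3Ware presents the drives as plain ATA devices ( device name is an example; if your disks show up as daX behind the controller this won't apply ):

camcontrol identify ada0 | grep -i "write cache"

If the "enabled" column comes back as "no", the controller is probably the one turning it off.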
 

dagrichards

Dabbler
Joined
Jun 24, 2015
Messages
12
I have already identified myself as a putz, so I will ask... a $50 HBA? The only things I have seen in that range have been SATA II LSIs. I thought we were being pointed to SAS IT-mode controllers, and the 16-port jobs there are in the $200 range. The controller claims to be enabling the disk cache, and warns me against it since the battery is absent.

And yes, the data intended for this is easily rebuildable and has no intrinsic value: VMs for a 5-week VMware class.
 

JustinClift

Patron
Joined
Apr 24, 2016
Messages
287
Out of curiosity, what protocol are you using for sharing the data to the ESXi hosts? Not seeing that in the post. ;)
 

dagrichards

Dabbler
Joined
Jun 24, 2015
Messages
12
NFS. It's easier for me to configure than iSCSI, and since I plan on running it async it should overcome most of the performance advantage that iSCSI has for ESXi.
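
( For the record, the "async" part on the FreeNAS side just means setting the sync property on the dataset behind the NFS export, roughly:

zfs set sync=disabled tank/vmstore
zfs get sync tank/vmstore

where tank/vmstore is a made-up name for whatever dataset I end up exporting. )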
 

JustinClift

Patron
Joined
Apr 24, 2016
Messages
287
No worries. Be aware of this, just in case:

https://bugs.freenas.org/issues/7659#note-40

Note #40 there specifically mentions a slowdown with NFS and Mellanox cards. :(

There's an outstanding FreeBSD patch due to be added to FreeNAS 9.10-STABLE fairly soon, which I'm hoping fixes the problem. It may not though... :(
 
Last edited:

dagrichards

Dabbler
Joined
Jun 24, 2015
Messages
12
Benchmarks are finally being run ( here are iSCSI results ).

The FreeNAS box now has:
SuperMicro X8DTI-F with a single 5620
24 GB of ECC
LSI HBA flashed with 2118it P20 firmware
8x 2TB HGST SATA III 7200 RPM disks
8x 250GB assorted mutt SATA III 7200 RPM disks
1 Mellanox 10G NIC
1 consumer 120GB SSD on a PCIe card ( log device )
Disks laid out as 8x 2-disk mirrored vdevs, giving 7.8 TiB available space, 1 TiB in use.


iSCSI without SSD ZIL
12 VMs on 3 DL380s ( 2x L5520 at 2.27 GHz, 72 GB RAM ), booted off mirrored 10k SAS disks
Connected over 10G Mellanox cards
iSCSI no changes
no ZIL

In the FreeNAS reporting:
ARC hit 96.9% ( I interpret this as saying I have enough RAM for now? )
Disk writes about 25 MBps
Network util averaging 350 MBps
running tests as seen here:
http://community.atlantiscomputing....e-Iometer-to-Simulate-a-Desktop-Workload.aspx
Pseudo Random

VMs are fully patched Win7, 2 GB RAM, swap stored with the VM files

Machine | Total IOPS | Total MB/s | Average IO response time (ms)

bench-0 | 1061 | 4.35 | 15.06
bench-1 | 1193 | 5.32 | 12.31
bench-2 | 1105 | 4.53 | 14.47
bench-3 | 1191 | 4.88 | 13.43
bench-4 | 710 | 2.91 | 18.17
bench-5 | 1407 | 5.77 | 11.36
bench-6 | 471 | 1.93 | 28.27
bench-7 | 975 | 4.00 | 16.40
bench-8 | 1708 | 7.00 | 9.36
bench-9 | 1281 | 5.25 | 9.36
bench-10 | 1300 | 5.33 | 12.30
bench-11 | 1275 | 6.89 | 9.89
 
Last edited:

JustinClift

Patron
Joined
Apr 24, 2016
Messages
287
Hmmm, that seems kind of weird (to me): the network util says it was averaging 350 MB/s, but adding up the individual MB/s for the VMs gives more like 60 MB/s.

Guess I'm forgetting something then? :D
 

dagrichards

Dabbler
Joined
Jun 24, 2015
Messages
12
I have been running the tests for 2 hours each.

The next run added an SSD log device ( 120G SanDisk on a PCIe card ).
Log device added to the existing pool by finding its gptid with
glabel status | grep ada0
and then:
zpool add tank log gptid/4b890fb1-4d41-11e6-8363-0025904a45ae

I don't see as much IO to the log device as I would expect:
it shows 800 bytes per second of writes to the SSD,
while the HDDs are showing 30 MB per second.
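
I've been watching where the writes actually land with

zpool iostat -v tank 5

which breaks the numbers out per vdev, including the log device, every 5 seconds.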

ARC hit ratio of 99.3%


Four of the machines below have really, really bad numbers,
for which I have no explanation.... I will run this same test again.
But first I am going to need to find a way to automate the simultaneous
initiation of these tests.

Machine | Total IOPS | Total MB/s | Average IO response time (ms)

bench-0 | 30 | 0.13 | 18.57
bench-1 | 27 | 0.11 | 20.78

bench-2 | 1804 | 7.39 | 8.87
bench-3 | 1813 | 7.43 | 8.82
bench-4 | 1030 | 4.22 | 15.53
bench-5 | 1044 | 4.28 | 15.32
bench-6 | 1214 | 4.97 | 13.17
bench-7 | 1615 | 6.61 | 9.91
bench-8 | 1889 | 7.74 | 8.47
bench-9 | 1914 | 7.84 | 8.36

bench-10 | 27 | 0.11 | 21.12
bench-11 | 31 | 0.13 | 18.29
 

JustinClift

Patron
Joined
Apr 24, 2016
Messages
287
Yeah, it does seem kind of strange. With the ESXi boxes... are you able to test the load against local storage (on the hosts themselves)? i.e. take FreeNAS out of the equation and see if the problem still persists.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
8x 2TB HGST SATA III 7200 RPM disks
8x 250GB assorted mutt SATA III 7200 RPM disks
Disks laid out as 8x 2-disk mirrored vdevs, giving 7.8 TiB available space, 1 TiB in use.

So you're mirroring the 2TBs to each other, the 250GBs to each other, and then smashing them all together into a pool? This could give you the odd/inconsistent results you're seeing, as ZFS is trying to balance across (very) different-sized vdevs; 2TB/250GB is an 8:1 size relationship here.

I would say destroy your pool and rebuild with only the 2TB drives. Make those 250GBs into a scratch/temp pool.
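
Rough sketch of that layout from the CLI, with placeholder device names; in practice you'd do it through the FreeNAS volume manager so the disks get partitioned and gptid-labelled the usual way:

zpool destroy tank
zpool create tank \
  mirror da0 da1 \
  mirror da2 da3 \
  mirror da4 da5 \
  mirror da6 da7

That gives you four 2-disk mirrors of the 2TB drives, and the 250GBs can become a separate scratch pool the same way.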

I don't see as much IO to the log device as I would expect:
it shows 800 bytes per second of writes to the SSD,
while the HDDs are showing 30 MB per second.

Judging by this post, you're using iSCSI, which writes data asynchronously by default. You'll have to go to the command line and do a "zfs set sync=always pool/zvol" in order to force every write to hit the SSD.
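
You can confirm it took, and flip it back when you're done comparing, with something like ( zvol name here is a placeholder ):

zfs get sync tank/iscsi-zvol
zfs set sync=standard tank/iscsi-zvol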
 