What to do next?

BSTAMPER

Dabbler
Joined
Sep 8, 2017
Messages
32
So I've got an HP DL380p Gen8 with two E5-2640 processors and, I believe, 28GB of RAM. I'm using both the HP P420i (in HBA mode) and an LSI SAS9200-8i that I added. The P420i has eight 10K 450GB drives and the LSI has four 500GB WD Red SSDs. I also added an Intel dual-port 10G Ethernet adapter.

The only purpose in life this box has is to provide VMFS storage for my lab as a network engineer. I have Windows domain controllers, Cisco ISE, 9800 WLCs, DNAC, FTD/FMC, and an assortment of other VMs. These VMs reside on a Cisco UCS 5108 blade chassis with four B200 M4 blades and one B420 M3 blade. The chassis has the 6324 FIs.

All of this is connected together: each FI has a 40G EtherChannel (4x10G) across a pair of Nexus 5596s with vPC.

The DL380p is connected with both of its SFP+ ports across the 5596s as well. Inside the chassis I'm using the VMware software iSCSI adapter, and on the TrueNAS side I've got one IP tied to each of the SFP+ ports as a target, with VMware set up for MPIO.
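(If it helps, the path-policy side of that can be checked from the ESXi shell; this is just a sketch, the naa ID is a placeholder, and round robin is only one common choice for iSCSI MPIO.)
Code:
# list the iSCSI devices and their current path selection policy
esxcli storage nmp device list

# example: set round robin on one device (placeholder naa ID)
esxcli storage nmp device set --device naa.6589cfc00000000000000000000000 --psp VMW_PSP_RR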

The drives on each controller are just one big volume each and show up as two datastores to the ESX hosts.

This has served me well for 2-3 years now. I know using the P420i isn't great with TrueNAS, but it's worked all along. Other than the I/O bottlenecks I see when deploying an OVA or doing some other high-I/O operation, things have worked pretty well. Get too many VMs running and things get a little sluggish.

All of this, other than a few upgrades I've done, was free to me; I added the LSI card and the 10G NIC. Yesterday something happened, and based on another post of mine I think the issue was with the SD card I had installed TrueNAS on. I'm going to attempt to reinstall and restore my backup to a SATA drive (an Intel of some sort, I think) connected directly to the DL380's onboard SATA port. I didn't have the cable for it, so that's on the way, but it got me thinking: I've not put anything into this, and maybe there's a way, with a little $ thrown at it, to make it even better... or I come to terms with the fact that it's as good as it's going to get and move on to look for something else.

So with the above in mind, and taking into account the VMFS/iSCSI purpose, can you think of any upgrades I could do to enhance this hardware and improve performance? The data isn't even an issue; it's a lab, it can go if needed. I have Veeam backups of the critical stuff like the domain controllers and can simply rebuild whatever I need. Some thoughts I had:
Abandon the P420i and replace it with something better suited to TrueNAS? Another LSI SAS9200? I'm not sure what gives the best performance here.
SLOG? I don't know much about this but would like to hear your thoughts.
L2ARC? I think I'd have to get a PCIe NVMe card or something to entertain this. Does that make any sense?
RAM upgrade? 64/128GB?
Rebuild the drive configuration and use a bunch of mirrors instead?
Abandon it and just look for new/different hardware that would work better out of the gate, reusing the drives I have?

Just curious about the community's thoughts, with I/O performance in mind for the VMFS/iSCSI use case: what would you do?
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
You should read this and come back with further questions if you have any:
 

BSTAMPER

Dabbler
Joined
Sep 8, 2017
Messages
32
You should read this and come back with further questions if you have any:
I had actually found that, and it's what led to my post here and my thoughts about upgrading RAM. The one thing I was going to go back and edit in was a discussion of the drive setup and possibly moving to mirrors. I was more looking to the community for specific thoughts on what people would do here given the details. Appreciate your response!!
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Start with RAM. (go to the max. possible)

Switch to Mirrors. (if at all possible... consider dumping the hardware if not)

Consider SLOG. (but do it properly with Optane if you do)

Forget L2ARC unless you see ARC misses too high with the added RAM.
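If you want to sanity-check that last point once the RAM is in, the ARC hit/miss counters will tell you; roughly, on CORE/FreeBSD:
Code:
# overall ARC report, including the hit ratio
arc_summary

# raw hit/miss counters straight from the kernel
sysctl kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses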
 

BSTAMPER

Dabbler
Joined
Sep 8, 2017
Messages
32
Start with RAM. (go to the max. possible)

Switch to Mirrors. (if at all possible... consider dumping the hardware if not)

Consider SLOG. (but do it properly with Optane if you do)

Forget L2ARC unless you see ARC misses too high with the added RAM.
Thanks!! I just ordered an additional 64GB. For $80 it wasn't too bad. I can always get more later as well, but this will at least get me to that 64GB minimum.

Regarding switching to mirrors: is this basically saying that I'd take the existing drives I have and make them mirrors? So in the end I'd have 6 mirror pairs: 2x 500GB from the four 500GB SSDs and 4x 450GB from the 10K SAS drives? In that case, will I end up with 6 smaller datastores in VMware? I just want to make sure I have my head screwed on straight.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Without buying more disks, that's what would happen... you may also want to consider one or two hot spares depending on the criticality of your data, so maybe only 5 pairs.

You could also put all of the mirrors in a single pool, since you're then able to group your IOPS (maybe still problematic when all VMs are pushing hard, but it should allow for peaks without too much impact between VMs in normal operation).
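You'd normally build that from the TrueNAS UI, but as a rough sketch of the layout (device names and counts below are placeholders), a single pool of striped mirror pairs plus a hot spare looks like this:
Code:
# one pool, three mirror vdevs striped together, plus a hot spare
# (illustrative only; use the UI on TrueNAS and your own device names)
zpool create Lab-SAS \
  mirror da0 da1 \
  mirror da2 da3 \
  mirror da4 da5 \
  spare da6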
 

BSTAMPER

Dabbler
Joined
Sep 8, 2017
Messages
32
That's great! I really appreciate your feedback/input! I may take a combination approach here and buy some larger drives as well. I don't have a lot of storage now and don't really need a lot. I could upgrade to some 1TB or larger drives and maybe do away with the drives that are on the P420i.
 

BSTAMPER

Dabbler
Joined
Sep 8, 2017
Messages
32
So I was able to get this back online. I went ahead and broke up the SSDs that hang off the LSI controller, created two new pools as mirrors, added them to the ESX hosts, and moved a machine to them from the SAS drives on the HP controller. I thought it felt slow, but somewhat blamed it on the SAS drives. I then Storage vMotioned the VM between the two datastores that are both SSD mirrors, and it appears very slow to me: it was a 90GB thick-provisioned VM and took 20 minutes to vMotion. I don't have anything to compare it to, but it just seems very slow. Moving it back from the SSDs to the SAS drives appears to be just as slow. I somewhat assumed that with SSDs this would be pretty quick.
 

BSTAMPER

Dabbler
Joined
Sep 8, 2017
Messages
32
So I found I had RAM in some other servers that I was able to move over as well. I'm now testing again, and the Storage vMotions and IOMeter tests don't really show any change. Here is what my dashboard shows. I do notice the ZFS cache climbing quite a bit while the Storage vMotion is going on; I think it hit about 70GB, then the vMotion finished and it started to drop.

[TrueNAS dashboard screenshot]
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
For sync writes (which is what you're doing with VMware), there's not a lot to be done to address the speed penalty other than:

Adding a SLOG (with proper PLP and performance/endurance... Optane or similar); this should be faster than what you have now, but it's still tied to how slow your HDDs are.

Taking the "risk" of data loss due to power/connection loss and setting the zvol to sync=disabled (forcing async writes, where memory can give you the "expected" boost). Maximum speed this way... but with risk.
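As a rough sketch of what those two options look like at the CLI (pool, zvol and SLOG device names are placeholders; on TrueNAS you'd normally do both through the UI):
Code:
# option 1: add a dedicated SLOG device (ideally Optane / an SSD with PLP)
zpool add YourPool log nvd0

# option 2: stop honoring sync requests on the zvol backing the iSCSI extent
# (fast, but in-flight writes can be lost on a power/connection failure)
zfs set sync=disabled YourPool/your-zvol

# revert to the default behavior later
zfs set sync=standard YourPool/your-zvol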
 

BSTAMPER

Dabbler
Joined
Sep 8, 2017
Messages
32
Well, I'm not sure. I moved the SSD pool back to RAIDZ1 across the 4 drives and did the Storage vMotion: 4 minutes, down from 20-30, even though this is supposed to be "slower". IOMeter tests show the write speeds going from 2-3 on the mirrored setup to 17-18 on the RAIDZ1. I guess I'll leave it this way. I have a couple of other drives coming to test with: non-SSD 7200 RPM 8TB WD Red Plus. We'll see how they do in a mirror.
 

BSTAMPER

Dabbler
Joined
Sep 8, 2017
Messages
32
Random quick question. Do you think I'd get any better performance moving from the 9200 to a 9300 HBA?
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Do you think I'd get any better performance moving from the 9200 to a 9300 HBA?
Not with HDDs, but SSDs will go faster with the later series HBA chips.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
It would have been very interesting to get a look at zpool iostat -v from the mirrored setup, and also at iostat a few times during the vMotion.
 

BSTAMPER

Dabbler
Joined
Sep 8, 2017
Messages
32
Since I've got it like this now, here are the pools as they sit. There is only one VM running on the SSD pool; it's a Windows 10 box. I have a number of VMs running on the SAS pool. Once I do this test, I'll blow it away, put it back to mirrors, and post that as well.
Code:
root@truenas[~]# zpool iostat -v
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
SAS-Pool-01  1003G  2.27T     14    178  86.4K  2.72M
  raidz1-0  1003G  2.27T     14    178  86.4K  2.72M
    gptid/1df4be27-be1e-11eb-a645-2c768a55f8b4      -      -      2     23  11.7K   356K
    gptid/1e408944-be1e-11eb-a645-2c768a55f8b4      -      -      1     21  9.92K   341K
    gptid/1e4b5002-be1e-11eb-a645-2c768a55f8b4      -      -      2     23  11.7K   356K
    gptid/1ec19b78-be1e-11eb-a645-2c768a55f8b4      -      -      1     21  9.91K   340K
    gptid/1eba0d32-be1e-11eb-a645-2c768a55f8b4      -      -      2     22  11.8K   356K
    gptid/1ed4de77-be1e-11eb-a645-2c768a55f8b4      -      -      1     21  9.86K   341K
    gptid/1ece2df5-be1e-11eb-a645-2c768a55f8b4      -      -      2     22  11.7K   356K
    gptid/1f27e644-be1e-11eb-a645-2c768a55f8b4      - 


SSD-Pool-01  41.2G  1.76T      0     43      4   509K
  raidz1-0  41.2G  1.76T      0     43      4   509K
    gptid/ec47c66f-febe-11ec-8d6c-a0369f4a8344      -      -      0     111   130K
    gptid/ec1a40cd-febe-11ec-8d6c-a0369f4a8344      -      -      0     101   125K
    gptid/ec632463-febe-11ec-8d6c-a0369f4a8344      -      -      0     101   130K
    gptid/ec59e54a-febe-11ec-8d6c-a0369f4a8344      -      -      0     101   124K
    


However, since I have it this way, here is what I see when I vMotion from SSD to SAS in the current setup: iostat on the 4 SSDs. In this scenario it should be high reads on the SSDs and high writes on the SAS. The migration took just shy of 8 minutes, which I guess I don't feel is too bad.
Code:
root@truenas[~]# iostat da8 da9 da10 da11
       tty             da8              da9             da10             da11             cpu
 tin  tout KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  us ni sy in id
   0     1 31.1   18   0.6  31.6   18   0.6  11.3   12   0.1  10.7   12   0.1   0  0  0  0 100
root@truenas[~]# iostat da8 da9 da10 da11
       tty             da8              da9             da10             da11             cpu
 tin  tout KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  us ni sy in id
   0     1 31.1   18   0.6  31.6   18   0.6  11.3   12   0.1  10.7   12   0.1   0  0  0  0 100
root@truenas[~]# iostat da8 da9 da10 da11
       tty             da8              da9             da10             da11             cpu
 tin  tout KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  us ni sy in id
   0     1 31.1   18   0.6  31.6   18   0.6  11.3   12   0.1  10.7   12   0.1   0  0  0  0 100
root@truenas[~]# iostat da8 da9 da10 da11
       tty             da8              da9             da10             da11             cpu
 tin  tout KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  us ni sy in id
   0     1 31.1   18   0.6  31.6   18   0.6  11.3   12   0.1  10.7   12   0.1   0  0  0  0 100
root@truenas[~]# iostat da8 da9 da10 da11
       tty             da8              da9             da10             da11             cpu
 tin  tout KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  us ni sy in id
   0     1 31.1   18   0.6  31.6   18   0.6  11.3   12   0.1  10.7   12   0.1   0  0  0  0 100
root@truenas[~]# iostat da8 da9 da10 da11
       tty             da8              da9             da10             da11             cpu
 tin  tout KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  us ni sy in id
   0     1 31.1   18   0.6  31.6   18   0.6  11.3   12   0.1  10.7   12   0.1   0  0  0  0 100
root@truenas[~]# iostat da8 da9 da10 da11
       tty             da8              da9             da10             da11             cpu
 tin  tout KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  us ni sy in id
   0     1 31.1   18   0.6  31.6   18   0.6  11.3   12   0.1  10.7   12   0.1   0  0  0  0 100


Going from SAS to SSD took 3 minutes 48 seconds. I'm not sure what has changed here, but this used to be really slow before I blew away the pools, etc. Something has definitely changed, yet the output of iostat didn't show much?
Code:
root@truenas[~]# iostat da8 da9 da10 da11
       tty             da8              da9             da10             da11             cpu
 tin  tout KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  us ni sy in id
   0     1 31.1   19   0.6  31.6   18   0.6  11.4   12   0.1  10.7   12   0.1   0  0  0  0 100
root@truenas[~]# iostat da8 da9 da10 da11
       tty             da8              da9             da10             da11             cpu
 tin  tout KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  us ni sy in id
   0     1 31.1   19   0.6  31.6   18   0.6  11.6   12   0.1  10.8   12   0.1   0  0  0  0 100
root@truenas[~]# iostat da8 da9 da10 da11
       tty             da8              da9             da10             da11             cpu
 tin  tout KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  us ni sy in id
   0     1 31.1   19   0.6  31.5   18   0.6  11.6   12   0.1  10.8   12   0.1   0  0  0  0 100
root@truenas[~]# iostat da8 da9 da10 da11
       tty             da8              da9             da10             da11             cpu
 tin  tout KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  us ni sy in id
   0     1 31.1   19   0.6  31.4   18   0.6  11.8   12   0.1  10.9   12   0.1   0  0  0  0 100
root@truenas[~]# iostat da8 da9 da10 da11
       tty             da8              da9             da10             da11             cpu
 tin  tout KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  us ni sy in id
   0     1 30.9   19   0.6  31.3   19   0.6  12.0   12   0.1  11.0   13   0.1   0  0  0  0 100
root@truenas[~]# iostat da8 da9 da10 da11
       tty             da8              da9             da10             da11             cpu
 tin  tout KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  us ni sy in id
   0     1 30.8   19   0.6  31.1   19   0.6  12.3   12   0.1  11.2   13   0.1   0  0  0  0 100
root@truenas[~]# iostat da8 da9 da10 da11
       tty             da8              da9             da10             da11             cpu
 tin  tout KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  us ni sy in id
   0     1 30.6   19   0.6  30.9   19   0.6  12.5   12   0.2  11.3   13   0.1   0  0  0  0 100
root@truenas[~]# iostat da8 da9 da10 da11
       tty             da8              da9             da10             da11             cpu
 tin  tout KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  us ni sy in id
   0     1 30.3   20   0.6  30.4   20   0.6  13.0   13   0.2  11.7   14   0.2   0  0  0  0 100
 

mav@

iXsystems
iXsystems
Joined
Sep 29, 2011
Messages
1,428
An 8-wide RAIDZ1 HDD pool should have pretty good throughput but terrible IOPS. When you evacuated and recreated it, you also defragmented it, which predictably improved the performance of any subsequent sequential operations. After some random rewrites, depending on workload, it will likely fragment again. That is why we recommend mirrors for VM-storage HDD pools: they may have lower initial throughput due to higher redundancy, but they degrade less over time thanks to much better IOPS.
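To put a number on that, the FRAG column of zpool list shows free-space fragmentation, e.g.:
Code:
# FRAG creeps up as a RAIDZ pool of VM zvols takes random rewrites
zpool list -o name,size,alloc,free,frag,cap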

SSDs have much better baseline IOPS and also a higher price per GB, so for them a 3-5-wide RAIDZ1 makes more sense.

PS: Also note that `iostat` with just a list of disks shows average statistics since boot, which is not very useful. You should use `iostat -w 5 ...`, for example, to see current values. I personally prefer `gstat -I 1s -pa`, though it only shows the current speed, not a log.
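For example, against the disks from your earlier output:
Code:
# current values refreshed every 5 seconds, not the since-boot average
iostat -w 5 da8 da9 da10 da11

# live per-disk view, 1-second interval, physical providers only, idle ones hidden
gstat -I 1s -pa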
 

BSTAMPER

Dabbler
Joined
Sep 8, 2017
Messages
32
Thank you for the additional detail on the mirror vs. RAIDZ1 question. I've made a number of changes since my last post: I upgraded to the 9300-8i for the SSDs, and I got a couple of 8TB drives in a mirror just to store some test/dev machines that I won't have powered on all the time. I added more RAM, so I'm sitting at 192GB. My OS is currently on a 500GB non-SSD drive hanging off the same 9300-8i. Swapping the HBA went perfectly: I moved everything to the SAS drives on the HP controller, blew away the pools (might not have had to do that, but oh well), swapped the HBA out, and it booted right back up ready to go. I re-created the pools and away we went. I then started the vMotions to move all the VMs to their respective storage. It blew up the ZFS cache, 164GB used lol. I'm sure it'll settle down once everything falls into place, but I have one still finishing. We'll see how well this does once things calm down.

One question: do I gain anything by having the OS on an SSD rather than the spindle it's on? I assume not, but wasn't sure if anything depends on that.
 

mav@

iXsystems
iXsystems
Joined
Sep 29, 2011
Messages
1,428
I don't like having boot and data disks on the same HBA, since some BIOSes go crazy when they see too many potential boot devices. For that reason I prefer to disable OPROM for HBAs handling data disks. For TrueNAS itself it is not important whether the boot device is HDD or SSD.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Do I gain anything by having the OS on an SSD rather than the spindle it's on?
Maybe longer life, slightly lower power consumption and quieter operation when "idle". Nothing that would be of any real importance to you, I guess.
 