Always keep an eye on your monitoring

Itamar Croitoru

Dabbler
Joined
Feb 2, 2017
Messages
42
I have recently been going through a hardware refresh cycle that included the two ESXi hosts and the FreeNAS iSCSI server for a VM cluster. The two hosts were upgraded first, which left their old hardware free for reuse. I took the opportunity to use one of those machines as the head for the cluster's FreeNAS iSCSI server.

The old FreeNAS head unit was a Dell PowerEdge R310 with 48 GB of ECC RAM (the max it could handle) and a single-socket, quad-core, low-end Xeon CPU. The NICs used for iSCSI were still capable of more, and the same went for the HBA with external SAS connectors. It was running FreeNAS 9.10.2-U6.

For the new head I reused one of the old ESXi hosts, an HP DL360 G7 with 144 GB of ECC DDR3 at 1333 MHz, two sockets with Xeon E5620 CPUs @ 2.4 GHz, and way more cache. I moved the local SSDs, NIC and HBA over, installed FreeNAS 11.2-U6, rebuilt the iSCSI and... expected better performance.

After moving the VMs onto the new LUN, I found that performance had gotten worse. I started checking in FreeNAS and made sure I had configured iSCSI and its MPIO correctly. I started to think about upgrading from gigabit with jumbo frames on VLANs to 10 Gb on its own switch. After starting that research, I decided to go back and check the switch stack's performance, and it turns out I had never finished adding the switch stack to the monitoring solution (in this case, Zabbix).

I finished adding what I thought would be the useful metrics, and within a few hours I noticed that one of the ports was passing 90% saturation, and it was the one going to the FreeNAS iSCSI port. I checked again and it was running at 100 Mbps, not gigabit. It turns out I had tugged that cable just a bit too much, and one end had come out of its termination far enough to make gigabit not work. So the NIC and switch settled on 100 Mbps full duplex instead of gigabit full duplex.
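For anyone who wants to catch the same thing, here is a rough sketch of the two checks that would have flagged this port: a link that negotiated below the speed it should have, and utilization measured against the negotiated speed. This is not my Zabbix config, just the arithmetic behind it. The `PortSample` class, its field names and the 90% threshold are made up for illustration. The numbers would typically come from the IF-MIB objects `ifHighSpeed` (negotiated speed in Mbps) and `ifHCInOctets` / `ifHCOutOctets` (byte counters), and the sketch assumes the counter did not wrap or reset between the two readings.

```python
from dataclasses import dataclass


@dataclass
class PortSample:
    """Two counter readings for one switch port (hypothetical structure)."""
    name: str
    speed_mbps: int      # negotiated link speed, e.g. from ifHighSpeed
    expected_mbps: int   # speed the port is supposed to negotiate
    octets_start: int    # byte counter at the start of the interval
    octets_end: int      # byte counter at the end of the interval
    interval_s: float    # seconds between the two readings


def utilization_pct(sample: PortSample) -> float:
    """Percent of the negotiated link speed used during the interval."""
    if sample.speed_mbps <= 0 or sample.interval_s <= 0:
        return 0.0  # link down or bad sample: nothing meaningful to report
    bits_moved = (sample.octets_end - sample.octets_start) * 8
    capacity_bits = sample.speed_mbps * 1_000_000 * sample.interval_s
    return 100.0 * bits_moved / capacity_bits


def check_port(sample: PortSample, saturation_pct: float = 90.0) -> list:
    """Return a list of human-readable problems found on the port."""
    problems = []
    if sample.speed_mbps < sample.expected_mbps:
        problems.append(
            f"{sample.name}: negotiated {sample.speed_mbps} Mbps, "
            f"expected {sample.expected_mbps} Mbps"
        )
    used = utilization_pct(sample)
    if used >= saturation_pct:
        problems.append(f"{sample.name}: {used:.1f}% of link speed in use")
    return problems


if __name__ == "__main__":
    # Hypothetical reading: a port that fell back to 100 Mbps and is nearly full.
    bad_port = PortSample("iscsi-1", 100, 1000, 0, 712_500_000, 60.0)
    for problem in check_port(bad_port):
        print(problem)
```

The point of checking the negotiated speed separately is that a port stuck at 100 Mbps can look fine on a traffic graph scaled for gigabit. The saturation only stands out when it is measured against the speed the link actually negotiated.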

A quick cable swap and my iSCSI was able to pass the 100 Mbps limit again. Traffic now hovers around 50 Mbps during the day and 150 Mbps during backups, a far cry from the network's gigabit limit, and everything is performing much better.

LPT: Check your monitoring, and finish setting it up if you haven't.
 