NFS Slow In VMs


brando56894

Wizard
Joined
Feb 15, 2014
Messages
1,537
In an attempt to learn more and get a better IT job, I've been trying to get the most out of my NAS. I've added a SLOG (128 GB Intel S3700) to my storage pool and spun up a few Ubuntu VMs with VirtualBox. Originally there was only going to be one VM for all of my Usenet apps and web server duties, but I used Ubuntu 15.10 and Puppet Labs only supports up to 14.04, so I created another VM just for Puppet Enterprise; then I realized I needed a "throw-away" VM for testing. The first VM has an ext4 drive attached via iSCSI that comes from my main storage pool; the other two are VMDKs stored in my main storage pool.

Everything was fine for a day or two, but then I noticed that my NFS speeds were slowing to a crawl, like 1-15 MB/sec; everything else averages about 80 MB/sec externally, or up to about 110 MB/sec when using NFS between VMs or jails. If I try to copy large chunks of data (1-50 GB) it starts out quick but quickly slows to a crawl. During this time nothing is really taxed in the VM: CPU utilization is in the 25-50% range per core and RAM usage is about 75%. The stats of the NAS itself look about the same, even though it's running 3 VMs and Plex while managing my pool. Just for the hell of it, I transferred a few gigs from the NAS over NFS (from FreeNAS itself, not via a VM) to the Arch Linux install on my PC and it ran at 45-90 MB/sec.

Rebooting the VMs didn't help at all, and restarting the phpVirtualBox jail didn't help either. Neither did restarting NFS. I removed the rsize and wsize mount options that I had, which brought it to a stable 12 MB/sec. Enabling the sync mount option didn't seem to help any.
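For reference, the client-side mount I'm describing looks roughly like this; the hostname, export path, and mount point below are illustrative placeholders rather than my exact config, and the 8K rsize/wsize shown are what I had set before removing them:
Code:
# /etc/fstab on the Ubuntu VM: NFS share from the FreeNAS box
# rsize/wsize are the options I removed; "sync" is the synchronous-write option I tried
freenas:/mnt/Multimedia/storage  /mnt/storage  nfs  rw,hard,rsize=8192,wsize=8192,sync  0  0

# changed options only take effect after a full unmount/remount
sudo umount /mnt/storage && sudo mount /mnt/storage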



FreeNAS/Network Layout



System Specs
NAS

ASRock C2750D4I @ 2.4GHz ** 32GB (4x 8GB dual channel) Crucial ECC 1600MHz ** SilverStone DS380 case ** SilverStone 600W SFX Gold PSU

Pools
Multimedia: 2x 1TB WD Red, 4x 4TB HGST, 2x 3TB WD Green, attached to an LSI M1015 (IT mode) HBA
FreeNAS Boot: 4GB InnoDisk SATA DOM


Test Results
Creating a file on Ubuntu Server (iSCSI)
Code:
 [root@ubuntu /home/bran]$ dd if=/dev/zero of=/test.img bs=2M count=10k
10240+0 records in
10240+0 records out
21474836480 bytes (21 GB) copied, 310.126 s, 69.2 MB/s


Transferring that file from the VM to an NFS share
Code:
 [root@ubuntu /home/bran]$ rsync -a --progress /test.img /mnt/storage/
sending incremental file list
test.img
  3,455,942,656  16%   15.95MB/s    0:18:23  ^C


Transferring from Arch to the Storage dataset over NFS was like a sine wave, with peaks as high as 250 MB/sec and valleys as low as 500 KB/sec.
Code:
[bran@ra ~]$ sudo rsync -a --progress /test.img storage/
sending incremental file list
test.img
21,474,836,480 100%  102.00MB/s    0:03:20 (xfr#1, to-chk=0/1)


Using dd to write to the NFS share from the Ubuntu VM
Code:
[bran@ubuntu ~]$ sudo dd if=/dev/zero of=/mnt/storage/test2.img bs=2M count=10k
[sudo] password for bran:
10240+0 records in
10240+0 records out
21474836480 bytes (21 GB) copied, 743.862 s, 28.9 MB/s


Using dd to write to the NFS share from Arch
Code:
[bran@ra ~]$ sudo dd if=/dev/zero of=storage/test3.img bs=2M count=10k
10240+0 records in
10240+0 records out
21474836480 bytes (21 GB) copied, 269.684 s, 79.6 MB/s


Zpool Status
Code:
[root@freenas] ~# zpool status
  pool: Jails
 state: ONLINE
  scan: scrub repaired 0 in 0h2m with 0 errors on Sun Jan 10 00:02:40 2016
config:

        NAME                                          STATE     READ WRITE CKSUM
        Jails                                         ONLINE       0     0     0
          gptid/cc405949-97df-11e5-9d1c-d050995af954  ONLINE       0     0     0

errors: No known data errors

  pool: Multimedia
 state: ONLINE
  scan: scrub repaired 0 in 10h37m with 0 errors on Wed Jan 13 13:40:28 2016
config:

        NAME                                            STATE     READ WRITE CKSUM
        Multimedia                                      ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/4fd4b885-4c4f-11e5-8d85-d050995af954  ONLINE       0     0     0
            gptid/509d70f1-4c4f-11e5-8d85-d050995af954  ONLINE       0     0     0
          mirror-1                                      ONLINE       0     0     0
            gptid/90e165c0-4dd5-11e5-8d85-d050995af954  ONLINE       0     0     0
            gptid/920769a5-4dd5-11e5-8d85-d050995af954  ONLINE       0     0     0
          mirror-2                                      ONLINE       0     0     0
            gptid/201649ba-b4f3-11e5-b564-d050995af954  ONLINE       0     0     0
            gptid/2f316e6d-b55b-11e5-b564-d050995af954  ONLINE       0     0     0
          mirror-3                                      ONLINE       0     0     0
            gptid/0a4a27a4-4e7d-11e5-8d85-d050995af954  ONLINE       0     0     0
            gptid/0b3cc4f1-4e7d-11e5-8d85-d050995af954  ONLINE       0     0     0
        logs
          gptid/3a497d13-b4f6-11e5-b564-d050995af954    ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
  scan: resilvered 619M in 0h33m with 0 errors on Thu Jan 28 17:33:39 2016
config:

        NAME                                            STATE     READ WRITE CKSUM
        freenas-boot                                    ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/6ed1154b-37f4-11e5-a114-d050995af954  ONLINE       0     0     0
            da8p2                                       ONLINE       0     0     0
errors: No known data errors


Zpool List
Code:
[root@freenas] ~# zpool list
NAME           SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
Jails          101G  27.1G  73.9G         -    27%    26%  1.00x  ONLINE  /mnt
Multimedia    9.97T  7.80T  2.17T     16.0E    17%    78%  1.00x  ONLINE  /mnt
freenas-boot  3.66G   627M  3.04G         -      -    16%  1.00x  ONLINE  -
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
Is NFS set to Sync?

What vdev(s) are backing the pool with the NFS share?

Do you have a SLOG on that pool?
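You can check both from the FreeNAS shell with something like this (substitute your actual pool and dataset names; the ones below are placeholders):
Code:
# current sync setting on the dataset backing the NFS share
zfs get sync Multimedia/storage

# pool layout, including any "logs" (SLOG) vdev
zpool status Multimedia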
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
Oh, and a 4 GB boot device is asking for trouble. 8 GB is the minimum recommended size now.
 

gpsguy

Active Member
Joined
Jan 22, 2012
Messages
4,472
Please review the forum rules in red at the top of the page and provide detailed hardware and software information.

Also provide information about your pool, as well as how full it is.


Sent from my phone
 

brando56894

Wizard
Joined
Feb 15, 2014
Messages
1,537
Is NFS set to Sync?

What vdev(s) are backing the pool with the NFS share?

Do you have a SLOG on that pool?

I had actually forgotten to set them all to sync, so I added the mount option and re-mounted them, but it didn't make a difference. I had rsize and wsize set to 8K, so I removed them and that helped slightly; I'm now at about a steady 12 MB/sec. I only have two pools: my Jails pool, which is just a Samsung 850 EVO, and my Multimedia pool (main storage pool), which has the SLOG. As for the boot drive, I've been meaning to change that. It's a SATA DOM that I purchased a while ago just for the hell of it; I think I'm going to put it in my PC and go back to the 16 GB SanDisk Cruzer Fit I originally had in there. Last time I checked, the boot drive wasn't even half full. Also, doesn't FreeNAS load everything into RAM once it's finished booting?

@gpsguy I'll try to make a good diagram of all of this and put it in the first post.
 

gpsguy

Active Member
Joined
Jan 22, 2012
Messages
4,472
Since depasseg knew the size of your boot device, is it in your signature? If so, when requesting help we ask that you put the info in the body of your message.

Those of us on mobile devices and/or use Tapatalk can't see your signature information.


Sent from my phone
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
The diagram is nice, but it doesn't show the pool layout. Do you have one pool with 4 striped mirror vdevs and a SLOG? Can you post the output of zpool status and zfs list in code tags?
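In other words, something like this from the FreeNAS shell:
Code:
zpool status
zfs list -o name,used,avail,mountpoint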

And sync isn't going to help; it will hurt performance if it's enabled and the underlying pool is slow.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Don't use rsync to test performance. That is only testing the speed of rsync. You should be using dd or a benchmarking tool.
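For a rough sequential-write test through the NFS mount, something along these lines works; the mount point is a placeholder, and note that if compression is enabled on the dataset, zeros will compress and inflate the number:
Code:
# write 10 GiB through the NFS mount; conv=fdatasync forces the data out before dd reports a speed
dd if=/dev/zero of=/mnt/storage/ddtest.img bs=1M count=10240 conv=fdatasync

# clean up
rm /mnt/storage/ddtest.img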
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
I thought synchronous writes were preferred with an SLOG since it would cache the writes then flush them out to the pool?
It's the other way around. A SLOG is preferred when required to do sync writes. :smile:
 

brando56894

Wizard
Joined
Feb 15, 2014
Messages
1,537
Using dd to write to the share from the VM was only about 30 MB/sec, but from Arch it was about 80 MB/sec.

Sent from my Pixel C using Tapatalk
 

random003

Dabbler
Joined
Sep 5, 2015
Messages
15
Try disabling sync temporarily to see if you get better performance.
Code:
zfs set sync=disabled tank/dataset

To re-enable:
Code:
zfs set sync=standard tank/dataset
 

jde

Explorer
Joined
Aug 1, 2015
Messages
93
I don't host VMs so I can't speak from experience and I may be completely missing the issue, but it seems jgreco is constantly warning folks not to exceed 50% of the pool capacity lest they face painful slowdowns in this type of configuration. OP is at 78%. Might this be the problem?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I don't host VMs so I can't speak from experience and I may be completely missing the issue, but it seems jgreco is constantly warning folks not to exceed 50% of the pool capacity lest they face painful slowdowns in this type of configuration. OP is at 78%. Might this be the problem?

By "Usenet apps" I'm guessing some sort of automatic downloading software that is writing lots of stuff to the pool. This sort of usage can cause lots of fragmentation which in turn causes slowness. At 78%, @jde is right to be concerned. What's the pool fragmentation at?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
By "Usenet apps" I'm guessing some sort of automatic downloading software that is writing lots of stuff to the pool. This sort of usage can cause lots of fragmentation which in turn causes slowness. At 78%, @jde is right to be concerned. What's the pool fragmentation at?

Never mind, found it. Sorry, it's a complex thread to read on Tapatalk. 17%, not awful but not great.

Guessing you did the mirrors due to the various drive sizes.

Thought: your current pool is only good for 12TB. You have VM storage on it; how big is that? If it isn't a lot of storage, say small enough to easily fit on a mirror of the 1TB drives, I'd note that a RAIDZ2 of the remaining drives would also be 12TB (effectively six 3TB disks). Or, if you got two more 4TB disks, six 4TB drives in Z2 is 16TB; recycle the 3TBs for VM storage and that's 19TB of space.
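Very roughly, and ignoring the FreeNAS GUI workflow and gpart/gptid partitioning, the first option would look something like this; pool and device names are placeholders:
Code:
# six remaining disks (4x 4TB + 2x 3TB) as RAIDZ2, ~12TB usable
zpool create tank raidz2 da0 da1 da2 da3 da4 da5

# the two 1TB drives as a separate mirror for VM storage
zpool create vms mirror da6 da7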

I'm not necessarily convinced fragmentation is your main issue, but as the pool fills it becomes much harder for ZFS to allocate space and it is probably at least a contributor.

In the meantime try boosting the NFS read/write sizes to at least 32K.
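On the Linux client side that's something like the following (hostname, export, and mount point are whatever yours actually are):
Code:
# NFS rsize/wsize changes need a full unmount/remount to take effect
sudo umount /mnt/storage
sudo mount -t nfs -o rw,hard,rsize=32768,wsize=32768 freenas:/mnt/Multimedia/storage /mnt/storage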
 

brando56894

Wizard
Joined
Feb 15, 2014
Messages
1,537
Try disabling sync temporarily to see if you get better performance.

zfs set sync=disabled tank/dataset

To re-enable-
zfs set sync=standard tank/dataset

I'll give it a try and report back.

Edit: I just tried it on the vm-data dataset while copying data within the dataset, and it made no difference. Should I try it on the Downloads or TV dataset?

I don't host VMs so I can't speak from experience and I may be completely missing the issue, but it seems jgreco is constantly warning folks not to exceed 50% of the pool capacity lest they face painful slowdowns in this type of configuration. OP is at 78%. Might this be the problem?


IIRC the 50% cap is only for iSCSI, and my iSCSI zvol is only about 20% used. The whole pool itself is at 78%; the cap for that is 80% (FreeNAS warns you if usage goes above 80%), and it will be less than that in a few days, when my other replacement HDD comes in and I can swap out a 3 TB Green for a 4 TB HGST drive.
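(For what it's worth, I'm checking the zvol usage with something like this; the zvol name here is just a placeholder for my actual one:)
Code:
zfs get volsize,used,refreservation Multimedia/vm-iscsi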

Never mind, found it. Sorry, complex thread to read on tapatalk. 17%, not awful but not great.

Guessing you did the mirrors due to the various drive sizes.

Thought: your current pool is only good for 12TB. You have VM storage on it. How big is that? If it isn't a lot of storage, like maybe would easily fit on a mirror of the 1TB drives, I note that a RAIDZ2 of the remaining drives would also be 12TB (effectively 6 3TB). Or if you got two more 4TB disks, six 4TB drives in Z2 is 16TB plus recycle the 3TB's for VM storage, that's 19TB of space.

I'm not necessarily convinced fragmentation is your main issue, but as the pool fills it becomes much harder for ZFS to allocate space and it is probably at least a contributor.

In the meantime try boosting the NFS read/write sizes to at least 32K.

Yeah, sorry about this, I'm trying to put all the relevant info in the OP so people don't have to search through the thread for it. I originally had three pools set up: Multimedia, which was RAIDZ; Downloads, which was just a striped pair of 1 TB drives (temp storage, so redundancy wasn't a concern); and my single SSD for jails. I read here and elsewhere on the internet that striped mirrors perform better than RAIDZ, plus I only have to upgrade two drives to grow the pool instead of every drive in it. I had always thought about doing RAID 10 instead of RAID 5 (back when I was using mdadm on Linux), but sacrificing half of my drives and storage space at the time (around 6 TB total) seemed unreasonable, since my main focus then was large amounts of contiguous space and not necessarily performance.

Regarding your thought of breaking the VM dataset out of the main pool: in the performance thread that I linked to, I was told (and saw for myself) that it was better to write data to a larger pool than to a smaller one with only one or two drives, since the writes can be divided among multiple spindles instead of just one or two. I had a single HDD that I was using for seeding torrents, and trying to copy data from that drive to my pool would also max out around 12 MB/sec. I guess the ideal solution would be to get another SSD for the VM dataset, since it's not really that much data; the dataset itself is 70 GB. (Edit: I actually, accidentally, already had that in place, since one VM HDD image was already on the SSD, and it didn't make a difference.)

Regarding NFS, I thought the default for NFSv4 was already 32k? I set it back to the default and that increased my speed a little, but only by a few megs.
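The negotiated sizes can be checked on the client with nfsstat, e.g.:
Code:
# prints the options (including rsize/wsize) actually in effect for each NFS mount
nfsstat -m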
 

jde

Explorer
Joined
Aug 1, 2015
Messages
93
IIRC the 50% cap is only for iSCSI and my iSCSI ZVOL is only about 20% used, the whole pool itself is at 78%

My understanding is just a bit different from yours: the pool on which the iSCSI zvol resides should not exceed 50%. Will your pool be at less than 50% once you get your bigger drives incorporated? If so, let us know if performance improves.
 

brando56894

Wizard
Joined
Feb 15, 2014
Messages
1,537
I just noticed that my Ubuntu-Test VMDK resides on the SSD (by mistake), so I mounted my storage dataset over NFS and proceeded to dd a 1.6 GB image to it, and the transfer rate was still only about 16 MB/sec. Since the transfer rate is the same among VMs no matter where the OS storage resides (iSCSI zvol in the pool, VMDK in the pool, or VMDK on the SSD), it's not the VM storage that is the bottleneck. I was also thinking that maybe the bottleneck was the SATA controller for the SSD, since it's on the onboard Intel controller and not the HBA, but that clearly isn't the problem either: I just dd'd a 1.6 GB image to it in 3 seconds (about 0.5 GB/sec) from within FreeNAS itself.
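(That local test was just a plain dd from the FreeNAS shell, roughly like this; the source image path is illustrative:)
Code:
# write a ~1.6 GB image straight to the Jails (SSD) pool, bypassing NFS and the VMs
dd if=/mnt/Multimedia/storage/some-image.img of=/mnt/Jails/ddtest.img bs=1m
rm /mnt/Jails/ddtest.img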

My understanding is just a bit different than yours. My understanding is that the pool on which the iSCSI resides should not exceed 50%. Will your pool be at less than 50% once you get your bigger drives incorporated? If so, let us know if performance improves.

We can let one of the gurus weigh in on this one. No, it won't be; I added one drive a day or so ago and usage has dropped to 73%, so I'm guessing that after adding the next one it will be down to about 65%.
 