Last Straw! ZFS/Disk Performance Issues


TravisT

Patron
Joined
May 29, 2011
Messages
297
Hey guys, I've posted a couple of times about problems I've been having with my FreeNAS box, but since a lot has happened, I thought it was more appropriate to start a new thread.

A little background: I was running FreeNAS as a VM on ESXi. It worked OK, but the whole point of ZFS is defeated if there's another file system underneath the disks, which is the case with ESXi virtual disks. Also, in the event of a problem with ESXi, my shares can only be mounted by another VMware host, and then I'd need to run a guest OS capable of ZFS on top of that. Nightmare waiting to happen.

I've also experienced my share of performance issues. To remedy some of that, I decided to split the FreeNAS server off from the ESXi host. I had some old hardware lying around that I've decided to put to use (more below) - at least temporarily.

Before I go too far, I want to benchmark this server and make sure it is performing properly, then migrate the data from my VM-based FreeNAS box over to the new, standalone NAS. I also want to set up iSCSI so I can back up my VMs easily. I'd like to downgrade my ESXi box to the bare minimum hardware required to run the few servers I have on it (maybe an AMD 350-based board...), cutting down on the space and power requirements of running two servers.

My current hardware is:
MSI 870A-G54
AMD Athlon II X2 4400e (Single Core unlocked to dual cores)
4GB RAM
Various hard drives

I'm not getting consistent transfers when testing using the following command:

dd if=/dev/zero of=testfile bs=2048k count=2k

This seems to be a pretty widely agreed-upon way to test disk write performance. This is done in a ZFS volume created on a single disk.
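For reference, I understand the read side of the same test would look something like this (from what I've read, the test file needs to be much larger than RAM, or ZFS just serves it back from the ARC, and /dev/zero is obviously very compressible if compression is on):

Code:
# read back the same file; only meaningful if it's much bigger than RAM,
# otherwise the ARC serves it straight from memory
dd if=testfile of=/dev/null bs=2048k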

I'd really like to get this working reliably so I can move forward, instead of backwards.

Am I even on the right track here?
 

pfonseca

Dabbler
Joined
Jul 27, 2012
Messages
33
Hi,

I'm not a specialist in this area (nor in the English language :)), but where are you running the dd command from, relative to the FreeNAS machine?

Regards,
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Sounds like you are on the right track. I don't know anyone who recommends putting FreeNAS on a virtual machine for production use. It's great for experimenting, but nothing more.

Your transfers are probably not consistent because of your RAM. Per the manual you really should have 6GB of RAM to start, plus 1GB for each TB of hard drive space.

If you think something is wrong because your transfers aren't consistent, are we talking 5MB/sec versus 300MB/sec or 60MB/sec instead of 80MB/sec?

To put it bluntly, do you think something is wrong or are you just wanting validation that what you are doing makes sense?

If you want validation, what you are doing makes sense. More RAM would be my only recommendation at this point.
 

pfonseca

Dabbler
Joined
Jul 27, 2012
Messages
33
Again, not from an expert... I think the RAM requirement is a myth when we're talking about a home NAS. RAM is important and critical when you have to share a NAS with hundreds or thousands of users. That's my experience.
 

peterh

Patron
Joined
Oct 19, 2011
Messages
315
That only tests write performance with compressible data.

I suggest you get a copy of Bonnie (or iozone) and start running a consistent test.

(Compile on FreeBSD: gcc -O2 -o bonnie-static -static bonnie.c)

Place it on the filer, then log in to the filer and run:
./bonnie-static -s 2000 -d .
File './Bonnie.46589', size: 2097152000
Writing with putc()...done
Rewriting...done
Writing intelligently...done
Reading with getc()...done
Reading intelligently...done
Seeker 1...Seeker 3...Seeker 2...start 'em...done...done...done...
-------Sequential Output-------- ---Sequential Input-- --Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec %CPU
2000 110938 94.1 173018 57.2 158419 57.3 69060 85.4 434823 59.3 586.0 3.6

(I tested with 2GB to save time.)
This will give write and read speeds in a predictable manner.

For your box, I guess ./bonnie-static -s 4000 -d <filesystem-to-test> would be better.
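If you'd rather stick with iozone, I guess something along these lines would give comparable write/read numbers (file size and path here are just examples - pick a size well above the RAM in the box):

Code:
# sequential write (-i 0) and read (-i 1) tests, 128k records, 8GB file
iozone -i 0 -i 1 -r 128k -s 8g -f /mnt/yourpool/iozone.tmp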
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
Hi,

I'm not a specialist in this area (nor in the English language :)), but where are you running the dd command from, relative to the FreeNAS machine?

Regards,

I'm running dd in the volume on the disk that I am attempting to test.

Sounds like you are on the right track. I don't know anyone who recommends putting FreeNAS on a virtual machine for production use. It's great for experimenting, but nothing more.

Your transfers are probably not consistent because of your RAM. Per the manual you really should have 6GB of RAM to start, plus 1GB for each TB of hard drive space.

If you think something is wrong because your transfers aren't consistent, are we talking 5MB/sec versus 300MB/sec or 60MB/sec instead of 80MB/sec?

To put it bluntly, do you think something is wrong or are you just wanting validation that what you are doing makes sense?

If you want validation, what you are doing makes sense. More RAM would be my only recommendation at this point.

As for the VM, I have 8GB allocated to the machine, and I have 3x 2TB drives in a RAIDZ. I'd like to move these to the dedicated hardware, but I want to get that working reliably before attempting to migrate all that data. One problem I'm having: I bought three Supermicro hot-swap SATA 2 enclosures, which would give me the capability for 15 drives in my case. I was really stoked about that - no digging in the case to add drives, etc. Then I realized that with the backplane installed, my disk transfers when running dd went from about 120MB/s to ~10MB/s. It's doing it on all three backplanes. That's problem one.

I'd also like to improve my Samba performance. Currently on the VM, I'm getting in the neighborhood of 30-40MB/s of throughput on a gigabit network. Once I get the disk transfer stuff worked out, I'll start benchmarking the new system over the network.

To throw one more wrench in the mix, I'd like to move to iSCSI for my VMs, and I hope they will perform the same as or better than they do currently running off of a single local SATA drive. This may be a long shot, and I'm willing to take a small hit here.

So yes, I think something is wrong, and yes I also want validation that what I'm doing is logical. Any suggestions on what to try/where to go are greatly welcomed :)

Again, not from an expert... I think the RAM requirement is a myth when we're talking about a home NAS. RAM is important and critical when you have to share a NAS with hundreds or thousands of users. That's my experience.

I'm not sure of the right answer here, but I know that 4GB is probably on the low side for what I'm using it for. It is only accessed by 1-2 users at a time, but if I throw in the iSCSI stuff on a ZFS store, I'm sure it would love more RAM, even though they are very low-use VMs. I had 4GB in the spare mobo I used, so I went with that. If this works out, I'll likely max the board out with RAM or upgrade. 16GB is probably in my future.

That only tests write performance with compressible data.

I suggest you get a copy of Bonnie (or iozone) and start running a consistent test.

(Compile on FreeBSD: gcc -O2 -o bonnie-static -static bonnie.c)

Place it on the filer, then log in to the filer and run:
./bonnie-static -s 2000 -d .
File './Bonnie.46589', size: 2097152000
Writing with putc()...done
Rewriting...done
Writing intelligently...done
Reading with getc()...done
Reading intelligently...done
Seeker 1...Seeker 3...Seeker 2...start 'em...done...done...done...
-------Sequential Output-------- ---Sequential Input-- --Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec %CPU
2000 110938 94.1 173018 57.2 158419 57.3 69060 85.4 434823 59.3 586.0 3.6

(I tested with 2GB to save time.)
This will give write and read speeds in a predictable manner.

For your box, I guess ./bonnie-static -s 4000 -d <filesystem-to-test> would be better.

I'll check into that shortly. I understand that this tests both read and write. Does dd not do this? I'm not very familiar with either, but I see that dd shows records in and out. Either way, I assume that I should get dd running consistently before moving further, right?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
dd is kind of the "established standard" on the FreeNAS forum. Do something like 50GB and see how it does. You need something big so it doesn't get cached; otherwise it'll say you're getting 1GB/sec+, which is pretty much impossible.
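Something like this should do it (the path is just an example - point it at a dataset on the pool you want to test):

Code:
# 2MiB blocks x 25600 = 50GiB, big enough to blow past the ARC on a 4-8GB box
dd if=/dev/zero of=/mnt/tank/ddtest bs=2048k count=25k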

You said in the first post your machine had 4GB, but in your last message you said the VM has 8GB. I'm a little confused. Your VM is your current FreeNAS server and your "new" FreeNAS server has 4GB, right? You also said if this works you'll be upgrading it, right? Just making sure I'm not confused ;) Just for comparison, my "test" FreeNAS server is a VM with 6GB of RAM and only 300GB of hard drive space. I use it to back up my desktop and laptop weekly. It's easy to take a snapshot and break FreeNAS, then click a button and have it restored.
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
I'll give that a try. I'm assuming that I'd have to run something like bs=2048k count=25k? Let me know if that sounds right.

Sorry for the confusion. My "new" machine has 4GB. My current VM running my main datastores (3x 2TB drives in RAIDZ) is allocated 8GB of RAM.

Once my data is over on the new server, I will shuffle around what I have to better utilize my hardware. I have 16GB in my ESXi server, but without FreeNAS running on it using 8GB, I can likely get away with 8GB or less in it.

I turned autotune on a little while ago and rebooted the new FreeNAS box for it to take effect. When I walked by, I noticed an error on the screen about not being able to perform a dump and rebooting in 15 seconds - but it was stuck there. I have to run now; I'll post the exact error in a little while.
 

image132

Dabbler
Joined
Apr 23, 2012
Messages
19
Again, not from an expert... I think the RAM requirement is a myth when we're talking about a home NAS. RAM is important and critical when you have to share a NAS with hundreds or thousands of users. That's my experience.

Incorrect. I'm no expert, but I know that when I transfer large files over my network to my FreeNAS box it easily uses 3.8 of my 4GB of RAM. Since my motherboard only has two DDR2 slots I'm unable to upgrade any further, but I have no doubt that if I did upgrade to 6/8GB it would use well over 4GB.

Oh and this is in a home environment.
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
From what I've experienced with ZFS, it is RAM hungry for sure. What I can't speak to is how much of a performance hit you take when you don't have *enough* RAM. I could be wrong, but if you put 16GB in your home server, my guess is that it would make use of it. I plan to throw as much RAM as I can at this build, within reason.
 

praecorloth

Contributor
Joined
Jun 2, 2011
Messages
159
I'd also like to improve my Samba performance. Currently on the VM, I'm getting in the neighborhood of 30-40MB/s of throughput on a gigabit network. Once I get the disk transfer stuff worked out, I'll start benchmarking the new system over the network.

To throw one more wrench in the mix, I'd like to move to iSCSI for my VMs, and I hope they will perform the same as or better than they do currently running off of a single local SATA drive. This may be a long shot, and I'm willing to take a small hit here.

Just going to chime in on the pieces I have info on.

First and foremost, Samba isn't a high performer. 30-40MB/s is what you can expect with an out-of-the-box configuration. I think people around the forum have tweaked and tweaked and been able to get better speeds, but I've never seen it with my own two eyes. I've done some tweaking and been rewarded only with worse performance. Having done some fairly thorough testing, I can assure you that Samba is the limiting factor.
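For what it's worth, these are the sort of auxiliary smb.conf parameters people around here usually try (in my own testing they didn't buy me anything, so treat them as an experiment rather than a fix):

Code:
# Services -> CIFS -> Auxiliary parameters (or smb.conf); values are examples only
socket options = TCP_NODELAY SO_RCVBUF=131072 SO_SNDBUF=131072
use sendfile = yes
aio read size = 16384
aio write size = 16384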

Now I will go into my warning about iSCSI and ESXi. A company I work with has deployed six ESXi 5 + FreeNAS iSCSI solutions, and all six of them have had major stability issues. After finding out that VMware had made significant changes to their software initiator, we moved to NFS, but we have only done one install like that so far. That one has also had major stability issues (PSODs on the ESXi box).
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
I'll give that a try. I'm assuming that I'd have to run something like bs=2048k count=25k? Let me know if that sounds right.
You already know the answer to this.

Go run the tests from the [thread=981]performance sticky[/thread] and post the results here.
You don't want to wait? Oh, well. That test will let you compare directly with everyone else, which is what you are asking to do. In this case you will want to run it against the individual drives, not in an array - one at a time, and then all at the same time. At least do one drive by itself first, then all drives at the same time. If anything is off, you can then do the other drives individually.

Have you looked at the SMART data for the drives?
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
So the error I got was
Code:
ada0:ahcich3:0:0:0 synchronize cache failed
Cannot dump. Device not defined or unavailable.


This is with the drives in the removable drive bays, so I'm hoping that is the problem. Any ideas?
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
This is with the drives in the removable drive bays, so I'm hoping that is the problem. Any ideas?
What's the model # of the drive bays? Post the output of dmesg so we can see what's being detected as what.
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
Just going to chime in on the pieces I have info on.

First and foremost, Samba isn't a high performer. 30-40MB/s is what you can expect with an out-of-the-box configuration. I think people around the forum have tweaked and tweaked and been able to get better speeds, but I've never seen it with my own two eyes. I've done some tweaking and been rewarded only with worse performance. Having done some fairly thorough testing, I can assure you that Samba is the limiting factor.

Now I will go into my warning about iSCSI and ESXi. A company I work with has deployed six ESXi 5 + FreeNAS iSCSI solutions, and all six of them have had major stability issues. After finding out that VMware had made significant changes to their software initiator, we moved to NFS, but we have only done one install like that so far. That one has also had major stability issues (PSODs on the ESXi box).

I hope iSCSI will work, but I'm not very familiar with it. Since I would not call these production servers by any means, the IOPS required would probably be much lower than what you are referencing. I will give it a shot, and if it's stable for me, I may switch to it. Ultimately, if I can get my ESXi box into a very small (HTPC-like) case with a tiny power supply and no disks, that would be perfect. I appreciate your input and will keep this in mind.

You already know the answer to this.

You don't want to wait? Oh, well. That test will let you compare directly with everyone else, which is what you are asking to do. In this case you will want to run it against the individual drives, not in an array - one at a time, and then all at the same time. At least do one drive by itself first, then all drives at the same time. If anything is off, you can then do the other drives individually.

Have you looked at the SMART data for the drives?

I will compare to others, and the reason that I wanted to test each drive individually is to rule out a single disk giving me nightmares when trying to test an array.

I'm still poking around with the SMART tests, because until now I haven't had the capability to even monitor that (in ESXi), since I was using virtual disks. I can figure out how to run a scheduled SMART test and have the results emailed - is there another way to view that in the GUI or from the CLI? I did find that one of my drives was reporting an error in SMART (offline uncorrectable errors and currently unreadable (pending) sectors). I pulled the drive and have not investigated further. Again, this was in my test box, and that may have been while it was in the backplane.

What's the model # of the drive bays? Post the output of dmesg so we can see what's being detected as what.

The model of the drive bays is CSE-M35T-1B. The output of dmesg is here
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
I will compare to others, and the reason that I wanted to test each drive individually is to rule out a single disk giving me nightmares when trying to test an array.
Yes, I guess I wasn't that clear. I was suggesting you test the drives individually using the same dd test.

  • run the following dd commands, from the performance sticky, in 3 SSH sessions - one per session, all at the same time.

    [size=+1]This will destroy data on your disks.[/size] Only run it on disks not in an array.
    Code:
    dd if=/dev/zero of=/dev/ada0 bs=2048k count=50k
    dd if=/dev/zero of=/dev/ada1 bs=2048k count=50k
    dd if=/dev/zero of=/dev/ada2 bs=2048k count=50k


    Note the times, then follow up by running the test against each drive singly, e.g.
    Code:
    dd if=/dev/zero of=/dev/ada0 bs=2048k count=50k

I'm still poking around with the SMART tests, because until now I haven't had the capability to even monitor that (in ESXi), since I was using virtual disks. I can figure out how to run a scheduled SMART test and have the results emailed - is there another way to view that in the GUI or from the CLI?
From an SSH session as root:
Code:
View SMART data:
smartctl -a /dev/adaX

Run long SMART test:
smartctl -t long /dev/adaX
Where /dev/adaX is your actual device name.
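Once the long test finishes, the results land in the drive's self-test log, which is included in the -a output or can be pulled on its own:

Code:
smartctl -l selftest /dev/adaX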
 

martijn

Dabbler
Joined
Jun 10, 2011
Messages
13
I'm not sure if this also applies to iSCSI.
We had an I/O problem with NFS sync writes.
ZFS constantly flushes the disk write cache in this case.
To avoid this, set: vfs.zfs.cache_flush_disable=1

This problem occurred when using a dedicated FreeNAS box with a Citrix XenServer box; the VMs are shared with the Xen box over NFS.
After I applied the setting above, the I/O problem was solved.
It was like turning the lights on; we haven't had an I/O problem since.

We are using FreeNAS 8.0.4
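For anyone wondering where that goes: as far as I know it's a loader tunable, so something like this in /boot/loader.conf (or System -> Tunables in the FreeNAS GUI) should do it. Be aware it stops ZFS from telling the disks to flush their write caches, so only use it if you trust the disks/controller to survive a power loss:

Code:
# /boot/loader.conf
vfs.zfs.cache_flush_disable="1"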
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
Yes, I guess I wasn't that clear. I was suggesting you test the drives individually using the same dd test.


From an SSH session as root:
Code:
View SMART data:
smartctl -a /dev/adaX

Run long SMART test:
smartctl -t long /dev/adaX
Where /dev/adaX is your actual device name.

Not sure if there is a difference between running them all at once and one at a time. I ran one test last night (read and write) in the drive bays. Here are the results:

Code:
[root@freenas] /mnt/Test0# dd if=/dev/zero of=tmp.dat bs=2048k count=25k
25600+0 records in
25600+0 records out
53687091200 bytes transferred in 3260.563891 secs (16465585 bytes/sec)
[root@freenas] /mnt/Test0# dd if=tmp.dat of=/dev/null bs=2048k count=50k
25600+0 records in
25600+0 records out
53687091200 bytes transferred in 1682.364831 secs (31911682 bytes/sec)
[root@freenas] /mnt/Test0#


Not too impressive - that works out to roughly 16MB/s writing and 32MB/s reading. I believe one of the three disks I have is bad, so I'm not testing that one until I can investigate further. These are extra disks, so destroying data is not an issue. I'll run all three disks simultaneously now and post the results when the tests are done.

I'm not sure if this also applies to iSCSI.
We had an I/O problem with NFS sync writes.
ZFS constantly flushes the disk write cache in this case.
To avoid this, set: vfs.zfs.cache_flush_disable=1

This problem occurred when using a dedicated FreeNAS box with a Citrix XenServer box; the VMs are shared with the Xen box over NFS.
After I applied the setting above, the I/O problem was solved.
It was like turning the lights on; we haven't had an I/O problem since.

We are using FreeNAS 8.0.4

Thanks for that info as well. I plan to explore the performance of iSCSI in the near future, as soon as I get my disk performance at reliable levels.
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
[root@freenas] ~# dd if=/dev/zero of=/dev/ada0 bs=2048k count=50k
dd: /dev/ada0: Operation not permitted

Looks like I can't write directly to the dev - not sure if this is normal or not.
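From what I can tell this is GEOM protecting a device that's in use; if that's the case, I'm guessing this sysctl would lift the protection (untested on my end, and it obviously lets you wipe out whatever is on the disk):

Code:
sysctl kern.geom.debugflags=0x10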
 