Performance Testing with DD

Status
Not open for further replies.

nite244

Dabbler
Joined
Jun 11, 2011
Messages
24
Hello,

I was hoping the knowledgeable folks out there could help clarify some points and help others test their FreeNAS builds.

While reading through the forums, I've been trying out the different dd commands recommended. After doing a bit more reading, I'm trying to determine which ways of testing actually give meaningful data.

I first tried the following, with the results included:
Code:
dd if=/dev/zero of=/mnt/storage_vol/dataset_main/ddfile bs=1024k count=20000
20971520000 bytes transferred in 59.731942 secs (351093891 bytes/sec)
20971520000 bytes transferred in 50.700838 secs (413632610 bytes/sec)
20971520000 bytes transferred in 49.925832 secs (420053490 bytes/sec)

dd if=/dev/zero of=/mnt/storage_vol/dataset_main/ddfile bs=2048k count=10000
20971520000 bytes transferred in 44.941596 secs (466639413 bytes/sec)
20971520000 bytes transferred in 43.553766 secs (481508763 bytes/sec)
20971520000 bytes transferred in 45.993391 secs (455968119 bytes/sec)


This seems to create a 20-gig file filled with zeros. The results seem great, but almost too good to be true.

I also read that you can insert random data, instead of just zeros, into a file - I tested this using:
Code:
dd if=/dev/random of=/mnt/storage_vol/dataset_main/ddfile bs=2048k count=10000
20971520000 bytes transferred in 259.340353 secs (80864855 bytes/sec)
20971520000 bytes transferred in 254.590083 secs (82373672 bytes/sec)
20971520000 bytes transferred in 240.440793 secs (87221140 bytes/sec)


These results are obviously lower, but they seem like they may be more representative of real-world transfers to and from the NAS (an assumption on my part).

With all of that said, are there any limitations to the random number generator, /dev/random? Is that test actually producing useful numbers or am I testing some scenario that really isn't relevant?

Let me know what everyone thinks - am I on the right track or not?

Thanks
 
Joined
May 27, 2011
Messages
566
/dev/random is very CPU intensive and will give you very bad results.

Code:
dd if=/dev/zero of=/dev/null bs=1024k count=20k
20480+0 records in
20480+0 records out
21474836480 bytes transferred in 1.465271 secs (14655880405 bytes/sec)

14 GB/s /dev/zero to /dev/null

Code:
dd if=/dev/random of=/dev/null bs=1024k count=20k
^C10545+0 records in
10545+0 records out
11057233920 bytes transferred in 118.333067 secs (93441624 bytes/sec)

90MB/s /dev/random to /dev/null (I killed it after 2 minutes)


So I can only get 90MB/s of data from /dev/random.
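
If you really want random-ish data in your write tests (just a thought - the path and sizes below are examples, not something I've benchmarked), you can generate the random data once into a file and then reuse that file as the source, so the RNG isn't in the hot path. Keep in mind you'd then be measuring a read+write mix unless the source file fits in RAM or lives on another device.

Code:
# generate ~2GB of random data once (slow, RNG-bound)
dd if=/dev/random of=/mnt/storage_vol/dataset_main/random.bin bs=1024k count=2048
# reuse the pre-generated file as the source for the write test
dd if=/mnt/storage_vol/dataset_main/random.bin of=/mnt/storage_vol/dataset_main/ddfile bs=1024k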
 

nite244

Dabbler
Joined
Jun 11, 2011
Messages
24
/dev/random is very CPU intensive and will give you very bad results.

Thanks for the feedback - this is the kind of information I was looking for. I never thought to test it against /dev/null - great idea!

With that information, it looks like /dev/zero is the way to go - am I correct that this test measures the maximum write throughput of the pool?

Along those lines, would you say the below results are on par with what I should be getting for these specs - I honestly don't know what someone should be looking for:

- FreeNAS 8.0.1 Beta2
- Core i3-2100 with 8gb of RAM
- 6 x 2TB Seagate Barracuda Green ST2000DL003 drives
- all 6 drives used in a single RAIDZ2 configuration, 4k sectors forced (~8TB usable space)

Code:
dd if=/dev/zero of=/mnt/storage_vol/dataset_main/ddfile bs=2048k count=10000
20971520000 bytes transferred in 44.941596 secs (466639413 bytes/sec)
20971520000 bytes transferred in 43.553766 secs (481508763 bytes/sec)
20971520000 bytes transferred in 45.993391 secs (455968119 bytes/sec)

dd of=/dev/null if=/mnt/storage_vol/dataset_main/ddfile bs=2048k count=10000
20971520000 bytes transferred in 45.856610 secs (457328180 bytes/sec)
20971520000 bytes transferred in 43.445757 secs (482705827 bytes/sec)
20971520000 bytes transferred in 42.890633 secs (488953379 bytes/sec)


Also, if there are some other recommended bs= and count= values I should test with, please let me know.
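
In the meantime, here's roughly what I had in mind for sweeping a few block sizes - just a rough sketch with my pool path hard-coded, so adjust the path (and sizes) for your own setup:

Code:
#!/bin/sh
# rough sketch: run the same ~20GB write test at a few block sizes
# (pool path below is specific to my setup - change as needed)
TARGET=/mnt/storage_vol/dataset_main/ddfile
for bs in 512k 1M 2M 4M; do
    case $bs in
        512k) count=40000 ;;
        1M)   count=20000 ;;
        2M)   count=10000 ;;
        4M)   count=5000  ;;
    esac
    echo "== bs=$bs count=$count =="
    dd if=/dev/zero of=$TARGET bs=$bs count=$count
done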

Thanks!
 

Milhouse

Guru
Joined
Jun 1, 2011
Messages
564
Your results are fine. On my less powerful 4GB HP N36L (AMD 1.3GHz Neo) with an LSI 9211-8i controller, the same dd test yields read/write results of 250MB/s and 320MB/s respectively from a zpool with two vdevs (4x2TB and 4x500GB, both RAIDZ1). I suspect my performance is now largely CPU bound.

It would be interesting to see if your write performance improves at all by splitting your 6-disk RAIDZ2 vdev into two striped 3x2TB RAIDZ1 vdevs - this would give you double the number of IOPS.
 

nite244

Dabbler
Joined
Jun 11, 2011
Messages
24
It would be interesting to see if your write performance improves at all by splitting your 6-disk RAIDZ2 vdev into two striped 3x2TB RAIDZ1 vdevs - this would give you double the number of IOPS.

I can test that out no problem, as I don't have any data on my FreeNAS box yet - I'm waiting for a later release (be it the 8.0.1 release or 8.1) before I move my data over.

How would you recommend I set it up and test it? I'm new to FreeNAS, so I'm learning as we go.

For background, I set it up as a RAIDZ2 as I had read that was the best way to minimize problems with disk failures (up to 2 failures possible without losing everything) - would the RAIDZ1 setup proposed above be less "safe" (e.g. allowing only 1 drive failure before everything is lost)?

Thanks!
 

Milhouse

Guru
Joined
Jun 1, 2011
Messages
564
How would you recommend I set it up and test it? I'm new to FreeNAS, so I'm learning as we go.

This worked for me, but there's another thread on the forum suggesting it doesn't work for everyone:

1) Delete your dataset and volume
2) Create a volume, call it "test", select the first 3 disks, select ZFS, tick 4K, select RAIDZ1, hit "Add volume" button
3) Create another volume, also call it "test", select the remaining 3 disks, select ZFS, tick 4K, select RAIDZ1, hit "Add volume" button

You should now have a zpool (volume) called "test" that contains two RAIDZ1 vdevs, each with three disks. Benchmarking this zpool by writing and reading data to a file called /mnt/test/ddfile should give hopefully improved results over a single vdev.

When writing data, ZFS commits the data on a per-vdev basis, so if it is writing to a vdev with 6 disks it has to wait until the data is committed on the 6th disk before writing any more data - essentially, the IOPS of your 6-disk vdev is going to be that of a single disk. However, if you have two vdevs each with 3 disks, then ZFS only has to wait until the data is safely committed across the third disk in a vdev, and more importantly ZFS can write to both vdevs concurrently, meaning double the IOPS.

Taking it to a logical extreme, three vdevs each with two 2TB disks mirrored (RAID1) should give you triple the IOPS of a single six-disk vdev (that's the theory anyway). However, with three vdevs in a mirrored configuration you'd lose 2TB of usable storage, and with multiple vdevs you have to weigh up the redundancy pros and cons: 1xRAIDZ2 (lose any 2 disks from the single vdev) against 2xRAIDZ1 (lose at most 2 disks, but no more than one disk from a single vdev) against 3xRAID1 (lose at most 3 disks, but no more than one disk from a single vdev). One other advantage of having fewer disks in each vdev should be a reduction in the resilver (repair) time when replacing a failed disk.

Note that read performance also improves with additional vdevs.
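
For reference, if you're comfortable at the command line, the layout I'm describing can also be created directly with zpool - though note this bypasses the GUI and its 4K (gnop) handling entirely, and the device names below are only placeholders for your six disks, so treat it as an illustration of the target layout rather than a recipe:

Code:
# one pool, two RAIDZ1 vdevs of three disks each
zpool create test raidz1 ada0 ada1 ada2 raidz1 ada3 ada4 ada5
# confirm the layout
zpool status test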
 

nite244

Dabbler
Joined
Jun 11, 2011
Messages
24
This worked for me, but there's another thread on the forum suggesting it doesn't work for everyone:

1) Delete your dataset and volume
2) Create a volume, call it "test", select the first 3 disks, select ZFS, tick 4K, select RAIDZ1, hit "Add volume" button
3) Create another volume, also call it "test", select the remaining 3 disks, select ZFS, tick 4K, select RAIDZ1, hit "Add volume" button

You should now have a zpool (volume) called "test" that contains two RAIDZ1 vdevs, each with three disks. Benchmarking this zpool by writing and reading data to a file called /mnt/test/ddfile should give hopefully improved results over a single vdev.

I'm not sure if it worked as planned for me either, as the test results are basically the same (+/- a bit). When I added the two 3-disk volumes, the first volume showed approximately 4TB of available space, and when I added the second volume with the same name it was added to the original volume, since there was then 8TB of free space. There's only 1 volume (vdev?) visible in the GUI - maybe this was to be expected, not sure.

Nonetheless, here are the test results:
Code:
dd if=/dev/zero of=/mnt/test/ddfile bs=2048k count=10000
20971520000 bytes transferred in 42.265276 secs (496187935 bytes/sec)
20971520000 bytes transferred in 42.296602 secs (495820444 bytes/sec)
20971520000 bytes transferred in 42.959293 secs (488171909 bytes/sec)

dd of=/dev/null if=/mnt/test/ddfile bs=2048k count=10000
20971520000 bytes transferred in 44.112512 secs (475409790 bytes/sec)
20971520000 bytes transferred in 41.572048 secs (504462037 bytes/sec)
20971520000 bytes transferred in 41.621009 secs (503868610 bytes/sec)


Feels like it's still hitting that single disk performance wall that you mentioned, but I certainly could be wrong.

Let me know if you have any other ideas - I'd like to test this out as much as I can, as it's better to do it now than when I have important data on this NAS. I'm also hoping this data can help out others when benchmarking/creating their NAS.

Thanks!
 

headconnect

Explorer
Joined
May 28, 2011
Messages
59
Firstly, nite - thanks for starting this thread - just what I want to see :) I wonder if we can take it in the direction of 'standardizing' dd commands for benchmarking, possibly with values tuned for both 512 and 4k drives, or perhaps get a tiny little batch script included in freenas for reportable performance testing (possibly even runnable from the ui with graphed results).

For me, I think this is one of the more interesting things for freenas, as there are so many variable configurations; it would eventually be interesting to be able to publish the results somewhere so that they could be viewed and compared. With the AMD E-350 discussion, there are so many similar setups with similar hardware, and it would be interesting to see what works best with what.

In any event - the final bit would be to be able to simulate CIFS/NFS/AFP performance (saying simulate, because if that's possible, then the results wouldn't be reliant on cabling, switches and the like - even though that is of course also interesting info, it would just not be so reliably reproducible in terms of the FreeNAS system itself).

So - any recommendation for a 'standard' FreeNAS set of dd benchmark commands? :)
 

Milhouse

Guru
Joined
Jun 1, 2011
Messages
564
I'm not sure if it worked as planned for me either, as the test results are basically the same (+/- a bit). When I added the two 3-disk volumes, the first volume showed approximately 4TB of available space, and when I added the second volume with the same name it was added to the original volume, since there was then 8TB of free space. There's only 1 volume (vdev?) visible in the GUI - maybe this was to be expected, not sure.

Sounds like it's worked - if you can get to the command line (using ssh), running the command "zpool status" would confirm how your zpool (volume) is now set up; it should consist of two vdevs.

Nonetheless, here are the test results:

Not really sure why you're not getting significantly better results (I'd say you're getting about 10% better results, but that's not significant enough when you should be getting double the IOPS). This blog post on vdev performance is worth a read. Can you try a third test with three vdevs, each with two mirrored disks - mirroring is meant to be the best configuration for performance. If you still don't see a difference I'll be at a complete loss... :)
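
Purely for illustration (device names are placeholders and, as before, this skips the GUI's 4K handling), the three-mirror layout I mean looks like this at the command line:

Code:
# one pool, three 2-disk mirror vdevs
zpool create test mirror ada0 ada1 mirror ada2 ada3 mirror ada4 ada5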
 

nite244

Dabbler
Joined
Jun 11, 2011
Messages
24
Firstly, nite - thanks for starting this thread - just what I want to see :) I wonder if we can take it in the direction of 'standardizing' dd commands for benchmarking, possibly with values tuned for both 512 and 4k drives, or perhaps get a tiny little batch script included in freenas for reportable performance testing (possibly even runnable from the ui with graphed results).

No problem - glad I could help. I too would like to see a standardized set of commands that people can run and compare their results against others - it would make it far easier to verify your rig to see that it's running properly. Plus it's always fun to geek out on how fast your setup is too :)

I can certainly run more tests on my rig, as I have time until a more finalized build is released. I do want to put this into "production" (it's a home environment) at some time, but I'm not quite ready yet. I guess all I ask is for input on what we'd like to see tested - I only have the equipment previously listed, but I can run whatever tests people see as relevant to build up some knowledge.

In any event - the final bit would be to be able to simulate CIFS/NFS/AFP performance

Agreed - I've been playing around with Intel's NAS Performance Toolkit, but it is a bit old. Does anyone know if this is still a relevant test with today's computers (more than 2gb of RAM for example)? It's not a pure simulation, as a person's network and client computer are involved and affect results, but it's a start unless there's something better out there.

So - any recommendation for a 'standard' FreeNAS set of dd benchmark commands?

I'll run some more tests when I get time with varied bs= variables to see if that generates any different results. If anyone has any recommendations, please post away!

Thanks!
 

nite244

Dabbler
Joined
Jun 11, 2011
Messages
24
Not really sure why you're not getting significantly better results (I'd say you're getting about 10% better results, but that's not significant enough when you should be getting double the IOPS). This blog post on vdev performance is worth a read. Can you try a third test with three vdevs, each with two mirrored disks - mirroring is meant to be the best configuration for performance. If you still don't see a difference I'll be at a complete loss... :)

Ok, here are the results of the 2 x 3 disk RAIDZ1 with zpool status included - I re-ran the tests this morning as I was messing around yesterday after posting and wanted to make sure the results were similar.

To make it easier to parse what was done here, here's what these results are:

- in GUI, created a RAIDZ1 volume using 3 disks, 4k forced, using name test
- in GUI, created a second RAIDZ1 volume using 3 disks, 4k forced, using name test
- creates an 8TB volume as the end result

Code:
zpool status
  pool: test
 state: ONLINE
 scrub: none requested
config:

        NAME                                            STATE     READ WRITE CKSUM
        test                                            ONLINE       0     0     0
          raidz1                                        ONLINE       0     0     0
            gptid/6f484266-9516-11e0-a1a7-001b218b578e  ONLINE       0     0     0
            gptid/6fc50701-9516-11e0-a1a7-001b218b578e  ONLINE       0     0     0
            gptid/703d6e9d-9516-11e0-a1a7-001b218b578e  ONLINE       0     0     0
          raidz1                                        ONLINE       0     0     0
            ada3p2.nop                                  ONLINE       0     0     0
            ada4p2.nop                                  ONLINE       0     0     0
            ada5p2.nop                                  ONLINE       0     0     0

dd if=/dev/zero of=/mnt/test/ddfile bs=2048k count=10000
20971520000 bytes transferred in 42.356946 secs (495114072 bytes/sec)
20971520000 bytes transferred in 41.491508 secs (505441258 bytes/sec)
20971520000 bytes transferred in 45.611020 secs (459790637 bytes/sec)

dd of=/dev/null if=/mnt/test/ddfile bs=2048k count=10000
20971520000 bytes transferred in 44.158784 secs (474911626 bytes/sec)
20971520000 bytes transferred in 41.917069 secs (500309791 bytes/sec)
20971520000 bytes transferred in 41.864747 secs (500935070 bytes/sec)


Also, another test with 3 x 2disk Mirrored

- in GUI, created a mirrored volume using 2 disks, 4k forced, named test-mirror-raidz
- in GUI, created a second mirrored volume using 2 disks, 4k forced, named test-mirror-raidz
- in GUI, created a third mirrored volume using 2 disks, 4k forced, named test-mirror-raidz
- creates a 6TB volume as the end result

Code:
zpool status
  pool: test-mirror-raidz
 state: ONLINE
 scrub: none requested
config:

        NAME                                            STATE     READ WRITE CKSUM
        test-mirror-raidz                               ONLINE       0     0     0
          mirror                                        ONLINE       0     0     0
            gptid/f561fd45-951a-11e0-a1a7-001b218b578e  ONLINE       0     0     0
            gptid/f5dcb8f8-951a-11e0-a1a7-001b218b578e  ONLINE       0     0     0
          mirror                                        ONLINE       0     0     0
            ada2p2.nop                                  ONLINE       0     0     0
            ada3p2.nop                                  ONLINE       0     0     0
          mirror                                        ONLINE       0     0     0
            ada4p2.nop                                  ONLINE       0     0     0
            ada5p2.nop                                  ONLINE       0     0     0


dd if=/dev/zero of=/mnt/test-mirror-raidz/ddfile bs=2048k count=10000
20971520000 bytes transferred in 50.566710 secs (414729770 bytes/sec)
20971520000 bytes transferred in 52.428997 secs (399998497 bytes/sec)
20971520000 bytes transferred in 51.173580 secs (409811470 bytes/sec)

dd of=/dev/null if=/mnt/test-mirror-raidz/ddfile bs=2048k count=10000
20971520000 bytes transferred in 36.187830 secs (579518584 bytes/sec)
20971520000 bytes transferred in 34.481949 secs (608188362 bytes/sec)
20971520000 bytes transferred in 37.130047 secs (564812642 bytes/sec)


Looks like the mirrored volumes above are great at reads, but not as good at writes. I also just realized after I posted this that RAIDZ has nothing to do with these test results - sorry if my naming confused anyone.

Thanks!
 
Joined
May 27, 2011
Messages
566
The 3 striped mirrors and the 2 striped raidz's may give you better performance, but they reduce your ability to survive disk failure. I'd still go with a raidz2, as the performance of your raidz2 was well above what your gigabit ethernet adapter can handle. A raidz2 can always handle the loss of 2 disks, while both of the aforementioned setups can fail with only 2 losses.
 

Milhouse

Guru
Joined
Jun 1, 2011
Messages
564
Code:
zpool status
  pool: test
 state: ONLINE
 scrub: none requested
config:

        NAME                                            STATE     READ WRITE CKSUM
        test                                            ONLINE       0     0     0
          raidz1                                        ONLINE       0     0     0
            gptid/6f484266-9516-11e0-a1a7-001b218b578e  ONLINE       0     0     0
            gptid/6fc50701-9516-11e0-a1a7-001b218b578e  ONLINE       0     0     0
            gptid/703d6e9d-9516-11e0-a1a7-001b218b578e  ONLINE       0     0     0
          raidz1                                        ONLINE       0     0     0
            ada3p2.nop                                  ONLINE       0     0     0
            ada4p2.nop                                  ONLINE       0     0     0
            ada5p2.nop                                  ONLINE       0     0     0

I wonder why only the first vdev is showing as gptid and the rest nop? With my setup, 4x2TB/4K plus 4x500GB/4K both vdevs are gptid:

Code:
freenas# zpool status
  pool: share
 state: ONLINE
 scrub: none requested
config:

        NAME                                            STATE     READ WRITE CKSUM
        share                                           ONLINE       0     0     0
          raidz1                                        ONLINE       0     0     0
            gptid/9919eae2-92b2-11e0-aa4b-001b2188359c  ONLINE       0     0     0
            gptid/9984a486-92b2-11e0-aa4b-001b2188359c  ONLINE       0     0     0
            gptid/99efa051-92b2-11e0-aa4b-001b2188359c  ONLINE       0     0     0
            gptid/9a5fdf5b-92b2-11e0-aa4b-001b2188359c  ONLINE       0     0     0
          raidz1                                        ONLINE       0     0     0
            gptid/b7b496ae-92b2-11e0-aa4b-001b2188359c  ONLINE       0     0     0
            gptid/b8bc1851-92b2-11e0-aa4b-001b2188359c  ONLINE       0     0     0
            gptid/b9ba498d-92b2-11e0-aa4b-001b2188359c  ONLINE       0     0     0
            gptid/babd1879-92b2-11e0-aa4b-001b2188359c  ONLINE       0     0     0

errors: No known data errors


It would be interesting to know the ashift on the second and third vdevs to make sure they're set correctly at 12 (4K) and not 9 (512 byte) - try running "zdb | grep ashift" (if you don't have a valid cache, enable it with "zpool set cachefile=/data/zfs/zpool.cache <volume-name>").

The 3 striped mirrors and the 2 striped raidz's may give you better performance, but they reduce your ability to survive disk failure. I'd still go with a raidz2, as the performance of your raidz2 was well above what your gigabit ethernet adapter can handle. A raidz2 can always handle the loss of 2 disks, while both of the aforementioned setups can fail with only 2 losses.

Yes and no... obviously with three mirrored vdevs you can survive three independent disk failures compared with only 2 disk failures in a single RAIDZ2 vdev, but of course you're totally screwed if you lose two disks in the same vdev. :)

It could even be argued that three mirrored vdevs give you more redundancy, or alternatively less redundancy, depending on how many of your disks happen to fail and in which vdev they reside. I also believe that resilvering will complete more quickly with a mirrored and/or smaller vdev, and resilvering is often pointed out as being the most stressful workload for a disk and likely to cause marginal disks to fail, so if resilvering can be completed more quickly that's another point in its favour.

As long as you have a current backup - and as we all know NAS is not a backup! - then running a slightly higher risk for improved performance may be acceptable, horses for courses really.

But yeah, once you are saturating your network interface there's nothing more to be gained from improved disk performance. Then it just becomes more of an intellectual exercise (or you could investigate link aggregation!) :)
 

nite244

Dabbler
Joined
Jun 11, 2011
Messages
24
The 3 striped mirrors and the 2 striped raidz's may give you better performance, but they reduce your ability to survive disk failure. I'd still go with a raidz2, as the performance of your raidz2 was well above what your gigabit ethernet adapter can handle. A raidz2 can always handle the loss of 2 disks, while both of the aforementioned setups can fail with only 2 losses.

Fully agreed - it appears that in my scenario RAIDZ2 is the way to go (I prefer to minimize the chance of failure rather than go all out for performance), but it's always interesting to see what you can get when you have a test bed available to you.
 

nite244

Dabbler
Joined
Jun 11, 2011
Messages
24
I wonder why only the first vdev is showing as gptid and the rest nop?

Good question - it looked funky to me as well, but maybe someone else can answer that, as I unfortunately don't know myself or how to check/test it further.

It would be interesting to know the ashift on the second and third vdevs to make sure they're set correctly at 12 (4K) and not 9 (512 byte) - try running "zdb | grep ashift" (if you don't have a valid cache, enable it with "zpool set cachefile=/data/zfs/zpool.cache <volume-name>").

Done and done :) Below are the results with a new volume, dual RAIDZ1:

Code:
freenas# zpool status
  pool: test-dual-raidz1
 state: ONLINE
 scrub: none requested
config:

        NAME                                            STATE     READ WRITE CKSUM
        test-dual-raidz1                                ONLINE       0     0     0
          raidz1                                        ONLINE       0     0     0
            gptid/dec7c273-9532-11e0-a1a7-001b218b578e  ONLINE       0     0     0
            gptid/df487aac-9532-11e0-a1a7-001b218b578e  ONLINE       0     0     0
            gptid/dfbcd95d-9532-11e0-a1a7-001b218b578e  ONLINE       0     0     0
          raidz1                                        ONLINE       0     0     0
            ada3p2.nop                                  ONLINE       0     0     0
            ada4p2.nop                                  ONLINE       0     0     0
            ada5p2.nop                                  ONLINE       0     0     0

errors: No known data errors
freenas# zdb | grep ashift
freenas# zpool set cachefile=/data/zfs/zpool.cache test-dual-raidz1
freenas# zdb | grep ashift
                ashift=12
                ashift=12


Looks like it's taking the 4k, which is good to know, as you never know for sure if things are working as you expect without checking.

Man I'm learning a lot from this thread - thanks everyone!
 
Joined
May 27, 2011
Messages
566
Yes and no... obviously with three mirrored vdevs you can survive three independent disk failures compared with only 2 disk failures in a single RAIDZ2 vdev, but of course you're totally screwed if you lose two disks in the same vdev. :)

It could even be argued that three mirrored vdevs give you more redundancy, or alternatively less redundancy, depending on how many of your disks happen to fail and in which vdev they reside. I also believe that resilvering will complete more quickly with a mirrored and/or smaller vdev, and resilvering is often pointed out as being the most stressful workload for a disk and likely to cause marginal disks to fail, so if resilvering can be completed more quickly that's another point in its favour.
You've got to look at the worst case - see Murphy's Law. But as for speed of recovery, yes and no. With a mirror the data just has to be copied directly, while with a raidz(2) it has to be reconstructed from the existing data and the parity bit(s). That is more CPU intensive than a straight copy, but I doubt you'd notice a difference unless you have a low-end CPU.


But yeah, once you are saturating your network interface there's nothing more to be gained from improved disk performance. Then it just becomes more of an intellectual exercise (or you could investigate link aggregation!) :)
I aggregated my 2 gigabit NICs; each port goes to a different switch, and those switches hold my 2 high-bandwidth devices. During testing I was able to push and pull 2 Gb/s each way, for a total throughput of about 4 Gb/s.
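
FreeNAS sets this up through the GUI, but for anyone curious, at the FreeBSD level it boils down to something like the following - the interface names, address and lagg protocol here are only examples, and what actually works depends on your NICs and switches:

Code:
ifconfig em0 up
ifconfig em1 up
ifconfig lagg0 create
ifconfig lagg0 up laggproto lacp laggport em0 laggport em1 192.168.1.10/24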
 

nite244

Dabbler
Joined
Jun 11, 2011
Messages
24
In case anyone was wondering whether changing the bs= and count= values makes a difference - for block sizes from bs=512k to bs=16M the answer is no (I checked bs=512k, 1M, 2M, 4M, 8M and 16M, varying count= so the result was still a 20-gig file).

Here are some results for those interested (figured there was no point in posting everything, as they're all the same, +/- a small bit):

Code:
freenas# zpool status
  pool: test-raidz2-6disk
 state: ONLINE
 scrub: none requested
config:

        NAME                                            STATE     READ WRITE CKSUM
        test-raidz2-6disk                               ONLINE       0     0     0
          raidz2                                        ONLINE       0     0     0
            gptid/15f8fee2-9534-11e0-a1a7-001b218b578e  ONLINE       0     0     0
            gptid/167b53d9-9534-11e0-a1a7-001b218b578e  ONLINE       0     0     0
            gptid/16f8826f-9534-11e0-a1a7-001b218b578e  ONLINE       0     0     0
            gptid/1774176b-9534-11e0-a1a7-001b218b578e  ONLINE       0     0     0
            gptid/17ed3fb3-9534-11e0-a1a7-001b218b578e  ONLINE       0     0     0
            gptid/18686634-9534-11e0-a1a7-001b218b578e  ONLINE       0     0     0

errors: No known data errors
freenas# zdb | grep ashift
freenas# zpool set cachefile=/data/zfs/zpool.cache test-raidz2-6disk
freenas# zdb | grep ashift
                ashift=12


bs=512k count=40000
Code:
dd if=/dev/zero of=/mnt/test-raidz2-6disk/ddfile bs=512k count=40000
20971520000 bytes transferred in 45.175758 secs (464220654 bytes/sec)
20971520000 bytes transferred in 41.994703 secs (499384886 bytes/sec)
20971520000 bytes transferred in 44.322343 secs (473159103 bytes/sec)

dd of=/dev/null if=/mnt/test-raidz2-6disk/ddfile bs=512k count=40000
20971520000 bytes transferred in 38.491802 secs (544830819 bytes/sec)
20971520000 bytes transferred in 38.177547 secs (549315544 bytes/sec)
20971520000 bytes transferred in 38.218222 secs (548730915 bytes/sec)


bs=2M count=10000 (my baseline of sorts)
Code:
dd if=/dev/zero of=/mnt/test-raidz2-6disk/ddfile bs=2M count=10000
20971520000 bytes transferred in 44.764873 secs (468481615 bytes/sec)
20971520000 bytes transferred in 45.457481 secs (461343646 bytes/sec)
20971520000 bytes transferred in 45.510251 secs (460808708 bytes/sec)

dd of=/dev/null if=/mnt/test-raidz2-6disk/ddfile bs=2M count=10000
20971520000 bytes transferred in 39.025759 secs (537376352 bytes/sec)
20971520000 bytes transferred in 38.214881 secs (548778891 bytes/sec)
20971520000 bytes transferred in 38.183721 secs (549226727 bytes/sec)


bs=16M count=1250
Code:
dd if=/dev/zero of=/mnt/test-raidz2-6disk/ddfile bs=16M count=1250
20971520000 bytes transferred in 44.512671 secs (471135960 bytes/sec)
20971520000 bytes transferred in 44.467920 secs (471610095 bytes/sec)
20971520000 bytes transferred in 45.621788 secs (459682115 bytes/sec)

dd of=/dev/null if=/mnt/test-raidz2-6disk/ddfile bs=16M count=1250
20971520000 bytes transferred in 39.013512 secs (537545045 bytes/sec)
20971520000 bytes transferred in 38.243565 secs (548367286 bytes/sec)
20971520000 bytes transferred in 38.158968 secs (549582998 bytes/sec)


I guess that makes sense though, as you're creating the same sized file in each instance, but I thought I'd test just in case.
 

nite244

Dabbler
Joined
Jun 11, 2011
Messages
24
I was also wondering what the performance of a 5-disk RAIDZ1 would be on this system, as that was my original plan, so here are the results - interesting in my opinion, but I'm sure there's a reason for the difference compared to the 6-disk RAIDZ2:

Code:
freenas# zpool status
  pool: test-raidz1-5disk
 state: ONLINE
 scrub: none requested
config:

        NAME                                            STATE     READ WRITE CKSUM
        test-raidz1-5disk                               ONLINE       0     0     0
          raidz1                                        ONLINE       0     0     0
            gptid/aab8c476-953a-11e0-a1a7-001b218b578e  ONLINE       0     0     0
            gptid/ab3fc32d-953a-11e0-a1a7-001b218b578e  ONLINE       0     0     0
            gptid/abbcaed8-953a-11e0-a1a7-001b218b578e  ONLINE       0     0     0
            gptid/ac40d943-953a-11e0-a1a7-001b218b578e  ONLINE       0     0     0
            gptid/acbee67e-953a-11e0-a1a7-001b218b578e  ONLINE       0     0     0

errors: No known data errors
freenas# zdb | grep ashift
freenas# zpool set cachefile=/data/zfs/zpool.cache test-raidz1-5disk
freenas# zdb | grep ashift
                ashift=12

dd if=/dev/zero of=/mnt/test-raidz1-5disk/ddfile bs=2M count=10000
20971520000 bytes transferred in 41.738874 secs (502445754 bytes/sec)
20971520000 bytes transferred in 41.738300 secs (502452665 bytes/sec)
20971520000 bytes transferred in 42.412636 secs (494463961 bytes/sec)

dd of=/dev/null if=/mnt/test-raidz1-5disk/ddfile bs=2M count=10000
20971520000 bytes transferred in 66.202426 secs (316778724 bytes/sec)
20971520000 bytes transferred in 65.648944 secs (319449464 bytes/sec)
20971520000 bytes transferred in 64.551458 secs (324880655 bytes/sec)


Looks like a little more write, but noticeably less read on my rig. Seems like RAIDZ2 is the way to go not just for minimizing risk, but also for performance. As always, mileage may vary and this may or may not apply to everyone's setup.
 

sjieke

Contributor
Joined
Jun 7, 2011
Messages
125
Not really sure why you're not getting significantly better results (I'd say you're getting about 10% better results, but that's not significant enough when you should be getting double the IOPS).

I'm still reading and learning a lot about ZFS, but I think that IOPS and bandwidth (measured with dd) aren't correlated to each other.
6 disks in 1 vdev will give you the same bandwidth as 6 disks in 2 or more vdevs; mainly, it can't be faster than the accumulated speed of the 6 disks.
A vdev has the IOPS of 1 disk, so more vdevs result in more IOPS. I think you will notice a difference from the extra IOPS when multiple clients access the pool at the same time - IOPS represents the number of input and output operations per second the disk (pool) can handle, and more clients generate more of them.
I'm not sure, but maybe more IOPS also help to improve iSCSI performance.

As I mentioned before, I'm still learning and the above is just my interpretation of what I have read up till now. Can someone verify or correct this?
 
Joined
May 27, 2011
Messages
566
Quick note: dd measures throughput, not IOPS.


To give an accurate read on performance, the block size must be larger than the minimum amount of data that can be addressed on the pool. For example, if you have an 8-disk raidz2, you must access 6 drives at a time. Classically, drives are addressed in 512-byte chunks; the newer drives are addressed in 4096-byte chunks. So classically that would be an aggregate size of 3k bytes, and with the newer drives it would be 12k bytes.

Imagine if we tried to write in 16-byte chunks. Each write would not fill up an entire block on even 1 disk. This being ZFS, a copy-on-write filesystem, it gets even more complicated: the first write would get a new place on the disk to write to and would write 16 bytes to one of the disks, leaving the others underutilized. The next write of 16 bytes (this being copy-on-write) would have to read the contents of the place it just wrote (cache helps) and then write it, plus an additional 16 bytes, to another spot on one of the disks, again leaving the others underutilized. Long story short, writing small blocks will not fully utilize your pool; you want to choose a size that will always write to all of your disks. In addition to that, there is more to consider: to write to the disk the kernel has to get involved, and passing off control takes time, so it is best to get it all done at once, or at least fully utilize the time given by always having something to write. Otherwise you will not be fully utilizing your access to the disks.


Long story short, write in big chunks - 2048k is plenty; that's 2MB, which is pretty large when it comes to file I/O operations.


Example, writing 2 gigs (fits easily into my cache, btw, except for the last test):
dd if=/dev/zero of=/mnt/storage/users/gbates/test.txt bs=2048k count=1k
2147483648 bytes transferred in 3.880758 secs (553367054 bytes/sec)
550MB/s

dd if=/dev/zero of=/mnt/storage/users/gbates/test.txt bs=2048 count=1m
2147483648 bytes transferred in 11.030402 secs (194687706 bytes/sec)
200MB/s

dd if=/dev/zero of=/mnt/storage/users/gbates/test.txt bs=2 count=1g

44621886 bytes transferred in 173.163964 secs (257686 bytes/sec) (I decided not to wait all the way and killed it)
256 KB/s
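
One other thought on the IOPS point above: if you want a rough feel for the multiple-client case, you can run several dd writers in parallel against the pool. It's not a proper IOPS benchmark by any means, just a quick sanity check, and the path below is a placeholder:

Code:
#!/bin/sh
# crude multi-writer test: 4 parallel dd streams to the same pool (~5GB each)
POOL=/mnt/test
for i in 1 2 3 4; do
    dd if=/dev/zero of=$POOL/ddfile.$i bs=2048k count=2500 &
done
wait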
 