ZFS and 4K


jpaetzel

Guest
We've been playing more and more with the "force 4K" option in ZFS volume creation.

It has some caveats: L2ARC devices and hot spares don't carry pool metadata, so they can get lost if the OS changes their device names.

It seems to provide a performance boost across a wide range of devices, not just those that have native or emulated 4K sectors.

This would be a great place for people to post some iozone stats of their pools with and without the 4K option. We are moving towards making it the default and always using it.
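If anyone wants to A/B this from the command line rather than via the GUI checkbox, the commonly described trick on FreeBSD is a gnop(8) shim: build the pool on 4K-sector passthrough devices so ZFS picks ashift=12, then export, destroy the shims, and re-import. A rough sketch, with hypothetical device names ada0/ada1 and pool name test:

Code:
# create 4K-sector shims on top of the raw disks
gnop create -S 4096 /dev/ada0 /dev/ada1
# build the pool on the .nop devices so ZFS selects ashift=12
zpool create test mirror ada0.nop ada1.nop
# the ashift sticks to the pool, so the shims can be dropped
zpool export test
gnop destroy /dev/ada0.nop /dev/ada1.nop
zpool import test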

To keep test sizes small, I recommend setting the following tunable:

hw.physmem 4294967296

Once you've set this tunable you'll need to reboot. If you have the autotuner enabled, or any kmem_max or arc_max type tunings, you'll need to disable those as well; alternatively, raise the -s test size to something larger than RAM so the test file can't be cached. But beware: with large test sizes this test can run for a very long time.
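Note that hw.physmem is a boot-time loader tunable rather than a sysctl, so it belongs in /boot/loader.conf (or as a loader-type tunable in the GUI). A minimal sketch:

Code:
# /boot/loader.conf -- cap usable RAM at 4GB so the 6GB test file
# can't fit in ARC; remove the line and reboot again when done
hw.physmem="4294967296"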

Then run the following test and post the results here:

Code:
iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2
 

cyberjock

Inactive Account
Just to clarify a few things:

Can you provide the command to check whether 4K is enabled? I remember someone posted a command to check a zpool.

This won't destroy any data in the zpool, correct?

Just trying to get an idea of what you want since I can give you several data points...
 

jgreco

Resident Grinch
Code:
# zdb -C yourpoolname | grep ashift
ashift=9

That's 512-byte sectors (2^9); ashift=12 means 4096 (2^12).
 

Stephens

Patron
It seems to provide a performance boost across a wide range of devices, not just those that have native or emulated 4K sectors. [...] We are moving towards making it the default and always using it.

Earlier in my FreeNAS usage I commented that it should just be the default, with people having to disable it explicitly. I said that at the time because, at least under Windows, I had tested both ways and there was no performance hit from using 4K even on non-4K drives. All that really happens is you lose a little space at the beginning of the drive (because of where the partitions are started so that they are 4K aligned).

Unfortunately I don't have any FreeNAS systems where I can test creation with and without 4K enabled, but I look forward to reading the results. People should note which drives they're using, and even the firmware if possible, because some drives are 512e (4K drives emulating 512 bytes/sector); one way to capture that info is sketched below. Microsoft had to patch Windows 7 a few times before it accounted for 512e. That said, I'm firmly convinced the 4K option either helps or doesn't hurt.
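Here's a sketch of how to record the model, firmware, and reported sector sizes; ada0 is an example device, and the exact fields shown depend on your smartmontools and FreeBSD versions:

Code:
# drive model and firmware version (requires smartmontools)
smartctl -i /dev/ada0
# logical vs. physical sector size as the drive reports it;
# a 512e drive reports logical 512 and physical 4096
camcontrol identify ada0 | grep "sector size"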
 

jgreco

Resident Grinch
Wow, that took like 10 hours just to run the reclen=4 pass... the random read/write tests are killers.
 

bollar

Patron
Put me in the 4K camp, if only because many of us are going to eventually come across drives that have 4K blocks and it's difficult to undo the 512 decision.
 

cyberjock

Inactive Account
Put me in the 4K camp, if only because many of us are going to eventually come across drives that have 4K blocks and it's difficult to undo the 512 decision.

This is biting me on one FreeNAS server. We thought we were 24-48 months from needing more space; we built the server, and six months later we're looking at upgrades. The fact that I didn't check the 4K box is going to force me to create a whole new zpool and do a data copy. Not my idea of fun, and not the way I was hoping to upgrade. I knew better when I built the zpool, and I thought I had checked the box, because I'd read it wouldn't hurt performance on 512-byte sector drives. But on checking, I did not choose 4K. Whoops :(
 

Stephens

Patron
Don't feel bad. I enabled 4K on my first build and STILL somehow didn't enable it on my 2nd build. I've no idea how I missed it. I keep saying that before I put data on my 3rd build, I'll copy all the data off build 2, redo the pool, then copy all the data back. They're the same size, so even though it'd take a long time, at least I wouldn't have to babysit it. But so far it's an investment I haven't had time for. My NAS2 still performs basically like NAS1, because the disk I/O is probably faster than my NIC even with 4K drives running without the 4K option, but I'd imagine things like scrubs might go slower than they could.

As I said, though, given that enabling 4K basically doesn't hurt anything and has the potential to improve things, I do think it should default to enabled.
 

cyberjock

Inactive Account
Don't feel bad. I enabled 4K on my first build and STILL somehow didn't enable it on my 2nd build. [...] As I said, though, given that enabling 4K basically doesn't hurt anything and has the potential to improve things, I do think it should default to enabled.

Great! I'm not the only idiot :D

I do hope that I can copy the data from one zpool to the other, delete the old zpool, then "rename" the new zpool to replace the old zpool without losing all of my CIFS, FTP, etc. settings. Otherwise I'm basically in a bind where the only option is to literally redo the entire setup from scratch (yuk!).
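For what it's worth, the usual approach is a recursive snapshot plus zfs send/receive, then re-importing the new pool under the old pool's name so path-based settings keep pointing at the same place. A rough sketch with hypothetical pool names oldpool and newpool; verify the copy before destroying anything:

Code:
zfs snapshot -r oldpool@migrate                        # snapshot every dataset
zfs send -R oldpool@migrate | zfs recv -Fdu newpool    # replicate the whole tree
# ...verify the data arrived intact on newpool...
zpool destroy oldpool                                  # point of no return
zpool export newpool
zpool import newpool oldpool                           # take over the old name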
 

jgreco

Resident Grinch
FreeNAS 8.3.0, fresh install.
Supermicro X9SCi-LN4F, E3-1230, 32GB (hw.physmem=4G), 4 x Barracuda 400GB on system's SATA-II ports, in RAIDZ2:

Code:
[root@freenas] /mnt/test# zdb -C test | grep ashift
                ashift: 9
[root@freenas] /mnt/test# iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2
[bla bla bla snipped]
        Command line used: iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2
        Output is in Kbytes/sec
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
                                                            random  random    bkwd   record   stride
              KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
         6291456       4  106202   29524    46544    47849     599     334
         6291456       8   95938   28983    46254    48291     996     636
         6291456      16   93824   28818    45907    47629    1954    1248
         6291456      32   92598   28883    45874    46695    4202    2448
         6291456      64   89945   29304    44837    45010    8300    4826
         6291456     128   84256   90728    43516    43715   11333   37559

iozone test complete.


Code:
[root@freenas] /# zdb -C test | grep ashift
                ashift: 12
[root@freenas] /mnt/test# iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2
[bla bla bla snipped]
        Command line used: iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2
        Output is in Kbytes/sec
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
                                                            random  random    bkwd   record   stride
              KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
         6291456       4  100419   28237    43718    44889     457     303
         6291456       8   86266   28166    43642    44861     880     604
         6291456      16   93075   28926    45437    45765    1815    1187
         6291456      32   86003   28960    44247    46337    3710    2322
         6291456      64   85092   29000    44196    45219    7352    4537
         6291456     128   80625   85481    41768    42905   10116   43877

iozone test complete.


Now with two 240GB SSDs in a mirror (sadly on the SATA-II ports)

Code:
[root@freenas] /mnt/testssd# zdb -C testssd | grep ashift
                ashift: 9
[root@freenas] /mnt/testssd# iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2
[bla bla bla snipped]
        Command line used: iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2
        Output is in Kbytes/sec
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
                                                            random  random    bkwd   record   stride
              KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
         6291456       4  186902   82847   177061   180873    8245    3528
         6291456       8  209090   88082   180880   182844   16660    7056
         6291456      16  186810   91870   183253   187065   33129   13736
         6291456      32  185965   90136   186950   191158   67443   26821
         6291456      64  201854   86497   192631   198157  128729   50541
         6291456     128  198310  214965   191914   192704  182395  192876

iozone test complete.


Code:
[root@freenas] /mnt/testssd# zdb -C testssd | grep ashift
                ashift: 12
[root@freenas] /mnt/testssd# iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2
[bla bla bla snipped]
        Command line used: iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2
        Output is in Kbytes/sec
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
                                                            random  random    bkwd   record   stride
              KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
         6291456       4  238942   94072   174363   177584    8493    3950
         6291456       8  274843   98074   186346   188893   16733    7856
         6291456      16  271598  102027   188416   191395   34132   15501
         6291456      32  279933  104451   188529   192079   66964   30293
         6291456      64  235307   97711   197548   209848  121128   57591
         6291456     128  259758  262617   194024   194805  184132  266929

iozone test complete.


Now with a single 60GB SSD (on the SATA-III port)

Code:
[root@freenas] /mnt/test60ssd# zdb -C test60ssd | grep ashift
                ashift: 9
[root@freenas] /mnt/test60ssd# iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2
[bla bla bla snipped]
        Command line used: iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2
        Output is in Kbytes/sec
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
                                                            random  random    bkwd   record   stride
              KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
         6291456       4  205333  106121   243079   241766   11250    4485
         6291456       8  208054  111421   255582   259015   25228    8975
         6291456      16  228880  110152   264196   270319   51510   17646
         6291456      32  220177  115673   266410   278327  105952   34555
         6291456      64  218114  109102   272136   272700  211696   64438
         6291456     128  224765  231820   269414   272354  274586  223875

iozone test complete.


Code:
[root@freenas] /mnt/test60ssd# zdb -C test60ssd | grep ashift
                ashift: 12
[root@freenas] /mnt/test60ssd# iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2
[bla bla bla snipped]
        Command line used: iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2
        Output is in Kbytes/sec
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
                                                            random  random    bkwd   record   stride
              KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
         6291456       4  469196  128716   250168   250522   12019    6108
         6291456       8  461845  134429   256652   259380   25676   12160
         6291456      16  479646  151294   263990   275338   54212   24183
         6291456      32  479113  152255   269113   275876  108343   47277
         6291456      64  388508  154130   279044   282559  218568   89127
         6291456     128  476120  373978   277481   280950  281006  471094

iozone test complete.
 

david31262

Dabbler
As I said, though, given that enabling 4K basically doesn't hurt anything and has the potential to improve things, I do think it should default to enabled.

What happens when you select 4K and store small files?
E.g., with 6 drives in RAIDZ1, wouldn't the smallest storable allocation be 4096 * 5 = 20,480 bytes? Storing lots of small text files would eat more space (though probably not a big deal).
 

Stephens

Patron
I wouldn't bet a paycheck on it, but I'm pretty sure RAIDZ1 doesn't work that way. If you have a 10-byte file, a 512-byte sector/block wastes 502 bytes and a 4K sector/block wastes 4,086 bytes. Have 1,000 of those files and you'd be wasting a grand total of, what, 4MB? Yeah, I'd call that "not a big deal."

As an aside, if you were worried about storing a lot of small text files, you'd have compression enabled on that dataset, right?
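For anyone curious, you can see the per-file overhead directly by comparing a file's logical size with its allocated size; a quick sketch, assuming a dataset mounted at /mnt/test with compression off:

Code:
echo "ten bytes" > /mnt/test/tiny.txt
ls -l /mnt/test/tiny.txt    # logical size: 10 bytes
du -h /mnt/test/tiny.txt    # allocated: at least one ashift-sized block, plus parity on RAIDZ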
 

cyberjock

Inactive Account
Some filesystems will store files smaller than a certain size within the filesystem metadata itself. Not sure whether ZFS does this, though.
 

ProtoSD

MVP
I wish I could find the link to another article that talked about ZFS dynamically adjusting sector size/alignment. The quote below hints at it, but the link I'm thinking of described things in much better detail.

ZFS will dynamically adjust block size for a file, between the smallest block size the media supports and 128k or so (IIRC). That means that even if you align a partition on your 4k disk, or use the raw disk itself (so ZFS starts on an aligned sector), after the first small file is written you'll be doing un-aligned IOs.

Anyway, this one is interesting:

http://wiki.openindiana.org/pages/viewpage.action?pageId=4883847

ZFS allocates space using a logical sector size that is a power of 2 that is less than or equal to the physical sector size.

Unfortunately, some HDD manufacturers do not properly respond to the device inquiry sizes. ZFS looks to the physical sector size (aka physical block size) for its hint on how to optimize its use of the device. If the disk reports that the physical sector size is 512 bytes, then ZFS will use an internal sector size of 512 bytes. The problem is that some HDDs misrepresent 4KB sector disks as having a physical sector size of 512 bytes. The proper response should be that the logical sector size is 512 bytes and the physical sector size is 4KB. By 2011-2012, most HDDs were properly reporting logical and physical sector sizes. In some cases, the HDD vendors advertise the disks as "emulating 512 byte sectors" or "512e."

There is no functional or reliability problem with 4KB physical sectors being represented as 512 byte logical sectors. This technique has been used for decades in computer systems to allow expansion of device or address sizes. The general name for the technique is read-modify-write: when you need to write 512 bytes (or less than the physical sector size) then the device reads 4KB (the physical sector), modifies the data, and writes 4KB (because it can't write anything smaller). For HDDs, the cost can be a whole revolution, or 8.33 ms for a 7,200 rpm disk. Thus the performance impact for read-modify-write can be severe, and even worse for slower, consumer-grade, 5,400 rpm or variable speed "green" drives.
Bottom line: the HDD needs to properly communicate its physical block size via the inquiry commands for best performance.
Inside ZFS, the kernel variable that describes the physical sector size is ashift, the log2 of the sector size in bytes. A search of the ZFS archives will find references to the ashift variable in discussions about sector sizes.
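On the FreeBSD side, you can see what the drive's inquiry data looks like with diskinfo; on releases that report it, stripesize reflects the physical sector size. A sketch (ada0 is an example device):

Code:
diskinfo -v /dev/ada0
# sectorsize = logical sector size (512 on a 512e drive)
# stripesize = physical sector size (4096 on a 512e drive, 0 if not reported)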
 

paleoN

Wizard
Where are everyone's results? Only jgreco has posted his so far. Older 512-byte drives and SSDs are going to be the most interesting, IMO.

FreeNAS 8.3.0
ASUS F1A75-V Pro, AMD A6-3500, 8GB (hw.physmem=4G), 2 x ST2000DM001 2TB - Mirror:

Actual values during run:
Code:
hw.physmem: 3197485056
kstat.zfs.misc.arcstats.c: 2029002752

Code:
[root@freenas] /mnt/test# zdb -C test | grep ashift
                ashift: 9
[root@freenas] /mnt/test# iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2
[bla bla bla snipped]
        Command line used: iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2
        Output is in Kbytes/sec
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
                                                            random  random    bkwd   record   stride
              KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
         6291456       4  100616   46633    92798   100704     761     409
         6291456       8   55423   35554    91512   105009    1531     812
         6291456      16   77826   21561    54755    88844    2907    1676
         6291456      32   45328   33088    76824    89669    6036    3422
         6291456      64   98447   46766   103489   108605   11300    8013
         6291456     128   85592   86007    92539   100243   18949   78134

iozone test complete.

Code:
[root@freenas] /mnt/test# zdb -C test | grep ashift
                ashift: 12

[root@freenas] /mnt/test# iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2
[bla bla bla snipped]
        Command line used: iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2
        Output is in Kbytes/sec
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
                                                            random  random    bkwd   record   stride
              KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
         6291456       4  107555   50994    98546   106287     765     558
         6291456       8   89673   49022   103231   113299    1518    1110
         6291456      16  108438   46409    94900    99484    2973    2206
         6291456      32   87339   47717    96337    97294    6119    4357
         6291456      64  115421   48290   101637   109240   11293    8384
         6291456     128   99709  140365   100016   108643   17919  128010

iozone test complete.
With hw.physmem set to 4G, prefetch is also disabled (FreeBSD turns ZFS prefetch off when it sees less than 4GB of RAM). Perhaps that's desired? I was curious, so here are the same runs with prefetch not disabled:
Code:
hw.physmem: 3197485056
kstat.zfs.misc.arcstats.c: 2029002752
vfs.zfs.prefetch_disable: 0

Code:
[root@freenas] /mnt/test# zdb -C test | grep ashift
                ashift: 9
[root@freenas] /mnt/test# iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2
        [bla bla bla snipped]
        Command line used: iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2
        Output is in Kbytes/sec
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
                                                            random  random    bkwd   record   stride
              KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
         6291456       4  101359   45712   178495   179385     762     409
         6291456       8   61468   43156   175079   182614    1512     832
         6291456      16   67257   23144   150330   142905    2895    1673
         6291456      32   44611   29356   149251   151285    5993    3436
         6291456      64   97131   44316   179723   180889   12993    8026
         6291456     128  104533  112411   172668   173066   17381   98941

iozone test complete.

Code:
[root@freenas] /mnt/test# zdb -C test | grep ashift
                ashift: 12
[root@freenas] /mnt/test# iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2
       [bla bla bla snipped]
        Command line used: iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2
        Output is in Kbytes/sec
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
                                                            random  random    bkwd   record   stride
              KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
         6291456       4  106214   50939   189460   190588     765     558
         6291456       8   90025   47677   188077   191035    1507    1109
         6291456      16  110534   46325   163616   169415    2998    2207
         6291456      32   85315   46538   167575   178849    6119    4368
         6291456      64  130062   51141   177690   184148   12478    8499
         6291456     128  132394  120901   179141   181863   17565  121734

iozone test complete.
 

jgreco

Resident Grinch
Where are everyone's results? Only jgreco has posted his so far. Older 512-byte drives and SSDs are going to be the most interesting, IMO.

Ja, I was wondering that too. Your results tend to validate my suspicions: 4K should be the default. On modern drives, anywhere that 4K happens to be slower, it's by fractions of a hair, but where it's faster, it may be by a large margin. On older drives, the 512 is a better choice, possibly by a large margin.

However, soon no one will be buying 512 drives, and even if you have a system where it performs better, if you decide to expand your pool size by replace/resilver/repeat, you'd be better off taking the short-term performance hit of 4K on your old cruddy drives so that you wind up in happyland later.

It seems very likely that it would be a better idea to default to 4K and then allow users who really want it to select 512. I'd also love to see the language changed; as it is, it isn't really clear that the "4K" option actually works for 512-byte drives as well.
 

cyberjock

Inactive Account
Soon I'll have access to 8 x 1.5TB drives with 512-byte sectors, and I'm going to run all of these tests. Just waiting for the disks to be available.
 

tingo

Contributor
I can't redo my pool now, so this is just the way it is currently set up, OK?
FreeNAS 8.3.0, Asus C60M1-I, 16 GB RAM, 6 x ST3000DM001, RAIDZ1

values:
Code:
[root@kg-f5] /mnt/z5/h-tingo/test# sysctl hw.physmem
hw.physmem: 16742060032
[root@kg-f5] /mnt/z5/h-tingo/test# sysctl kstat.zfs.misc.arcstats.c
kstat.zfs.misc.arcstats.c: 11838246179

test
Code:
[root@kg-f5] /mnt/z5/h-tingo/test# zdb -C z5 | grep ashift
                ashift: 12
[root@kg-f5] /mnt/z5/h-tingo/test# iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2

    Command line used: iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2
    Output is in Kbytes/sec
    Time Resolution = 0.000002 seconds.
    Processor cache size set to 1024 Kbytes.
    Processor cache line size set to 32 bytes.
    File stride size set to 17 * record size.
                                                            random  random    bkwd   record   stride                                  
              KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
         6291456       4   63605   56935   230524   148833  173698    5654                                                         
         6291456       8   92586  104812   397195   403990  307068   10628                                                         
         6291456      16  115676  141576   570036   578388  475816   20038                                                         
         6291456      32  128819  158171   746870   796512  696080   49353                                                         
         6291456      64  143069  149511   956121   972336  908076  100482                                                         
         6291456     128  188834  191536  1048040  1095608 1062198  197302                                                         

iozone test complete.

HTH
 

cyberjock

Inactive Account
So how long should these tests take? I did a dd on my zpool and got 825MB/sec on 18 x 2TB in RAIDZ2 with the 4K option selected and 4K-sector disks. I ran the iozone command and got the first line of results (reclen=4), but that's all, and the test has been running for 3 hours... LOL
 

jgreco

Resident Grinch
So how long should these tests take? I did a dd on my zpool and got 825MB/sec on 18 x 2TB in RAIDZ2 with the 4K option selected and 4K-sector disks. I ran the iozone command and got the first line of results (reclen=4), but that's all, and the test has been running for 3 hours... LOL

Given the nature of what is being asked for, it is unclear that there is any value to doing these tests unless you can do it with both ashift values. Without something to compare to, I would think it difficult to judge the impact of the setting on performance.

In any case, the iozone test will take forEVER because it is doing a large number of seeks. Look at tingo's output two posts above:

The first line, where reclen is 4, means that iozone worked on a 6291456KB (6GB) file with 4K records. It first blasted the file out with 4K write requests, getting a speed of 63605KB/sec, then did the same thing as a REwrite (overwriting the file in place). Then it read the entire file, then REread it (which flies if the file is in cache). All of those should be relatively fast.

The last two tests are the killers. iozone reads the entire file picking records at random, which means that anything not already in cache, ARC, or L2ARC requires seeking all around the disk to acquire blocks. You'll notice tingo's random read speed was pretty good, but that's because he didn't lower hw.physmem as requested, so the file could fit in ARC. Then the same thing happens with writes, which takes even longer, because replacing random records within the file is far more work than writing it sequentially.
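For reference, here's the test command again with the phases annotated (flag meanings per the iozone manual):

Code:
# -s 6g : 6GB test file -- keep this larger than RAM/ARC
# -r 4k ... -r 128k : record sizes to test
# -i 0  : sequential write / rewrite
# -i 1  : sequential read / reread
# -i 2  : random read / random write (the slow part)
iozone -r 4k -r 8k -r 16k -r 32k -r 64k -r 128k -s 6g -i 0 -i 1 -i 2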

It took two or three days to run on my old Barracuda 400GB pool (it's okay, laugh).
 