Horrible performance, even with sync=disabled

Status
Not open for further replies.

oehTie

Cadet
Joined
Feb 11, 2014
Messages
8
Hello everyone!

First of all, I'm new to the forum, so if I make any mistakes that seem to have obvious solutions, please forgive me.

I own a small hosting company providing virtual machines, webhosting, and some other services. Most of these run from a dedicated storage machine which needs replacement.

Current Config:

Supermicro DIS821
2x Intel Xeon E5-2620, 6 cores with HT
128GB RAM
LSI 9207-8i connected to 6x 256GB SSDs for cache or ZIL
2x LSI 9207-8e

Supermicro 45bay JBOD
10x Western Digital 2TB nearline SAS disks in the front of the JBOD; the JBOD has two connections to each HBA in the head. Backplanes are not cascaded.

FreeNAS 9.2.1 is installed.

Here are the results of running iozone with sync set to 'always':
Code:
        Auto Mode
        Using maximum file size of 131072 kilobytes.
        Command line used: iozone -a -g 128m
        Output is in Kbytes/sec
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
                                                            random  random    bkwd  record  stride
              KB  reclen  write rewrite    read    reread    read  write    read  rewrite    read  fwrite frewrite  fread  freread
              64      4  164099  492717  1310683  1484662 1336792  557144 1124074  507625  1526887  551422  576282 1232453  1562436
              64      8  581273 1066042  3022727  3022727 2006158 1087638 2467108  1087638  2892445  1105556  1143223 1780008  2561267
              64      16  998622 2067979  5283570  5283570 4274062 2067979 3541098  1991276  4564786  1991276  2133730 1828508  4018152
              64      32 2133730 3738358  7940539  9318832 7100397 3541098 4897948  3363612  8182586  3363612  3738358 1392258  5283570
              64      64 3738358 5735102 12902017  7940539 9006179 3958892 5860307  5735102  9318832  5735102  5735102 3958892  5860307
            128      4  144483  609522  1598754  1727352 1526043  574312 1346196  576780  1455701  592699  603357 1543594  1543594
            128      8  427851 1142750  2985839  3199360 2499332 1111981 2511022  1114289  2843510  1142750  1142750 2464907  2673584
            128      16  826202 2098744  5122535  5122535 4407601 2035099 3895854  1911894  4717434  2035099  2132084 3657016  3895854
            128      32 1967960 3657016  8036304  8036304 6812590 3657016 5545860  3657016  7082197  3657016  3895854 4596273  5122535
            128      64 3359523 6114306 11720614 11720614 9129573 6114306 7476717  5784891  9795896  5784891  6114306 4889281  6045455
            128    128 3277486 7476717 12842051 14200794 10567140 8414153 8036304  8548124 10567140  8036304  8548124 1539168  6406138
            256      4  335459  567575  1707589  1743637 1489641  544823 1506359  526652  1463250  560172  560172 1552085  1570244
            256      8  731528 1123389  3123106  2984226 2819657 1066483 2350543  1058076  2783115  1094754  1112909 2419396  2692393
            256      16 1347557 2030503  4332998  5117791 4496299 1924938 4070199  1911233  4734192  2000242  2030503 3705040  3935921
            256      32 2371308 3705040  6891544  7735574 6936061 3465855 6398720  3454704  7518900  3605511  3654598 4929815  5347168
            256      64 3823789 6107548  7791708 11091721 9114515 5841722 8271916  5687020 10245071  5810112  5569035 5569035  5569035
            256    128 6761355 8534922 12090911 12090911 10651598 8271916 7735574  8271916 11569783  8271916  8534922 5320671  6554972
            256    256 6249745 7965107 12228612 12228612 11207494 7735574 9518507  7735574 11569783  7735574  7965107 1570244  6398720
            512      4  423833  579804  1723770  1741948 1542947  557671 1570020  539459  1523247  572081  573916 1556366  1574625
            512      8  844865 1120344  3117691  3177664 2830083 1075459 2681653  1026126  2815243  1098566  1105351 2638816  2691737
            512      16 1476130 2032051  4827914  5019764 4485083 2007358 4228947  1947291  4411377  2047551  2032051 3684733  3970896
            512      32 2474613 3736016  7309196  7758090 6652558 3580298 6652558  3556580  7620440  3628697  3710197 5227492  5384786
            512      64 3736016 6104175  8844453 10694336 9638369 5624545 9145790  5951911 10485468  5886650  6174377 6086873  6491676
            512    128 7309196 8992598 10641343 12499490 11138071 8394979 9814569  8394979 12146009  8665997  8808176 6104175  6736026
            512    256 6931711 8140399 12146009 11943357 11374040 8140399 10485468  8140399 11877300  8109658  8234036 5566231  6571132
            512    512 6821616 8018812 12797441 12797441 12215097 7988981 11138071  8109658 12499490  8265729  8394979 1506154  6736026
            1024      4  444024  585160  1735476  1738286 1565902  560495 1580306  544645  1563622  581831  581831 1590254  1585557
            1024      8  897438 1115632  3003881  3055164 2822283 1077836 2837198  1040496  2829721  1097669  1109579 2668003  2672984
            1024      16 1563622 2044438  4854136  4966395 4474829 1983996 4415030  1920993  4474829  2048338  2052253 3908759  3908759
            1024      32 2516377 3751699  7524394  7631350 7020149 3591693 6918376  3495237  6650556  3696803  3696803 5336651  5417427
            1024      64 3618930 6128613 10255274 10557785 9743447 5958564 9485232  5660167 11018226  6059442  6172653 6479979  6479979
            1024    128 7524394 8680108 11903823 12173747 11368190 8680108 11132462  8610501 12348754  8840915  8914314 6569179  6918376
            1024    256 7208671 8457895 11903823 12037272 11489839 8310603 11018226  7811791 12348754  8391792  8199542 5991815  6559147
            1024    512 7319232 8326715 12790037 12790037 12492426 8122014 11936907  8262639 12492426  8326715  8326715 5472650  6744549
            1024    1024 7208671 8391792 13305116 13305116 13142265 8262639 12173747  8474583 13142265  8391792  8391792 1548960  6863100
            2048      4  481314  578844  1705432  1715309 1531781  556674 1569280  539101  1556202  570160  573051 1553669  1564421
            2048      8  929666 1110699  3043048  3034448 2772014 1073365 2778290  1035954  2763987  1099890  1100454 2653011  2676988
            2048      16 1603854 2049709  4785996  4971585 4441984 1978880 4472047  1930406  4490751  2031771  2023634 3894239  3984558
            2048      32 2528827 3657149  7503399  7705319 6990474 3525079 6823876  3489282  6990474  3644736  3624742 5461535  5475461
            2048      64 3624742 6185123 10240672 10389302 9483197 6024617 9653719  5937172  9841749  6041567  6167360 6608629  6690992
            2048    128 7643611 9101380 11770166 12274742 11564174 8713618 11320334  8713618 13118295  9024882  9024882 6649556  6945257
            2048    256 7318020 8398403 11770166 11983630 11305435 8325147 11440955  8067138 12488897  8398403  8390200 6189580  6563185
            2048    512 7419150 8390200 12398764 12805399 12561952 8292998 11900619  8292998 12561952  8325147  8357547 5970183  6670210
            2048    1024 7451329 8464610 13058467 12959958 13058467 8390200 12710657  8464610 13118295  8498106  8398403 5626082  6873016
            2048    2048 6922868 8498106 13138359 13058467 12959958 8398403 12488897  8498106 12635867  8253158  8608824 1486197  6759439
            4096      4  482963  574638  1702358  1713906 1549158  555698 1577900  538236  1548041  576741  576741 1566390  1572987
            4096      8  939468 1102842  3077833  3112404 2780451 1062466 2850576  1031712  2682340  1092255  1093715 2691164  2700046
            4096      16 1608179 2037749  4847858  4964124 4452120 1993069 4535572  1936675  4501111  2046000  2046000 3961410  4000150
            4096      32 2582737 3734100  7515892  7625987 7050159 3592757 6989921  3530734  7061751  3657777  3700320 5476867  5535098
            4096      64 3627654 6207035 10089154 10476771 9868918 6077484 9801354  6022095  9840653  6234064  6263611 6595451  6782940
            4096    128 7935986 9247398 11800935 12049235 11664722 8883968 11539362  8770583 11736440  9062041  9183139 6716643  7073381
            4096    256 7050159 2405199 10190895 11973655 11570449 6022095 11601703  5689064 12765490  6066754  6086096 6112079  6555186
            4096    512 3084464 8276241 12334720 12718238 12194633 8276241 12379160  8308260 12532680  8340529  8276241 5810280  6714018
            4096    1024 7656576 8603675 13007113 13217259 13086376 8480511 12880338  8518356 13036724  8518356  8535284 5225329  6861502
            4096    2048 7277121 8514134 13046624 13046624 12997272 8443002 12450933  8535284 12841826  8586475  8586475 4007615  6680080
            4096    4096 3308398 7788478 12756011 12756011 12718238 8093001 12496216  8552280 12765490  8340529  8638284  861413  5081590
            8192      4  466839  566881  1703861  1712779 1541016  551876 1574487  535525  1508938  570202  570591 1556795  1563027
            8192      8  909521 1096640  3075025  3107564 2736125 1059053 2813188  1039446  2757203  1099905  1101068 2697461  2695768
            8192      16 1582391 2058849  4833405  4943985 4464155 1997565 4520538  1958846  4469381  2048538  2049027 3932757  3949937
            8192      32 2509877 3735645  7361004  7620585 7049878 3634487 7044097  3559928  7055669  3722290  3728753 5400092  5505657
            8192      64 3459566 6099933 10015063 10381195 9812005 5984134 9848567  5932474  9786852  6099933  6094523 6655245  6758664
            8192    128 7124429 8837822 11378128 11922989 11569692 8772387 11569692  8725605 11538609  8895021  8772387 6793408  7044097
            8192    256 6959913 8136708 11393220 11788003 11683783 8299843 11699697  8192973 11703682  8419841  8315913 6248582  6480761
            8192    512 6879094 8192973 11699697 12338289 12338289 8232232 12189466  8275854 12619239  8183217  8358395 5893805  6558696
            8192    1024 6464909 7930135 12432038 12445547 12679779 8307870 12189466  8428103 13085407  8384911  8368573 5357988  6379683
            8192    2048 5554610 7678487 10981742 11977018 11755738 8208631 11236730  8515834 12860118  8332045  8342160 4227431  5801260
            8192    4096 4039563 5897851  9728660 10151170 9977258 6814966 10088578  8054689 12840894  7413415  6469778 1371054  4007524
            8192    8192 2974661 3496175  6506533  6687628 6692839 3873350 6177802  3890014  6229323  3856828  3723903  758596  1726810
          16384      4  476124  182030  1499285  1555943 1417077  333116 1475116  292576  1420504  335374  328488 1415326  1445058
          16384      8  872652  989014  2559212  2662843 2503641  974429 2575711  1065421  2452007  1028768  981667 2319346  2390425
          16384      16 1320644 1588345  3803210  4053336 3822674 1694317 3861337  1953045  3783944  1785115  1649776 3244881  3385895
          16384      32 1879337  304055  4931852  5477922 5477922 1724245 5501604  2070998  5338759  1809750  1663353 4091953  4336761
          16384      64 2366959 3176338  6255296  6789921 6954141 3888870 6974609  5864034  6820923  4263310  3523568 4679950  4996397
          16384    128 3746603 3858735  6263278  6930995 7674033 4882098 7571721  8724282  7349829  5642083  4399509 4495051  4850396
          16384    256 3710193 3859602  6787239  7550092 7952901 4738030 8095305  8102942  7839493  5550036  4285644 4507434  4832999
          16384    512 3679402 3725479  7062054  7710195 8275660 4753105 7865515  8168437  8501933  5535284  4301471 4532108  4930437
          16384    1024 3585899 3633681  7370324  7911698 8423769 4524648 8282642  8295640 13087415  5437611  4145769 4430425  4978299
          16384    2048 3475629 3447729  7331011  7858319 8180105 4157306 7393320  8299648 12840428  5237842  3819912 4004910  4718834
          16384    4096 3188127 3291193  7170374  7613666 7468042 3607358 7176364  8159708 12799769  4560985  3427609 1997844  3901012
          16384    8192 2794973 2893597  5300463  5253057 5236246 2962198 4973975  3663708  5647647  3314049  2922143 1052078  1771814
          16384  16384 2640030 2711704  3312611  3261204 3295296 2672681 3256722  2695641  3272853  2713095  2724281  733556  1546836
          32768      64 1283003 2472332  5127208  5424264 5479627 2706993 5834610  5917001  5553805  2696319  2468424 3671029  2893707
          32768    128 2999095 2920206  5306358  5707643 5809945 3007363 5646674  8560078  6010150  3309564  2944354 3978570  4171541
          32768    256 2994455 2955561  5627484  6031250 6245009 3033784 5945927  8159129  6377735  3328721  2999618 4065542  4211809
          32768    512 2998571 2963016  5999394  6288153 6490070 3046224 6341832  8082834  6725728  3285356  2954164 4170528  4370243
          32768    1024 3045954 2997983  6248700  6554760 6728362 3070177 6393756  8245773  7084918  3163238  2999880 4460014  4892144
          32768    2048 3184789 3014818  6413747  6822550 6766123 3076499 6526126  8375926 13049894  3191593  2080901 4023647  4399622
          32768    4096 2933545 2953910  6402692  6705384 6690043 3024038 6117967  8044984 12337643  3520843  3020914 2564408  3718505
          32768    8192 2641948 2706940  4568232  4626202 4596496 2752919 4477158  3632894  5045148  2893707  2700716 1357813  1842251
          32768  16384 2529762 2575508  3120505  3105627 3114777 2753636 3631166  2706727  3132954  2591388  2618742  985919  1541645
          65536      64 1173830 2375169  4993258  5186786 5016129 2367804 4993984  5928206  5168062  2427439  2371562 3966331  4082437
          65536    128 2687876 2749939  5126424  5448650 5382395 2796492 1547746  5798034  6207748  1993316  2369600 3901423  3997072
          65536    256 2728186 2771902  5445520  5721163 5759402 2802965 5561875  8181911  5802685  2867200  2787700 3977926  4116614
          65536    512 2778739 2804767  6390754  6747158 6317317 2855108 5912649  8221801  6199628  2122481  2317477 4105792  4263602
          65536    1024 2769891 2849869  5950149  6209010 6252937 2846534 6038511  8406359  6360292  2863466  2813667 4162621  4316631
          65536    2048 2821668 2818861  6033077  6279219 6289420 2848894 6003692  8487080  6376966  2872233  2833740 4063007  4298876
          65536    4096 2779779 2807403  6077630  6297778 6200886 2839624 5920034  8346375 12400429  3102291  2805741 3017857  3721516
          65536    8192 2507579 2560605  4310133  4382773 4387460 2614552 4808179  2620085  4667823  1993431  2246141 1597033  1856971
          65536  16384 2408868 2459783  3001413  3016201 3009267 2446647 3001545  2415706  2999907  2464304  2465343 1207305  1535769
          131072      64 1157654 2332450  4954325  5053283 4904645 1629800 3904105  3613395  3746313  450891  597544 3870434  3969963
          131072    128 2658280 2727317  5129811  5321171 5285767 2055286 5130817  5380173  5103240  2217805  2333202 3846361  3935040
          131072    256 2664387 2744842  4863855  5993469 6003352 1956650 5605703  5666487  5672392  2369302  2350269 3938930  4018394
          131072    512 2717986 1991884  5692302  5850345 5888507 2370630 6042680  5751558  5931455  2406620  2374326 4092034  4179323
          131072    1024 2694395 2760278  5871840  6009455 6032402 2780493 5912382  8414955  6137804  2794414  2765930 4174151  4294202
          131072    2048 2723629 2761498  5944603  6043809 6040489 2784563 5894568  8524428  6122561  3006449  2120146 4071969  4189196
          131072    4096 2707039 2746749  5891410  6015504 6028235 2761498 5850096  8425272  5959425  2786624  2764067 3357024  3761900
          131072    8192 2456242 2523577  4293933  4192678 4215084 2530884 4214406  2688650  4830181  1939941  2217912 1692713  1841260
          131072  16384 2357708 2408433  2969200  3145779 3274269 1848821 2938506  2122766  2954996  2122111  2133533 1307153  1432682
 


When I connect this array to a VMware server (10Gbit CX4 connection), performance is horrible. Even with 3 SSDs as ZIL, performance is horrible. When I copy one large file of random data (10GB), the transfer starts at 250 mb/s but drops to 25 mb/s within 15 seconds... The rest of the transfer stays that slow.

Now I know about the problem with VMware and NFS, but iSCSI in this case makes no difference; even when I disable sync, transfers stay this slow. Running top and switching it to I/O mode shows nfsd running at 100%, doing about 2000 IOPS. When I try iSCSI the sync setting doesn't really matter: top shows 7k IOPS, but the transfer speed starts at 250 mb/s and drops to 90 after about 60 seconds, and stays there until the copy completes.
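(The I/O display referenced here is FreeBSD top's io mode; a one-liner, for anyone who wants to reproduce the measurement:)
Code:
  # per-process I/O statistics instead of CPU, sorted by total I/O
  top -m io -o total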

Now, using 10 SAS disks, I would think I should be able to get close to 1GB/s. The iozone test shows the system is capable of going way beyond that, yet I can't even sustain 250MB/s when copying a file within a Windows VM.

So in short, I have the feeling that the hardware should be able to go way faster, even without changing sync settings. Does anyone have any idea how I can improve this setup?

Thanks in advance.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Looking for more information here:
  1. What's your pool config? I'm assuming mirrored, but are you running RAIDZ? Remember that your pool config will affect your peak write speed.
  2. How are the LSI 8e's cabled to your JBOD?
  3. What model SSD are you using for SLOG? You can probably improve things by allocating less of each drive, to allow for wear-leveling.
There are also some known issues with the default txg size being too large, and with 128GB of RAM and only ten spindles in the pool (in an unknown config, see #1) I could see this contributing.
 

oehTie

Cadet
Joined
Feb 11, 2014
Messages
8
Hi HoneyBadger,

Thanks for your reply.

1. The pool is a 10-disk RAIDZ2, not mirrored or anything; all 10 disks are in one vdev. I'm still waiting for more disks - they were on backorder. When they're in, I'll add a second vdev of 10 disks (see the sketch after #3 below).

2. From both LSI 9207-8e HBAs, a cable goes to the JBOD, each to its own expander on the front backplane. The other connection from each HBA goes to the rear backplane. I don't have the cables to cascade the backplanes to complete the loop, but I could order those.

3. The SSDs are Samsung 840 Pro disks.
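(For reference, extending the pool with that second RAIDZ2 vdev is a one-liner from the shell - da10 through da19 are placeholder device names, and the FreeNAS volume manager can do the equivalent:)
Code:
  # add a second 10-disk RAIDZ2 vdev; ZFS then stripes across both vdevs
  zpool add ZFS raidz2 da10 da11 da12 da13 da14 da15 da16 da17 da18 da19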


I'll look into the default txg issue; I haven't come across it yet as far as I can recall... and I've read a lot already...
 

oehTie

Cadet
Joined
Feb 11, 2014
Messages
8
Code:
  pool: ZFS
state: ONLINE
  scan: none requested
config:
 
        NAME                                            STATE    READ WRITE CKSUM
        ZFS                                            ONLINE      0    0    0
          raidz2-0                                      ONLINE      0    0    0
            gptid/a4d51430-9193-11e3-a309-002590cb4e50  ONLINE      0    0    0
            gptid/a5a8bd10-9193-11e3-a309-002590cb4e50  ONLINE      0    0    0
            gptid/a67e7e98-9193-11e3-a309-002590cb4e50  ONLINE      0    0    0
            gptid/a758c926-9193-11e3-a309-002590cb4e50  ONLINE      0    0    0
            gptid/a829d38d-9193-11e3-a309-002590cb4e50  ONLINE      0    0    0
            gptid/a8fe5603-9193-11e3-a309-002590cb4e50  ONLINE      0    0    0
            gptid/a9ce09e8-9193-11e3-a309-002590cb4e50  ONLINE      0    0    0
            gptid/aaa65090-9193-11e3-a309-002590cb4e50  ONLINE      0    0    0
            gptid/ab7b6368-9193-11e3-a309-002590cb4e50  ONLINE      0    0    0
            gptid/ac53101a-9193-11e3-a309-002590cb4e50  ONLINE      0    0    0
        cache
          gptid/b603d78d-9338-11e3-a309-002590cb4e50    ONLINE      0    0    0
          gptid/b70f9b45-9338-11e3-a309-002590cb4e50    ONLINE      0    0    0
          gptid/b8122da3-9338-11e3-a309-002590cb4e50    ONLINE      0    0    0
 
errors: No known data errors


I've removed the ZIL drives for some testing, but perhaps this clarifies the pool setup.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
1. The write penalty on a RAID-Z2 is huge compared to a mirror (6 vs 2). Assuming you're using RE4s, which have a transfer rate of 170MB/s, 10 of them gives you 1700MB/s; divide by your write penalty of 6 and you get 283MB/s theoretical, which matches up pretty closely with your stated "starting burst speed" of 250MB/s. A theoretical 10-drive mirror would give you 1700/2 = 850MB/s; however, you'd only have 10TB usable space vs. the 16TB you get from RAID-Z2. Edit: I'm brainfried - that penalty applies to RAID-6. RAID-Z's copy-on-write means that you'll get a full stripe every time; IOPS will suck, but bandwidth will be largely unaffected. And since you're focusing on virtual machines here, you need speedy random IOPS, not big sequential bandwidth.

2. Assuming everything is negotiating at 6Gbps, you've got eight lanes across the two cables, which should be more than enough.

3. Good drives, but you probably have far too much allocated for ZIL - there's no way you'll need even a single full drive. I'm trying to remember the formula/rule of thumb for ZIL sizing, but I think it comes down to "three transaction groups": with your disks capped at 283MB/s and a 5-second txg interval, each txg will be roughly 1.4GB, and three of them makes 4.2GB. Thanks for the zpool status, which shows me that you've got three full 256GB disks dedicated to SLOG - so that's 768GB - 4.2GB = about 763.8GB of overkill. Edit: more busted math here - per my correction above, you're getting ~1.4GB/s sequential from the drives, so you want 1.4GB/s x 5 seconds x 3 txgs = 21GB of ZIL. So only 768 - 21 = 747GB of overkill.

Now normally I love overkill, but that's probably unnecessary. I'd say take two drives, cut a 24GB slice off each (per the corrected math above), let the wear-leveling algorithms on the SSDs have the rest to prolong the flash life, add the two slices as a log mirror, and use the other four drives to give yourself a solid 1TB of L2ARC (with 128GB of RAM you should be able to support that much).
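(A rough sketch of what that could look like from the shell - da10 through da15 are placeholder device names for the six SSDs, and on FreeNAS you'd normally do this through the GUI rather than by hand:)
Code:
  # give two SSDs a 24GB SLOG slice each; leave the rest of the flash
  # unpartitioned so the controller can use it for wear-leveling
  gpart create -s gpt da10
  gpart create -s gpt da11
  gpart add -t freebsd-zfs -s 24G da10
  gpart add -t freebsd-zfs -s 24G da11

  # mirrored SLOG from the two slices, four whole SSDs as L2ARC
  zpool add ZFS log mirror da10p1 da11p1
  zpool add ZFS cache da12 da13 da14 da15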

Point #1 I think is the critical one - with a 10Gbps connection you are definitely capable of outpacing your drives.

Here's a massive bug report and discussion from jgreco with regards to oversized txg's - he was running into the issue with 8GB of RAM and a slow pool. You have 128GB of RAM and a (relatively) slow pool.
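(For reference: on the FreeBSD 9.x base that FreeNAS 9.2.1 runs on, the txg flush interval and size can be capped with sysctls along these lines - the values are purely illustrative, not a recommendation:)
Code:
  # flush a transaction group at least every 5 seconds (the 9.x default)
  sysctl vfs.zfs.txg.timeout=5
  # cap dirty data per txg at 1GiB instead of letting it scale with RAM
  # (this tunable exists on FreeBSD 9.x; later releases replaced it with
  # a different write throttle)
  sysctl vfs.zfs.write_limit_override=1073741824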
 

oehTie

Cadet
Joined
Feb 11, 2014
Messages
8
Hey,

I'm aware that I have too much ZIL; I figured it wouldn't hurt to have too much, but it would hurt to have too little.

The write penalty is huge. I figured it would be in the range of 6 to 8 times the 170 mb/s per drive, or perhaps a little less, as we have plenty of CPU and RAM to do the parity and checksum math. I don't really understand why the speed goes down by a factor of 6. I'll test with a mirror pool tomorrow; losing some terabytes for a huge speed increase is something more spindles will have to fix in the end. With the planned 20 disks I'll have 20TB instead of the planned 29.28TB, but that will do for now. I bought the 45-bay JBOD so I could add disks in the future if the time came.

I'll test this out tomorrow. Thanks for your thoughts!
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Derp. I'm brainfried. ZFS's copy-on-write nature means that you should be writing a full stripe every time with a large sequential workload, so with 10 drives and 8 dedicated to data, you should in theory only lose the "bandwidth" of the 2 parity drives; the rest of it will be striped to the eight remaining disks at 8 x 170 = 1360MB/s.

The write penalty applies for small-block/random though, because each I/O has to wait for all drives to complete; you'll basically get the IOPS of the slowest drive in the pool, and this is the part that matters. You're running virtual machines, but benchmarking by copying a single monolithic file. They're completely different workloads.

Edit - That said, you've got an SSD SLOG. That should, in theory, catch the smaller random writes and collect them into transaction groups, which should then go out to the pool as sequential writes - and the pool would have the bandwidth to keep up, per my first paragraph - except that in your tests it isn't, and it's stalling out...

My head hurts. Does anyone else want to step in here?
 

jkh

Guest
For VMware, he needs to configure this setup as a RAID10. That means you want a pool consisting of vdevs which are each just a pair of drives in a mirror. Even if RAIDZ2 were somehow desired, 10 drives is too many to have in one vdev, since ZFS will wait for transactions to complete in the vdev as a whole, versus being able to have several I/Os outstanding to multiple vdevs in parallel. Sun always recommended two 5-drive vdevs (RAIDZ1 or Z2) when 10 drives were involved.

In any case, as I said, RAID10 is the way to go here for IOPS. There should be 5 vdevs, each vdev containing a mirrored pair of drives. If I had 3 SSDs to add to this and it was my box, I would configure one as a ZIL and the other two as a striped pair for L2ARC.
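(A sketch of that layout as raw zpool commands - the pool name and da* device names are placeholders, and on FreeNAS you'd build this through the volume manager:)
Code:
  # five two-way mirrors striped together ("RAID10")
  zpool create tank \
    mirror da0 da1 \
    mirror da2 da3 \
    mirror da4 da5 \
    mirror da6 da7 \
    mirror da8 da9

  # one SSD as a log device, two striped as L2ARC
  zpool add tank log da10
  zpool add tank cache da11 da12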
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Yay, confirmation! I'm not totally off my rocker today.

@oehTie also seems to have 6x SSD, 3 for L2ARC, 3 for SLOG - I suggested above he use 4 for L2ARC (1TB total) and a 32GB partition on each of the other two as a mirrored SLOG.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
This has been discussed to death, please avail yourself of the forum search features for lots of coverage about RAIDZ not being good for VM storage.
 

oehTie

Cadet
Joined
Feb 11, 2014
Messages
8
Hi Guys,

Currently trying things out with mirrors of two drives, striped across the mirrors - the RAID10 as suggested. It does indeed seem better than Z2. Also, iSCSI seems to perform better than NFS, regardless of the sync option. It's good to get a little more insight into ZFS performance with the various options. It makes sense that ZFS waits for a vdev to complete the write, causing IOPS to suck; I hadn't thought of that yet, other than getting an SSD to use as SLOG (but that didn't work out).

I figured that a ZIL would catch all the small writes and combine them into bigger writes to disk; it seems you guys confirm that. Is there a way to force ZFS to use bigger transaction groups, for example 256kB instead of the default 128kB? I can set a write block size for iSCSI, but 128k is the maximum (probably because of the ZFS write size).
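(For what it's worth, the 128k limit here is ZFS's dataset recordsize rather than the transaction group size - and 128kB is the largest recordsize ZFS of this era supports. It can be checked and changed per dataset; a sketch, assuming the iSCSI extent lives on the pool's root dataset:)
Code:
  # recordsize is a per-dataset property; 128K is both the default and
  # the maximum on FreeNAS 9.2-era ZFS
  zfs get recordsize ZFS
  zfs set recordsize=64K ZFS
  # note: a zvol-backed extent uses volblocksize instead, which is
  # fixed when the zvol is created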

Thanks for the replies!



ps, jgreco, that's too easy. I have tried searching, and I have found numerous topics indicating that NFS would be a problem for VMware, but that ZFS in itself shouldn't be a problem if a ZIL and a cache SSD are configured in combination with NFS. I decided to open a topic because adding these SSDs didn't help, and the zfs set sync=disabled option didn't help either. As the ZFS filesystem isn't filled up yet, there shouldn't be much fragmentation already. Now I know you are an expert in this field, as your post count and posts in other topics prove, and please, I'm not trying to offend anyone, but simply discouraging ZFS for VM storage is too easy... There are also numerous cases in which it worked fine, even if they are not on this forum; people don't tend to write about things that go well. I think I've countered a lot of issues by planning to increase the spindle count to 44 in the near future. 44 disks should be able to provide massive performance for VMs that don't read or write extremely much, except during boot. I won't be hosting 40 very heavy 10TB databases :) Wish I had those customers; I'd probably have bought a Whiptail then.
I'm starting with 10 because the next 10 were on backorder, but even with these 10, a single USB disk is faster... Putting two 4500rpm Green Power drives in RAID 0 was faster... I might as well install Windows 3.11 or CentOS and provide an NFS export. That would render 5 times the performance, but not the ability to lose a disk and not lose data. I really love that and all the other features ZFS provides (and I need a little more storage than a single USB stick).



I'll let you guys know how it all works out :) Thanks already!


with regards,

Theo
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Not-typing-on-the-cell-phone-today gripe: Please never write "mb/s" again. What the F does that mean? Megabytes per second? Megabits per second? Motherboards per socket?

NFS isn't a problem with VMware. You just have to do it right, not every combination of randomly chosen crap will result in success.

A ZIL isn't what you think it is. You mean a SLOG, I'd guess. You always have a ZIL. It's part of your pool. The ZIL is not a cache. It's a mechanism to ensure POSIX compliance. p.s. Read That Post.

If setting "sync=disabled" doesn't help, your problem isn't sync writes.

44 spindles probably isn't going to fix your problems if you can't make a smaller number work well.

For a VM environment, the cheat sheet is as follows:

1) Lots of RAM (you appear OK)

2) Enough L2ARC to hold the working set (you could be OK there too).

3) Sufficient time after boot for the L2ARC to warm up.

At this point you go off track.

4) Use mirror vdevs. You can have 2- or 3-way mirrors to meet survivability requirements - four vdevs of three disks each, maybe, or six of two, or whatever. RAIDZ will dramatically kill your performance in a VM environment.

5) Ensure your pool never exceeds 50% full, to reduce performance loss due to fragmentation. There is a convincing argument out there to push that number as low as 10% to maintain performance over time if you have a heavy random write environment. That's right, you might actually need 40TB of storage space to deliver 4TB of heavily used space (but I can pretty much promise that it'll fly!)

6) In a heavy write environment, you may wish to use something besides SSD for SLOG. I've been having fun abusing an LSI2208 with BBU and some hard drives, used exclusively for SLOG. Unlimited endurance, RAID1 device protection, but somewhat slower (limited to the speed of the hard drives). I'm relatively convinced that this is THE way to go for a NAS if money isn't a problem.

I have said a whole lot of times that ZFS is big and piggy. If you throw the resources needed to solve a problem at it, it'll be totally awesome. If you don't, it'll be anywhere from disappointing to epic fail.

I'm also going to suggest that you should be testing individual subsystems independently to identify any other issues. See if your network is actually capable of sustaining the throughput with iperf. See if you're getting proper throughput on all your drives with dd, individually and in aggregate.
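(A minimal sketch of those two checks - the hostname and device name are placeholders:)
Code:
  # network: run a server on the FreeNAS box, then push traffic to it
  iperf -s                       # on the FreeNAS box
  iperf -c freenas-host -t 30    # on the client

  # disk: raw sequential read from a single drive, bypassing ZFS
  dd if=/dev/da0 of=/dev/null bs=1M count=10240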

Your performance is never going to be stellar because of the ancient CPUs, but the numbers you are reporting seem way low.
 

zambanini

Patron
Joined
Sep 11, 2013
Messages
479
@oehTie Hi Theo, did you solve your problem?
Since old posts hold a lot of useful information, I would really love to see how you solved this issue.
 