No ZIL activity on iSCSI writes?

Status
Not open for further replies.

datnus

Contributor
Joined
Jan 25, 2013
Messages
102
Hi everyone,
I use FreeNAS 8.3 with an SSD as the ZIL (SLOG) device.
1) I have set the zvol to sync=standard:
Code:
zfs get sync Data/Data2a
NAME        PROPERTY  VALUE    SOURCE
Data/Data2a  sync      standard  local


2) However, I don't see any ZIL activity on iSCSI writes.
The log vdev (the "logs" mirror below) is almost static.

zpool iostat -v 1
Code:
                                          capacity    operations    bandwidth
pool                                    alloc  free  read  write  read  write
--------------------------------------  -----  -----  -----  -----  -----  -----
Data                                    2.83T  4.42T      8    497  71.9K  1.78M
  raidz2                                2.83T  4.42T      8    497  71.9K  1.78M
    gptid/5d286cbc-d12b-11e2-a991-002590acf452      -      -      5    262  12.0K  354K
    gptid/5daa47a2-d12b-11e2-a991-002590acf452      -      -      0    259      0  357K
    gptid/5e24dc3c-d12b-11e2-a991-002590acf452      -      -      0    259      0  357K
    gptid/5e9e6423-d12b-11e2-a991-002590acf452      -      -      4    259  21.0K  354K
    gptid/5f1fced4-d12b-11e2-a991-002590acf452      -      -      1    258  24.0K  346K
    gptid/5f949238-d12b-11e2-a991-002590acf452      -      -      4    258  21.0K  339K
    gptid/5fcda25d-d12b-11e2-a991-002590acf452      -      -      6    275  15.0K  360K
    gptid/5ff13459-d12b-11e2-a991-002590acf452      -      -      6    271  11.5K  359K
logs                                        -      -      -      -      -      -
  mirror                                  36K  31.7G      0      0      0      0
    gptid/324049f9-d12d-11e2-a991-002590acf452      -      -      0      0      0      0
    gptid/74ddf668-d12d-11e2-a991-002590acf452      -      -      0      0      0      0
cache                                      -      -      -      -      -      -
  gpt/cache0                            116G  7.98M    57      0  496K      0
  gpt/cache1                            116G  6.91M    48    20  432K  1.09M
--------------------------------------  -----  -----  -----  -----  -----  -----


3) gstat shows a different picture.
The SSD ZIL looks slow, doing only 219 IOPS?

Code:
dT: 1.001s  w: 1.000s
L(q)  ops/s    r/s  kBps  ms/r    w/s  kBps  ms/w  %busy Name
    0      7      7    56    0.3      0      0    0.0    0.2| mfid0
  10    219      2      1    9.6    215    415  32.2  85.8| mfid1  => my SSD ZIL
    0      0      0      0    0.0      0      0    0.0    0.0| mfid0p1
    0      7      7    56    0.3      0      0    0.0    0.2| mfid0p2
    0      0      0      0    0.0      0      0    0.0    0.0| mfid1p1
  10    219      2      1    9.6    215    415  32.2  85.9| mfid1p2
  10    237      2      1    7.7    233    427  30.8  89.5| mfid2
  10    224      2      1    9.7    220    356  31.2  90.6| mfid3
  10    223      1      0    4.4    220    358  30.0  85.4| mfid4
  10    230      0      0    0.0    228    420  32.3  97.7| mfid5
  10    212      0      0    0.0    210    284  32.1  87.0| mfid6


Could you please help me get the most out of the SSD ZIL?
The SSD should be able to deliver 20,000-30,000 IOPS, right?
Thanks so much.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
First, you need to stop looking at raw numbers and assuming that you MUST have more. 219 ops at 4k apiece is 876k, and at 8k it's a lot more, all within that 1 second. That might be all that needed to be committed to the ZIL. Remember that not everything is written to the ZIL and then to the pool. ZFS smartly chooses when to put stuff in the ZIL to increase performance and when to flush straight to the pool.

For example, if you were to copy an 8GB file across CIFS, the first 7.9GB or so would NOT be sync writes and may not be put in the ZIL at all. ZFS picks and chooses what to put in the ZIL. Very large quantities of data that are going to land as one large transaction are NOT put in the ZIL (even some sync writes). The ZIL is designed to absorb tons of small writes, and if you were to fill it up with a couple of large transactions that could have been committed straight to the zpool, the small writes you actually want in the ZIL might not have space.

You also provided no hardware specs at all. Those might give some insight into whether something is wrong. My guess is everything is working just as it should.
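
If you want to prove to yourself that the SLOG is functional, one option is to temporarily force every write on the zvol through the ZIL and watch the log vdev. A minimal sketch, using the Data/Data2a zvol from the first post:
Code:
# Force all writes on the zvol to be treated as sync writes (for testing only):
zfs set sync=always Data/Data2a

# In a second session, watch the "logs" mirror; it should now show write ops:
zpool iostat -v Data 1

# Put it back when you are done testing:
zfs set sync=standard Data/Data2a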
 

datnus

Contributor
Joined
Jan 25, 2013
Messages
102
Hi cyberjock,
Thanks so much for your kind help so far. I appreciate it :)
I have learnt a lot from you and the FreeNAS community.
--------------
BTW, I have 2 SSDs and 8 HDDs. The server has 16 GB RAM.
Each of the two SSDs is partitioned in two.
The ZIL (SLOG) is a mirror of two of the partitions (32 GB).
The cache (L2ARC) is a stripe of the two remaining partitions (130 GB).
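
Roughly, the equivalent command-line setup would look like the sketch below (device and SLOG label names here are placeholders, not the actual ones on this box; the cache labels match what shows up in zpool iostat):
Code:
# One SSD (placeholder name da1); the second SSD is partitioned the same way.
gpart create -s gpt da1
gpart add -t freebsd-zfs -s 32G -l slog0 da1   # SLOG partition
gpart add -t freebsd-zfs -l cache0 da1         # rest of the SSD for L2ARC

# Mirror the two SLOG partitions, stripe the two cache partitions:
zpool add Data log mirror gpt/slog0 gpt/slog1
zpool add Data cache gpt/cache0 gpt/cache1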

The following output is from when the server is under light load.
1. top
Code:
~# top
last pid: 71733;  load averages:  0.01,  0.07,  0.11  up 36+11:32:51  11:51:17
28 processes:  1 running, 27 sleeping
CPU:  0.0% user,  0.0% nice,  6.4% system,  0.0% interrupt, 93.6% idle
Mem: 15M Active, 171M Inact, 14G Wired, 55M Cache, 164M Buf, 802M Free
ARC: 9481M Total, 2529M MFU, 2617M MRU, 336K Anon, 4511M Header, 106M Other
Swap: 18G Total, 182M Used, 18G Free
 
  PID USERNAME    THR PRI NICE  SIZE    RES STATE  C  TIME  WCPU COMMAND
1854 root        12  44    0  210M 74072K ucond  1 543:20  0.00% istgt
2181 root          7  44    0 72608K  4132K ucond  0  13:46  0.00% collectd
2059 root          6  44    0  247M 43404K uwait  0  5:17  0.00% python
1777 root          1  44    0 11672K  832K select  0  2:00  0.00% ntpd
51000 root          2  44    0 54440K  2984K select  0  0:14  0.00% python
2301 root          1  76    0  7840K  676K nanslp  0  0:12  0.00% cron
1528 root          1  44    0  6784K  592K select  1  0:10  0.00% syslogd
2501 root          1  44    0  7972K  548K select  0  0:04  0.00% rpcbind
2601 root          1  50    0 90876K  2600K wait    0  0:01  0.00% python
63608 www          1  44    0 14408K  1052K kqread  0  0:00  0.00% nginx
2118 root          1  44    0 14408K  908K pause  0  0:00  0.00% nginx
71703 root          1  44    0 33220K  3924K select  2  0:00  0.00% sshd
65667 root          1  44    0 10216K  584K ttyin  1  0:00  0.00% csh
71705 root          1  64    0 10216K  2392K ttyin  1  0:00  0.00% csh
71731 root          1  44    0 10216K  2376K pause  0  0:00  0.00% csh


2. arc_summary
Code:
 ~# arc_summary.py
System Memory:
 
        0.14%  21.53  MiB Active,    1.11%  174.63  MiB Inact
        93.39%  14.41  GiB Wired,      0.35%  55.29  MiB Cache
        5.01%  792.18  MiB Free,      0.00%  796.00  KiB Gap
 
        Real Installed:                        16.00  GiB
        Real Available:                99.74%  15.96  GiB
        Real Managed:                  96.69%  15.43  GiB
 
        Logical Total:                          16.00  GiB
        Logical Used:                  93.76%  15.00  GiB
        Logical Free:                  6.24%  1022.10 MiB
 
Kernel Memory:                                  6.19    GiB
        Data:                          99.74%  6.17    GiB
        Text:                          0.26%  16.26  MiB
 
Kernel Memory Map:                              8.00    GiB
        Size:                          63.43%  5.07    GiB
        Free:                          36.57%  2.93    GiB
                                                                Page:  1
------------------------------------------------------------------------
 
ARC Summary: (HEALTHY)
        Storage pool Version:                  28
        Filesystem Version:                    5
        Memory Throttle Count:                  0
 
ARC Misc:
        Deleted:                                492.39m
        Recycle Misses:                        91.21m
        Mutex Misses:                          1.29m
        Evict Skips:                            1.29m
 
ARC Size:                              93.81%  9.26    GiB
        Target Size: (Adaptive)        93.87%  9.26    GiB
        Min Size (Hard Limit):          12.50%  1.23    GiB
        Max Size (High Water):          8:1    9.87    GiB
 
ARC Size Breakdown:
        Recently Used Cache Size:      74.47%  6.90    GiB
        Frequently Used Cache Size:    25.53%  2.36    GiB
 
ARC Hash Breakdown:
        Elements Max:                          30.81m
        Elements Current:              65.65%  20.22m
        Collisions:                            1.46b
        Chain Max:                              186
        Chains:                                262.14k
                                                                Page:  2
------------------------------------------------------------------------
 
ARC Efficiency:                                2.28b
        Cache Hit Ratio:                93.72%  2.13b
        Cache Miss Ratio:              6.28%  142.97m
        Actual Hit Ratio:              92.52%  2.11b
 
        Data Demand Efficiency:        97.11%  2.08b
        Data Prefetch Efficiency:      32.85%  111.26m
 
        CACHE HITS BY CACHE LIST:
          Anonymously Used:            0.55%  11.81m
          Most Recently Used:          41.13%  877.95m
          Most Frequently Used:        57.59%  1.23b
          Most Recently Used Ghost:    0.34%  7.18m
          Most Frequently Used Ghost:  0.39%  8.37m
 
        CACHE HITS BY DATA TYPE:
          Demand Data:                  94.59%  2.02b
          Prefetch Data:                1.71%  36.55m
          Demand Metadata:              3.70%  78.97m
          Prefetch Metadata:            0.00%  640
 
        CACHE MISSES BY DATA TYPE:
          Demand Data:                  42.00%  60.05m
          Prefetch Data:                52.25%  74.71m
          Demand Metadata:              5.74%  8.21m
          Prefetch Metadata:            0.00%  4.19k
                                                                Page:  3
------------------------------------------------------------------------
 
L2 ARC Summary: (HEALTHY)
        Passed Headroom:                        90.96m
        Tried Lock Failures:                    112.92m
        IO In Progress:                        2.02k
        Low Memory Aborts:                      290
        Free on Write:                          290.49k
        Writes While Full:                      246.25k
        R/W Clashes:                            5.38k
        Bad Checksums:                          0
        IO Errors:                              0
        SPA Mismatch:                          0
 
L2 ARC Size: (Adaptive)                        148.75  GiB
        Header Size:                    2.69%  4.00    GiB
 
L2 ARC Evicts:
        Lock Retries:                          61.38k
        Upon Reading:                          18
 
L2 ARC Breakdown:                              142.97m
        Hit Ratio:                      36.34%  51.95m
        Miss Ratio:                    63.66%  91.02m
        Feeds:                                  3.42m
 
L2 ARC Buffer:
        Bytes Scanned:                          1.44    PiB
        Buffer Iterations:                      3.42m
        List Iterations:                        217.21m
        NULL List Iterations:                  65.59m
 
L2 ARC Writes:
        Writes Sent:                    100.00% 2.71m
                                                                Page:  4
------------------------------------------------------------------------
 
File-Level Prefetch: (HEALTHY)
 
DMU Efficiency:                                1.04b
        Hit Ratio:                      93.74%  973.76m
        Miss Ratio:                    6.26%  65.04m
 
        Colinear:                              65.04m
          Hit Ratio:                    0.10%  65.25k
          Miss Ratio:                  99.90%  64.97m
 
        Stride:                                938.00m
          Hit Ratio:                    99.99%  937.87m
          Miss Ratio:                  0.01%  131.15k
 
DMU Misc:
        Reclaim:                                64.97m
          Successes:                    4.28%  2.78m
          Failures:                    95.72%  62.19m
 
        Streams:                                35.95m
          +Resets:                      0.15%  55.59k
          -Resets:                      99.85%  35.89m
          Bogus:                                0
                                                                Page:  5
------------------------------------------------------------------------
 
                                                                Page:  6
------------------------------------------------------------------------
 
ZFS Tunable (sysctl):
        kern.maxusers                          384
        vm.kmem_size                            11773683404
        vm.kmem_size_scale                      1
        vm.kmem_size_min                        0
        vm.kmem_size_max                        14717104256
        vfs.zfs.l2c_only_size                  151942420992
        vfs.zfs.mfu_ghost_data_lsize            2063646720
        vfs.zfs.mfu_ghost_metadata_lsize        668780544
        vfs.zfs.mfu_ghost_size                  2732427264
        vfs.zfs.mfu_data_lsize                  2421637120
        vfs.zfs.mfu_metadata_lsize              0
        vfs.zfs.mfu_size                        2651654144
        vfs.zfs.mru_ghost_data_lsize            823885824
        vfs.zfs.mru_ghost_metadata_lsize        6379898368
        vfs.zfs.mru_ghost_size                  7203784192
        vfs.zfs.mru_data_lsize                  2552217600
        vfs.zfs.mru_metadata_lsize              6561792
        vfs.zfs.mru_size                        2742906880
        vfs.zfs.anon_data_lsize                0
        vfs.zfs.anon_metadata_lsize            0
        vfs.zfs.anon_size                      260096
        vfs.zfs.l2arc_norw                      1
        vfs.zfs.l2arc_feed_again                1
        vfs.zfs.l2arc_noprefetch                1
        vfs.zfs.l2arc_feed_min_ms              200
        vfs.zfs.l2arc_feed_secs                1
        vfs.zfs.l2arc_headroom                  2
        vfs.zfs.l2arc_write_boost              8388608
        vfs.zfs.l2arc_write_max                8388608
        vfs.zfs.arc_meta_limit                  2649078765
        vfs.zfs.arc_meta_used                  4966219888
        vfs.zfs.arc_min                        1324539382
        vfs.zfs.arc_max                        10596315063
        vfs.zfs.dedup.prefetch                  1
        vfs.zfs.mdcomp_disable                  0
        vfs.zfs.write_limit_override            0
        vfs.zfs.write_limit_inflated            51403763712
        vfs.zfs.write_limit_max                2141823488
        vfs.zfs.write_limit_min                33554432
        vfs.zfs.write_limit_shift              3
        vfs.zfs.no_write_throttle              0
        vfs.zfs.zfetch.array_rd_sz              1048576
        vfs.zfs.zfetch.block_cap                256
        vfs.zfs.zfetch.min_sec_reap            2
        vfs.zfs.zfetch.max_streams              8
        vfs.zfs.prefetch_disable                0
        vfs.zfs.mg_alloc_failures              8
        vfs.zfs.check_hostid                    1
        vfs.zfs.recover                        0
        vfs.zfs.txg.synctime_ms                1000
        vfs.zfs.txg.timeout                    5
        vfs.zfs.scrub_limit                    10
        vfs.zfs.vdev.cache.bshift              16
        vfs.zfs.vdev.cache.size                0
        vfs.zfs.vdev.cache.max                  16384
        vfs.zfs.vdev.write_gap_limit            4096
        vfs.zfs.vdev.read_gap_limit            32768
        vfs.zfs.vdev.aggregation_limit          131072
        vfs.zfs.vdev.ramp_rate                  2
        vfs.zfs.vdev.time_shift                6
        vfs.zfs.vdev.min_pending                4
        vfs.zfs.vdev.max_pending                10
        vfs.zfs.vdev.bio_flush_disable          0
        vfs.zfs.cache_flush_disable            0
        vfs.zfs.zil_replay_disable              0
        vfs.zfs.zio.use_uma                    0
        vfs.zfs.version.zpl                    5
        vfs.zfs.version.spa                    28
        vfs.zfs.version.acl                    1
        vfs.zfs.debug                          0
        vfs.zfs.super_owner                    0
                                                                Page:  7
------------------------------------------------------------------------
 

datnus

Contributor
Joined
Jan 25, 2013
Messages
102
3. zpool iostat 1
When the server is under load, there are a lot of writes at around 3 MB per second.
Could I consider them "small" or "random" writes?

I use VMware connected to FreeNAS over iSCSI, so the writes land in a VMware VMDK file.
Hence, I believe ZFS should treat them as "random" or "small" writes.

Code:
 ~# zpool iostat 1
              capacity    operations    bandwidth
pool        alloc  free  read  write  read  write
----------  -----  -----  -----  -----  -----  -----
Data        2.83T  4.42T    28    512  219K  2.74M
Data        2.83T  4.42T      0      0      0      0
Data        2.83T  4.42T      0  2.30K  7.99K  8.13M
Data        2.83T  4.42T      0      0  4.50K      0
Data        2.83T  4.42T    276      0  2.16M      0
Data        2.83T  4.42T      0      0  2.00K      0
Data        2.83T  4.42T    364      0  2.85M      0
Data        2.83T  4.42T      0  2.01K      0  7.01M
Data        2.83T  4.42T      0      0      0      0
Data        2.83T  4.42T      0      0      0      0
Data        2.83T  4.42T      0      0      0      0
Data        2.83T  4.42T      1      0  4.50K      0
Data        2.83T  4.42T      0  2.32K      0  9.06M
Data        2.83T  4.42T      0    36      0  454K
Data        2.83T  4.42T      0      0      0      0
Data        2.83T  4.42T    22      0  86.4K      0
 

datnus

Contributor
Joined
Jan 25, 2013
Messages
102
When I tested IOPS using IOMeter in a VM backed by this FreeNAS server, the ZIL hit about 1,000 IOPS and IOMeter reported around 18,000 IOPS.
Reads seem to be fast, as they are mostly served from the L2ARC cache.

Is there any way I can take advantage of the fast write IOPS of the SSD ZIL?

Thanks so much.
 

datnus

Contributor
Joined
Jan 25, 2013
Messages
102
OK, I'll answer myself: ESXi issues its iSCSI writes as async, so with sync=standard the writes won't go to the ZIL.
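
So if I want ESXi's writes protected (and the SSD SLOG actually used), the option seems to be forcing sync semantics on the zvol, at the cost of some write latency; something like:
Code:
# Treat every write to the iSCSI zvol as a sync write, so it goes through the SLOG:
zfs set sync=always Data/Data2a

With sync=standard only writes the initiator flags as synchronous hit the ZIL; with sync=always everything does.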
 

pbucher

Contributor
Joined
Oct 15, 2012
Messages
180
You got it.

Which is why some folks think iSCSI is a cure-all for ESXi. It's faster because it's doing async writes, which means your ESXi data isn't being safeguarded on non-volatile storage the way it is with NFS's default sync writes.
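
If you do flip the zvol to sync=always to get that protection back, the difference shows up right away in the tools from the first post; a quick sketch of what to watch:
Code:
# Watch the SLOG partition (mfid1p2 in the earlier gstat output); its w/s and %busy
# columns should jump as soon as sync writes start landing on it:
gstat

# And the "logs" mirror should show non-zero write ops and bandwidth:
zpool iostat -v Data 1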
 