How to view effectiveness of vdev

mervincm

Contributor
Joined
Mar 21, 2014
Messages
157
How can I determine how effective the write caching added by the mirrored NVMe log is, or the metadata vdev on the mirrored SSDs?


I created the pool like this:
[screenshot: pool layout]


The Optane NVMe log mirrored pair does not seem to be used when streaming large files to a share on the hddpool.
[screenshot: log vdev activity]



The metadata mirrored pair sees some activity.
[screenshot: metadata vdev activity]


The 6 data disks in the vdev seem busy
[screenshot: data disk activity]


The network shows fairly consistent performance, if not particularly amazing speed.
[screenshot: network throughput]
 


sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
zpool list -v will give you the usage of the special VDEV

zpool iostat -v hddpool while doing some activity will show you what's happening.
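For example (a sketch; hddpool is the pool name from your screenshots, and the trailing 5 makes iostat refresh continuously):
Code:
zpool list -v hddpool      # capacity and allocation per vdev, including special and log
zpool iostat -v hddpool 5  # per-vdev operations and bandwidth, repeating every 5 seconds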

The Optane NVMe log mirrored pair does not seem to be used when streaming large files to a share on the hddpool.
You should really understand more about what SLOG is and why you would use it before expecting to see anything "improved" by using it.

Sync writes are the only things that will be "helped" by SLOG. Apparently your large file streaming is async.
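You can confirm that from the dataset's sync property (with the default, sync=standard, only writes the client explicitly requests as synchronous touch the ZIL):
Code:
zfs get -r sync hddpool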

The typical use-case for SLOG is when you must do sync writes and you don't want to lose anything in the event of a power cut/failure.

Since you're using RAIDZ1, you're clearly not trying to do block storage (at least not the right way), so you probably don't need a SLOG.
 

mervincm

Contributor
Joined
Mar 21, 2014
Messages
157
Thanks for your advice. I hear you that there could be an issue with my understanding, but it still seems like something is not working correctly. Despite what must be 50-100 TB of reads / writes, I have 0.00% of my log vdev used. Am I wrong to expect it to be used even if I didn't benefit from its existence?


Code:
NAME                                       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ                        FRAG    CAP  DEDUP    HEALTH  ALTROOT
hddpool                                   76.8T  39.8T  37.0T        -         -                          0%    51%  1.00x    ONLINE  /mnt
  raidz1                                  76.4T  39.8T  36.6T        -         -                          0%  52.1%      -    ONLINE
    ced853bc-6a94-4f93-a41e-c616d1f42ff9      -      -      -        -         -                           -      -      -    ONLINE
    da87c3a8-566d-4ccb-a765-7f81f0efeecc      -      -      -        -         -                           -      -      -    ONLINE
    ddf2d446-63bb-4b54-8539-ed4312724f7a      -      -      -        -         -                           -      -      -    ONLINE
    422c04e7-31d3-43bb-89b6-ec33fd90a193      -      -      -        -         -                           -      -      -    ONLINE
    d9514c2c-e371-4be2-922e-591181609e4a      -      -      -        -         -                           -      -      -    ONLINE
    ca7c8aed-7935-43e3-bbb3-055f1bd37304      -      -      -        -         -                           -      -      -    ONLINE
special                                       -      -      -        -         -                           -      -      -  -
  mirror                                   464G  22.9G   441G        -         -                          2%  4.93%      -    ONLINE
    2b572e30-e228-46ca-a718-268257bbad4e      -      -      -        -         -                           -      -      -    ONLINE
    49621eb9-e03a-4383-a633-220534931866      -      -      -        -         -                           -      -      -    ONLINE
logs                                          -      -      -        -         -                           -      -      -  -
  mirror                                   260G      0   260G        -         -                          0%  0.00%      -    ONLINE
    02ae475f-5a53-4523-b8ba-6a422fe6d90c      -      -      -        -         -                           -      -      -    ONLINE
    869e8a58-2c00-42d2-b6f4-4dd12b8ca452      -      -      -        -         -                           -      -      -    ONLINE


Here is the zpool iostat -v hddpool (while writing a few TB to the SMB share)
Code:
                                          capacity      operations    bandwidth
pool                                      alloc   free   read  write   read  write
----------------------------------------  -----  -----  -----  -----  -----  -----
hddpool                                   40.0T  36.9T      0      7  1.27K  2.05M
  raidz1                                  40.0T  36.4T      0      7     45  2.04M
    ced853bc-6a94-4f93-a41e-c616d1f42ff9      -      -      0      1      7   348K
    da87c3a8-566d-4ccb-a765-7f81f0efeecc      -      -      0      1      7   348K
    ddf2d446-63bb-4b54-8539-ed4312724f7a      -      -      0      1      7   348K
    422c04e7-31d3-43bb-89b6-ec33fd90a193      -      -      0      1      7   347K
    d9514c2c-e371-4be2-922e-591181609e4a      -      -      0      1      7   348K
    ca7c8aed-7935-43e3-bbb3-055f1bd37304      -      -      0      1      7   348K
special                                       -      -      -      -      -      -
  mirror                                  23.0G   441G      0      0  1.21K  7.59K
    2b572e30-e228-46ca-a718-268257bbad4e      -      -      0      0    615  3.80K
    49621eb9-e03a-4383-a633-220534931866      -      -      0      0    624  3.80K
logs                                          -      -      -      -      -      -
  mirror                                      0   260G      0      0     11     22
    02ae475f-5a53-4523-b8ba-6a422fe6d90c      -      -      0      0      5     11
    869e8a58-2c00-42d2-b6f4-4dd12b8ca452      -      -      0      0      5     11
----------------------------------------  -----  -----  -----  -----  -----  -----
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
SLOG only needs to maintain a few transaction groups worth of data. You will never have a lot of data in it, never more than fifteen seconds worth.
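That window comes from the OpenZFS transaction group timeout; on SCALE you can read the module parameter directly (stock default is 5 seconds):
Code:
cat /sys/module/zfs/parameters/zfs_txg_timeout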
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Here is the zpool iostat -v hddpool (while writing a few TB to the SMB share)
SMB does not do synchronous writes so your SLOG device is not used for SMB sharing.
An SLOG is not a write cache. An SLOG is not a write cache. An SLOG ...
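If you just want proof that the log vdev works, you can temporarily force sync writes on a scratch dataset (hddpool/test is only an example name; set it back afterwards, since this slows writes down):
Code:
zfs set sync=always hddpool/test     # every write now goes through the ZIL/SLOG
zpool iostat -v hddpool 5            # watch the log mirror while copying a file
zfs set sync=standard hddpool/test   # restore the default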
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
SLOG only needs to maintain a few transaction groups worth of data. You will never have a lot of data in it, never more than fifteen seconds worth.
I thought it was 5 seconds, by default
 

mervincm

Contributor
Joined
Mar 21, 2014
Messages
157
SLOG only needs to maintain a few transaction groups worth of data. You will never have a lot of data in it, never more than fifteen seconds worth.
I read that in your 2013 article: very short data lifespan in the log volume. I understand it doesn't build up over time; you explained that pretty well. I guess I just thought I would see some evidence of it being used.

If 0.00% is what I will always see, then that is fine; I just need to find something that shows it is correctly configured and can be used. That was really why I created this post to begin with.
 

mervincm

Contributor
Joined
Mar 21, 2014
Messages
157
SMB does not do synchronous writes so your SLOG device is not used for SMB sharing.
An SLOG is not a write cache. An SLOG is not a write cache. An SLOG ...
Thank you for correcting my post. I was wrong when I said write caching; I now understand I should have said write enhancing. If I got it right, the write performance will be enhanced by diverting ZIL writes from the HDDs to the Optane volume. This frees up the spindles, leading to overall better write performance.

Also thank you for the clarification that SMB does not do synchronous writes.
I had not yet picked up on that fact. If you add that to this line from the Ars Technica article ("Adding a LOG vdev to a pool absolutely cannot and will not directly improve asynchronous write performance"), it is pretty clear that I am not getting the write benefit I hoped for.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
For your use case, having the SLOG is a waste of time; you might as well remove it. It's doing nothing. In general, only NFS or iSCSI traffic needs/wants/uses sync writes. Just about everything else is async, where the SLOG has no purpose.
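Log vdevs can be removed from a live pool. Roughly (mirror-2 is a guess at the name; check zpool status for the real one):
Code:
zpool status hddpool            # note the name of the log mirror, e.g. mirror-2
zpool remove hddpool mirror-2   # detaches the log vdev with no downtime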
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
If I got it right, the write performance will be enhanced by diverting ZIL writes from the HDDs to the Optane volume. This frees up the spindles, leading to overall better write performance.
Yes. But only for synchronous writes. Which SMB sharing simply never does.

SLOG only enhances things like iSCSI block storage. Your SMB writes simply end up in memory and will eventually be written out to the HDDs, never touching the SLOG.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Thank you for correcting my post. I was wrong when I said write caching; I now understand I should have said write enhancing. If I got it right, the write performance will be enhanced by diverting ZIL writes from the HDDs to the Optane volume. This frees up the spindles, leading to overall better write performance.
Nope - it's only in the case of sync writes: iSCSI or NFS. Some Mac traffic too, I believe - but as I don't use Macs, I am unsure.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Oh, and just in case you aren't confused enough: sync writes are slow. Sync writes to a SLOG (a properly specified SLOG) are faster, but async writes are faster still. A SLOG is for data security, not necessarily speed - it just makes safe faster, but not as fast as unsafe.
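If you want to see the gap yourself, a rough sketch with fio (directory and sizes are placeholders; point it at a scratch dataset):
Code:
# buffered (async) writes, flushed once at the end
fio --name=async --directory=/mnt/hddpool/test --rw=write --bs=1M --size=4G --end_fsync=1
# every write issued O_SYNC - the case a SLOG actually accelerates
fio --name=sync --directory=/mnt/hddpool/test --rw=write --bs=1M --size=4G --sync=1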
 

mervincm

Contributor
Joined
Mar 21, 2014
Messages
157
Oh, and just in case you aren't confused enough: sync writes are slow. Sync writes to a SLOG (a properly specified SLOG) are faster, but async writes are faster still. A SLOG is for data security, not necessarily speed - it just makes safe faster, but not as fast as unsafe.

I appreciate the feedback. It's been quite a journey, to be honest :) I tried an earlier beta version of SCALE, and at that time I made an HDD-only vdev. While I could write to it (via SMB) at an OK rate (350-550 MB/sec), it was not as good as what I was used to / hoped to achieve. I also noticed that things like searching a volume for *.mkv, then sorting by the title (not filename) metadata field, were much slower.
When I tried the current release, I hoped to improve the first via a log vdev, and the second via a special metadata vdev. Obviously, I didn't do enough research ... but sometimes you just need to try stuff :)
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
The metadata vdev makes perfect sense specifically for SMB file sharing. Just the SLOG doesn't.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Now, the special vdev will be much more useful to you. By default it stores metadata only, but by tuning a dataset's settings you can also put small files onto the vdev. If (as it should be) the vdev is made of smaller, faster drives, then this can make the whole pool feel snappier when searching and even return some files (small files) faster.
Warning: the special vdev is pool-critical. If it fails, say bye-bye to your whole pool. I assume you are using your 2*500GB Micron SSDs?
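The per-dataset knob for small files is special_small_blocks (the dataset name and the 32K threshold are only examples, and it affects newly written blocks only):
Code:
zfs set special_small_blocks=32K hddpool/media   # blocks of 32K or less land on the special vdev
zfs get -r special_small_blocks hddpool          # verify the setting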
 

mervincm

Contributor
Joined
Mar 21, 2014
Messages
157
Now, the special vdev will be much more useful to you. By default it stores metadata only, but by tuning a dataset's settings you can also put small files onto the vdev. If (as it should be) the vdev is made of smaller, faster drives, then this can make the whole pool feel snappier when searching and even return some files (small files) faster.
Warning: the special vdev is pool-critical. If it fails, say bye-bye to your whole pool. I assume you are using your 2*500GB Micron SSDs?

Yes, for now it is on those disks in a mirror. I am not sure what I will use eventually, but for this test box, that's what I had. They are not the highest performing, nor do they likely have the power-loss features I would want long term. If I can see that their performance will hold me back, I might be able to justify NVMe.

I am evaluating whether SCALE is right for me. Nothing seems a perfect fit, so

PS: I don't have to be paranoid about my data; I keep three separate copies of it. I use redundancy to avoid the downtime associated with restores, plus any error-detection/repair benefits that come along with it.
 

mervincm

Contributor
Joined
Mar 21, 2014
Messages
157
Yikes. The GUI refers to the log vdev as a write cache.
[screenshot: GUI pool manager describing the log vdev as a write cache]
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
It does, and it's sort of wrong - it's very difficult to describe a SLOG in what is effectively a sound bite. The first sentence is correct; the second is iffy, although it can be removed.

BTW - you don't need PLP for a special vdev; normal drives are fine - they are SSDs in an HDD pool. You do need PLP for your SLOG (which you don't need).
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
BTW - on the subject of performance: you have a RAIDZ1 vdev, which has the IOPS of a single disk. However, because of the way TrueNAS caches write transactions in memory, it can do better when streaming data. Try configuring mirrors and see if the performance is any better.
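For example, a scratch pool of three mirrored pairs looks like this (device names are placeholders):
Code:
zpool create testpool mirror sda sdb mirror sdc sdd mirror sde sdf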
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I thought it was 5 seconds, by default

A transaction group is five seconds. But you have the transaction group that is currently being built (in RAM) and the transaction group that is being committed to the pool, which is two transaction groups. I seem to recall discussion suggesting that the block free and TRIM activity is only done after a full txg commit, so that's three transaction groups' worth of storage. Am I forgetting something? It's too early and I'm not sufficiently caffeinated. I would expect a "handful" of txgs' worth of data to be hanging around the SLOG. Someone will come in and correct me on some technical gotcha, but the overall point is correct.
 