How to view ZFS I/O scheduler outstanding I/Os by class?

scurrier

ZFS has an I/O scheduler that dispatches I/O requests to leaf vdevs based on the class of filesystem I/O being satisfied. Each I/O falls into one of the following classes, prioritized in this order:
sync read
sync write
async read
async write
scrub/resilver

Each class has corresponding min and max numbers of active outstanding requests that the scheduler tries to keep issued to each vdev.

Is there any diagnostic tool for viewing the number of outstanding requests in each of these classes on a running system?

-- For Posterity --
This scheme is described more fully in the block comment at the top of vdev_queue.c in the ZFS source.

These tunables control the balance of scheduling:
vfs.zfs.vdev.max_active
vfs.zfs.vdev.sync_read_min_active
vfs.zfs.vdev.sync_read_max_active
vfs.zfs.vdev.sync_write_min_active
vfs.zfs.vdev.sync_write_max_active
vfs.zfs.vdev.async_read_min_active
vfs.zfs.vdev.async_read_max_active
vfs.zfs.vdev.async_write_min_active
vfs.zfs.vdev.async_write_max_active
vfs.zfs.vdev.scrub_min_active
vfs.zfs.vdev.scrub_max_active
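
These are ordinary sysctls on FreeBSD/FreeNAS, so the current (default) values can be checked from a shell. A rough sketch; it assumes the sysctls are writable at runtime on your build, otherwise set them as tunables and reboot:

# Show the whole vdev queue subtree, including every per-class min/max pair
# (vfs.zfs.vdev.max_active is the overall per-vdev cap across all classes)
sysctl vfs.zfs.vdev | grep active

# Read a single tunable
sysctl vfs.zfs.vdev.async_write_max_active

# Bump a limit at runtime, if the sysctl is writable on your build
sysctl vfs.zfs.vdev.async_write_max_active=20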
 

Ericloewe

I'm sure this information is accessible in a barely-readable format, but I don't know if there are scripts that pretty-print it. I'll ask around and see if anyone knows.
 

scurrier

I think this info would be super useful to anyone troubleshooting performance. Throughput is bad? Oh look, queue depth is very low on that IO class. Latency is bad? Oh look, queue depth is too high on that IO class.
 

Ericloewe

Haven't had much luck. The best option we've come up with would be to use dtrace. You may be able to use a new utility, dwatch, to make dtrace easier to use, but that's not in FreeNAS yet.

By the way, default queue depth is apparently 10. It's a tunable somewhere.
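
Something like the following might be a starting point (untested). It leans on several assumptions: that the fbt provider can see vdev_queue_io() from vdev_queue.c, that CTF type info is loaded so args[0] resolves to a zio_t, and that the io_priority enum follows the order in the first post (0 = sync read, 1 = sync write, 2 = async read, 3 = async write, 4 = scrub). Note it counts I/Os entering the vdev queue per interval, not the instantaneous queue depth:

# Count ZIOs entering the vdev queue, broken out by class, every 5 seconds
dtrace -n '
fbt::vdev_queue_io:entry { @zios[args[0]->io_priority] = count(); }
tick-5s { printa("class %d: %@d\n", @zios); trunc(@zios); }'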
 

scurrier

Thanks for trying. I found the tunables; they're listed in the first post. You're right, many default to 10. I find that strange, since to my understanding most disk controllers can handle an NCQ queue of 32, meaning that by default some throughput may be left on the table.

I'm also not sure what queues gstat and iostat are showing. Is that I/O the disk already has active, or are those values taken from a higher-level queue above the disk controller?

There is also a discrepancy: if I add up the vfs.zfs.vdev.*_max_active values, the total is less than what I have seen outstanding on some of the disks in these tools. I have seen queue lengths of 70+ in both, which is more than the 40-50 that should be possible according to the ZFS tunables.
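
For reference, these are the views where I'm seeing those numbers; as far as I can tell, gstat's L(q) and iostat's qlen both count requests that have been handed to the device and not yet completed, one layer below the ZFS vdev queues, and neither breaks things out by I/O class:

# GEOM-level view; the L(q) column is the outstanding request count per provider
gstat -p

# devstat view; qlen is the number of started-but-not-finished transactions per device
iostat -x 1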
 

c32767a

Lots of stuff is buried in dtrace.

This might be a good starting spot:
https://gist.github.com/szaydel/bffe0e637ef3c0c884ad3f6136d2ae16
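
For anyone landing here later, here is a rough sketch of the kind of thing those scripts do (untested, and not taken from the gist). It assumes fbt can attach to the static vdev_queue_pending_add()/vdev_queue_pending_remove() functions in vdev_queue.c (they may be inlined out on some builds) and that CTF typing resolves args[1] to a zio_t:

# Per-class activity over 10-second windows: how many I/Os each class queued
# to the vdev versus how many completed
dtrace -n '
fbt::vdev_queue_pending_add:entry    { @queued[args[1]->io_priority] = count(); }
fbt::vdev_queue_pending_remove:entry { @done[args[1]->io_priority] = count(); }
tick-10s {
    printa("queued class %d: %@d\n", @queued);
    printa("done   class %d: %@d\n", @done);
    trunc(@queued); trunc(@done);
}'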
 