Usually our TrueNAS and FreeNAS servers perform well, with "iostat -x" showing hundreds of megabytes/second read or written and the %b column (%busy, the same as %util in Linux) at only a couple of percent for each disk. But every few months performance goes to hell, with total throughput of only 1 or 2 MB/s and %b for a group of disks at 99% or 100%. While this is happening a simple ls can take 5 minutes. I assume this is because a client is doing a lot of random I/O that keeps the heads moving for very little data transfer.

How do I locate that job among the many jobs from many users on many NFS clients? On the client computer I can find out how many bytes are transferred by each process, but that number is small for all jobs: the one doing random I/O doesn't transfer more bytes than the jobs doing sequential I/O, it just exercises the heads more. I need this information so I can contact the user doing random I/O and work with them to find another approach.
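To make the distinction between byte counts and operation counts concrete, here is a minimal sketch of the kind of per-process measure that might separate the two cases. It assumes the NFS clients run Linux and that /proc/<pid>/io is readable (normally that requires root for other users' processes). The function names are hypothetical, and a low bytes-per-call average is only a hint of small scattered I/O, not proof of it.

```python
import os


def parse_proc_io(text):
    """Parse the 'key: value' lines of a Linux /proc/<pid>/io file into ints."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            stats[key.strip()] = int(value)
    return stats


def bytes_per_call(stats):
    """Average bytes moved per read/write syscall, or None if no calls were made."""
    calls = stats.get("syscr", 0) + stats.get("syscw", 0)
    total = stats.get("rchar", 0) + stats.get("wchar", 0)
    return total / calls if calls else None


def scan(proc_root="/proc"):
    """Return (avg_bytes_per_call, calls, pid) tuples, smallest average first."""
    rows = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, name, "io")) as f:
                stats = parse_proc_io(f.read())
        except OSError:
            # Process exited, or we lack permission to read its counters.
            continue
        avg = bytes_per_call(stats)
        if avg is not None:
            calls = stats.get("syscr", 0) + stats.get("syscw", 0)
            rows.append((avg, calls, int(name)))
    return sorted(rows)


if __name__ == "__main__":
    # Assumption: processes with many calls and few bytes per call are the
    # ones worth a closer look; this does not prove the access is random.
    for avg, calls, pid in scan()[:20]:
        print(f"pid {pid}: {calls} calls, {avg:.0f} bytes/call")
```

These counters are cumulative since each process started and cover all files, not only those on NFS mounts, so sampling twice and comparing the differences during a slowdown would likely be more informative than a single snapshot.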
thanks
dan feenberg
NBER