Just an observation - I/O patterns in combined NAS/VM applications

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
I have become a bit obsessed with Grafana lately and noticed a pattern that made me look twice at my disk I/O.
For all my disks a typical graph at any particular time looks like this:

[Attached screenshot (Bildschirmfoto 2021-02-21 um 22.37.51.png): disk I/O graph showing steady writes and a completely flat read line]


This is true for my spinning disks where I store "stuff" as well as for my SSDs where all my VMs and jails reside. I was so irritated by the completely flat line for "read" that I quickly fired up dd if=/dev/ada0 of=/dev/null bs=1m for a minute to check whether I had misconfigured my graphing and was missing all the read operations that surely must take place. Or must they?

My system is expanded almost to the max for a home server, see below for specs. With 64G of RAM there's a lot of space for ARC. And after some thinking I came up with this explanation:
  • everything that is read is in ARC already
  • so all visible disk access is writes
  • blocks that are written are kept in ARC, because subsequent reads are common
  • so even then we still see no reads (quick sanity check below)
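
If you don't want to take the graphs' word for it, the ARC hit rate is easy to check directly. A minimal sketch in Perl, assuming a FreeBSD-based system where the counters live under the kstat.zfs.misc.arcstats sysctl tree (on ZFS-on-Linux the same counters are in /proc/spl/kstat/zfs/arcstats):

#!/usr/bin/env perl
use strict;
use warnings;

# Cumulative ARC hit/miss counters since boot (FreeBSD sysctl names).
chomp(my $hits   = qx(sysctl -n kstat.zfs.misc.arcstats.hits));
chomp(my $misses = qx(sysctl -n kstat.zfs.misc.arcstats.misses));

# Fraction of reads served from RAM instead of the disks.
printf "ARC hit ratio: %.1f%% (%s hits, %s misses)\n",
    100 * $hits / ($hits + $misses), $hits, $misses;

On a box like mine, a ratio in the high nineties would explain the flat read line all by itself.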
What is "puzzling" about this to an old guy like me - and probably to some of you, too - is that we (I?) tend to think of the applications I run as "read mostly": web applications with databases. The surprising fact is that with modern architectures the access patterns we see at the device level are exactly the other way round.

Just wanted to share a "eureka" moment.

Kind regards,
Patrick
 

RegularJoe

Patron
Joined
Aug 19, 2013
Messages
330
ZFS is said to live and die by the cache....

So here it is living, and kicking ass and taking names.

I bet if you graphed your ZFS stats at the same time you would see your reads. A 90-99% hit rate is a lot. It was only recently that Dell PERC cards got 4 GB of NV cache; you have 50-60 GB of cache in your system, depending on apps/jails and such. And if a hypervisor like VMware is in the mix, it buffers disk writes like no other hypervisor; there are times when VMware posts better write numbers than bare metal, purely due to caching. I find that entertaining.
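
For graphing you want per-interval rates rather than the since-boot totals, so sample the counters and take deltas. A rough sketch along the same lines, again in Perl against FreeBSD's sysctl counters (the 10-second interval is an arbitrary choice):

#!/usr/bin/env perl
use strict;
use warnings;

# Print one "epoch-time hit-ratio" line per interval, suitable for
# feeding into whatever graphs it. Deltas, not since-boot totals.
my $interval = 10;    # seconds, arbitrary

sub counters {
    chomp(my $h = qx(sysctl -n kstat.zfs.misc.arcstats.hits));
    chomp(my $m = qx(sysctl -n kstat.zfs.misc.arcstats.misses));
    return ($h, $m);
}

my ($h0, $m0) = counters();
while (1) {
    sleep $interval;
    my ($h1, $m1) = counters();
    my ($dh, $dm) = ($h1 - $h0, $m1 - $m0);
    # An idle interval counts as 100% rather than dividing by zero.
    printf "%d %.1f\n", time(), ($dh + $dm) ? 100 * $dh / ($dh + $dm) : 100;
    ($h0, $m0) = ($h1, $m1);
}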
 

ChrisRJ

Wizard
Joined
Oct 23, 2020
Messages
1,919
As to read vs. write in the context of web servers: I remember an article from the late 1990s in the German magazine iX, when they had to upgrade their then-in-house web server to something really beefy (Sun E5000?). The main reason mentioned was that, contrary to common belief, a web server generates more write I/O (for access logging) than read I/O. Of course that was when CGI was mostly done in Perl.

To completely wander off topic (I am just in the mood :wink: ) w.r.t. Perl: Back in 2003 I wrote a program for a large email provider (30 million+ accounts) to automatically remove mails from trash once the retention period had expired. Probably my first program where error handling and logging made up about 80% of the overall code. Those were the days ... :cool:
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
w.r.t. Perl: Back in 2003 I wrote a program for a large email provider [...]
Hehe ... my last project in Perl was an email accounting script. I went from the customer's in-house implementation, which ran in O(n^2), to one that ran in O(n) - n being the number of emails, i.e. lines in the Sendmail log - plus O(m) in memory, m being the number of distinct addresses to account for. Still quite proud of that achievement. Perl hashes rocked back then.
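
For the curious: the O(n) trick is nothing more exotic than a single pass with a hash keyed on the address, instead of re-scanning the log once per address. A toy sketch of the idea, not the original script - the one-record-per-line input format here is made up and far simpler than a real Sendmail log:

#!/usr/bin/env perl
use strict;
use warnings;

# One pass over the log: O(n) in lines, O(m) memory in distinct
# addresses. Hypothetical toy input format: "<address> <bytes>".
my %bytes;
while (my $line = <>) {
    my ($addr, $size) = split ' ', $line;
    $bytes{$addr} += $size;
}
printf "%-40s %12d\n", $_, $bytes{$_} for sort keys %bytes;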
 