Setup: TrueNAS-12.0-U5 | Mobo: HP DL380e G8 | CPU: 2 x Xeon E5-2450 | RAM: 128GB | Boot HDD: 1 x 3TB IBM-ESXS | Data HDD: 12 x 10TB Seagate IronWolf Pro | SSD cache: 1 x 2TB Samsung 860 Pro | HBA: LSI SAS 9207-8i PCIe 3.0 | NIC: onboard 1GbE + SolarFlare SFN7002F Flareon dual-port 10GbE SFP+ PCIe 3.0 | 1 pool of 2 x 6-drive RAIDZ2 vdevs, dedupe off, recordsize=1M | NFS v4
I have a TrueNAS server built as above. It is dedicated to a specific workload that is mostly read-heavy and has benefited reasonably well from the RAM and SSD caching so far. However, there is also a write-heavy workload, and since optimizing the client-side CPU efficiency of that workload, the TrueNAS server's write throughput has become a serious bottleneck. When running 100+ writer processes, they all block severely on writes to the NAS; in this state each of the 12 HDDs reports a write rate of about 10 MB/s (120 MB/s total) and averages 1-5 pending disk operations.
The pool layout of two 6-drive RAIDZ2 vdevs was chosen for good storage capacity, accepting some compromise on IO throughput. However, 120 MB/s is worse than I was expecting, and simple tests, either over NFS or locally on the server, show the hardware is capable of better IO. Just copying 6 files concurrently over NFS tops out at about 500 MB/s on writes (the files were already cached, so no reads were involved). A local dd test on the TrueNAS server yields 2.5 GB/s write and 5 GB/s read on a 400 GB file.
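For reference, here is a rough Python stand-in for the local sequential-write test (the real test was a plain dd on a 400 GB file; the path and size below are placeholders, not what I actually ran):

```python
# Rough equivalent of the local sequential-write test, assuming a dataset
# mounted at /mnt/tank/bench (placeholder path). 1 MiB writes to match
# recordsize=1M; a random block avoids writing trivially compressible zeros.
import os, time

PATH = "/mnt/tank/bench/testfile"   # placeholder; point at a dataset on the pool
BLOCK = os.urandom(1 << 20)         # 1 MiB per write
TOTAL = 8 * (1 << 30)               # 8 GiB for a quick run; my real test used 400 GB

start = time.time()
with open(PATH, "wb") as f:
    for _ in range(TOTAL // len(BLOCK)):
        f.write(BLOCK)
    f.flush()
    os.fsync(f.fileno())            # make sure the data actually reaches the pool
elapsed = time.time() - start
print(f"sequential write: {TOTAL / elapsed / 1e6:.0f} MB/s")
```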
To find out the write requirements of my workload, I temporarily pointed its writes at a ramdisk instead of TrueNAS to see how fast it writes when blocked on CPU rather than disk. In this test the workload was indeed CPU-bound and wrote to the ramdisk at about 0.3 MB/s per process, spread across 48 separate files. So to run, say, 100 processes doing the same thing, I calculate it needs a write throughput of only ~30 MB/s, but of course spread across 4800 files, hmm.
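One idea I had for sanity-checking that projection without the CPU-bound part of the job is a throttled stand-in writer, roughly like the sketch below (paths, chunk size, and run time are placeholders, not my actual harness); launching ~100 copies of it against the NFS mount should replay the ~30 MB/s / 4800-file pattern:

```python
# Throttled stand-in for one writer process, assuming the workload does small
# appends round-robin across its 48 table files at ~0.3 MB/s total. Running
# ~100 of these against the NFS mount reproduces the aggregate write pattern
# without the CPU-bound work. All names below are placeholders.
import os, sys, time

target_dir = sys.argv[1]            # e.g. /mnt/nfs/replay/proc_017
NUM_FILES = 48
CHUNK = b"x" * 32 * 1024            # 32 KiB appends; adjust to match the real writers
RATE = 0.3e6                        # bytes/sec per process, from the ramdisk test
DURATION = 300                      # seconds to run

os.makedirs(target_dir, exist_ok=True)
files = [open(os.path.join(target_dir, f"table_{i:02d}.bin"), "ab")
         for i in range(NUM_FILES)]

written, start, i = 0, time.time(), 0
while time.time() - start < DURATION:
    files[i % NUM_FILES].write(CHUNK)
    written += len(CHUNK)
    i += 1
    # sleep just enough to hold this process at ~0.3 MB/s
    lag = written / RATE - (time.time() - start)
    if lag > 0:
        time.sleep(lag)

for f in files:
    f.close()
print(f"{written / (time.time() - start) / 1e6:.2f} MB/s across {NUM_FILES} files")
```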
Is this just an obvious case of thrashing the heads with random writes? The individual writes themselves should be sequential, but that is likely ruined by the concurrency.
The question, then, is how to refactor my workload to use the TrueNAS write throughput more efficiently.
The options I am thinking about are: introducing a middle tier that writes concurrently across only, say, 4 threads (a rough sketch of what I mean is below); redesigning the file layout so that each writer writes to only one file; or some mix of both. The files are HDF5 tables, and I'm not sure whether it would achieve anything if the 48 tables were contained in one large HDF5 file. If there were a strong case for it, I could also consider rebuilding the NAS.
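To make the middle-tier option concrete, here is a minimal sketch, assuming the worker processes can hand off serialized records instead of writing directly (in reality the hand-off would need some IPC and the appends would go through PyTables/h5py rather than raw byte appends; names and the queue are placeholders):

```python
# Minimal sketch of the middle-tier idea: a small pool of writer threads drains
# a shared queue, so only a few files receive appends at any instant even though
# thousands of logical streams exist.
import queue, threading

NUM_WRITER_THREADS = 4                      # the concurrency cap I have in mind
work = queue.Queue(maxsize=10_000)          # bounded queue gives producers backpressure

def writer_loop():
    while True:
        item = work.get()
        if item is None:                    # shutdown sentinel
            work.task_done()
            break
        path, payload = item
        with open(path, "ab") as f:         # at most 4 files are "hot" at any moment
            f.write(payload)
        work.task_done()

threads = [threading.Thread(target=writer_loop, daemon=True)
           for _ in range(NUM_WRITER_THREADS)]
for t in threads:
    t.start()

# Producers would enqueue (path, serialized-record) pairs, e.g.:
#   work.put(("/mnt/nfs/data/table_0007.h5", record_bytes))
```

The same tier could also batch records per file before flushing, so that each append hitting the NAS is a larger, more sequential write.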
thanks for any suggestions.