more than 1GB/s write (yeah!!) but only 200-530MB/s read

Rob_HH

Dabbler
Joined
Oct 17, 2020
Messages
10
TrueNAS-12.0-RC1 installed on a Samsung 850 Evo SSD, no virtualization
Gigabyte X399 Aorus Extreme with Threadripper 2950X
128 GB of ECC Memory
Broadcom LSI 9305-16i SATA / SAS HBA Controller
12x Seagate Exos 16TB in one ZFS1-Pool (ST16000NM001G-2KK103 to be exact)
8x Seagate Exos 12TB in a second ZFS1-Pool (ST12000NM0008-2H to be exact)
Intel X550-T2 network adapter

The client computers I am copying files to and from the NAS with are i7 or i9 machines running Windows 10 Pro
Intel X550-T1 network card
As working drives: 4x Crucial MX500 2TB configured as a striped RAID (Windows software RAID); CrystalDiskMark reports approx. 1800 MB/s read and write.
I am dealing with very large video files (from 800 GB up to 2.5 TB!)

So, the prelude above is much longer than my question: Writing to the NAS results in marvelous write speeds of more than 1 GB/s (who could ask for more?), but reading from the NAS gives a disastrous 200-530 MB/s. I'd like to get the same speed for reading as I have for writing, of course.

I searched the net for hours and days and tried to fix the network adapter settings... nothing helped. I started with the whole project a couple of days ago using FreeNAS (11.3) and today switched to TrueNAS in the hope of improvement. Unfortunately, it all stayed the same.
I tried with and without file compression - no difference.

What can I do to speed up my reading from the NAS?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
What's a "ZFS1" pool?

Do you mean RAIDZ1?

Sequential read speeds from ZFS are dependent on the type of data, fragmentation levels and pool occupancy rates, and overall pool design. If you are reading large sequential data from a highly fragmented RAIDZ1 pool, I'd consider 300MBytes/sec to be in the realm of reasonable, with room for improvement.
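
You can check both of those numbers from the shell; for example (the pool name in the output will be whatever you called yours):

zpool list -o name,size,allocated,free,capacity,fragmentation,health

The fragmentation column is free-space fragmentation and capacity is how full the pool is; high numbers in either will hurt sequential reads.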

In order --

The Intel X550-T2 is frequently present in "problematic" configurations; it is not a good choice of card IMHO. There is a 10 Gig Networking Primer available on this forum which discusses better choices.

Reading from the NAS will tend to be slower than writing, because unless data is present in the ARC or L2ARC, ZFS must actually go out to disk and retrieve your data. This breaks down into several subareas:

- Add more RAM so that more is cached and served from RAM
- Add L2ARC so that more is cached in L2ARC
- Make sure your fragmentation rates are low
- Make sure you have plenty of free space on the pool (50% or more) so that ZFS has to do less seeking to find empty space (you will be rewarded when you try to read the data back).

The nasty truth here is that while a HDD might be capable of 100-200MBytes/sec when doing sequential read, that drops to 100-200KBytes/sec (that is a K, kilo, 1000) if it is having to seek for every sector. ZFS gets a lot of its magic speed when you throw gobs of resources at it, so if you are keeping last month's unneeded video files around "just in case", try deleting all of them, then write some new files, and see if it is much faster reading those newly written files back.
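
If you want to see how much of your read traffic is actually being served from RAM, the ARC counters are exposed via sysctl on FreeBSD-based TrueNAS, e.g.:

sysctl kstat.zfs.misc.arcstats.size kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses

A poor hit/miss ratio while you are reading means nearly everything is coming off the disks, which is exactly the case where fragmentation and pool layout dominate.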
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Also, a note, there is absolutely nothing, not even going all SSD, that will make your system read as fast as it writes. ZFS writes to memory and then stages it out to disk. The speed at which ZFS can write your incoming data to memory is infinitely faster than any storage device.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
I am dealing with very large video files (from 800 GB up to 2.5 TB!)
With this size of files and desire for NLE, I would suggest that you do your editing on local copies (on a fast NVMe device or SSD RAID) and leverage your network share as a repository/storage location. As @jgreco points out, unless your data is cached in ARC you likely won't be able to reach the 1GB/s reads you're gunning for, and your file size precludes that. Even if you did go all-SSD, you'd need to spend a sizeable amount on high-end disks in the array - and even then it's no guarantee, and would still probably be an inferior editing experience vs. doing it locally based on network latency.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
For more performance, you should have smaller VDEVs... there is a capacity/performance tradeoff you have to make.
With 20 drives, you could have 10 VDEVs in 1 pool, or 2 VDEVs in 1 pool, or 1 VDEV in each of 2 pools.
The latter is the slowest config (sounds like what you have)..... the 10 VDEVs (mirrors) is the fastest config. Probably 3X-5X faster.
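
Purely as a sketch of the 10-mirror layout (you would build it in the TrueNAS pool creation screen, and the device names da0-da19 and pool name "tank" here are made up), on the command line it would look roughly like:

zpool create tank \
  mirror da0 da1 mirror da2 da3 mirror da4 da5 mirror da6 da7 mirror da8 da9 \
  mirror da10 da11 mirror da12 da13 mirror da14 da15 mirror da16 da17 mirror da18 da19

Each "mirror daX daY" group is one VDEV; reads and writes get spread across all ten of them, which is where the extra performance comes from.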
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
For more performance, you should have smaller VDEVs... there is a capacity/performance tradeoff you have to make.
With 20 drives, you could have 10 VDEVs in 1 pool, or 2 VDEVs in 1 pool, or 1 VDEV in each of 2 pools.
The latter is the slowest config (sounds like what you have)..... the 10 VDEVs (mirrors) is the fastest config. Probably 3X-5X faster.

That's merely an optimization that will help somewhat after other issues such as fragmentation have been demonstrated not to be a problem.

This hinges on what is implied by "very large video files". Creating more vdevs is not likely to help much when you have something like torrenting TV and movies, and is actually likely to hurt; the workload results in natural fragmentation, and the reduction in available space creates more space stresses. If this is a video editing situation, RAIDZ2 is a good choice for sequential file access; mirrors could be faster, but only if you're keeping pool occupancy at reasonable rates.

It would be better to shoot for 50-60% pool utilization, write some new files, and see if that improves the situation.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
It would be useful to confirm disk utilization first... but very large VDEVs do turn each 128K read into much smaller disk I/Os, which are very inefficient. 400 MB/s at 128K recordsize = 3200 IOPS. If the VDEV is 12 wide, that might be a lot of disk I/Os. Would be interesting to confirm the disk IOPS seen.
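
If you want to see those per-disk numbers directly, something like this (pool name is a placeholder) prints per-vdev and per-disk operations and bandwidth every 5 seconds while a read test is running:

zpool iostat -v tank 5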
 

Rob_HH

Dabbler
Joined
Oct 17, 2020
Messages
10
Thank you very much for all your replies!

Firstly, please excuse me if I am not expressing myself 100% correctly, as this TrueNAS adventure is my first more in-depth contact with Linux or FreeBSD. I know "ls -la" and that there is a text editor called "nano", but not very much more.

Is there any "Starting with FreeBSD for Dummies"-like tutorial that you can recommend? I want to learn.

I will be more precise:
  • I have configured 2 Pools: one with 8x 12TB drives and another one with 12x 16TB drives.
  • Each pool has 1(!) dataset that I access from a Windows machine via an SMB share. Both datasets are RAIDZ1
  • no vdevs
  • Pool 1 is 80% full, Pool 2 is empty
Workflow:
  • One Windows computer is for ingesting footage in 1080p50 from an AJA capture card (fed by a Teranex frame synchronizer) onto an internal SSD RAID. When finished, the video files are copied onto the NAS.
  • The other Windows computer is my daily workstation. Here I copy the video files from the NAS onto the internal SSD RAID for editing. When finished, the final video is exported onto the internal SSD.
  • After a couple of projects are completed, files are manually archived onto an LTO-8 tape. This is done on the Windows computer I use for ingesting footage as well (but not at the same time as ingesting video, of course).
  • To make a long story short: I do not use the TrueNAS storage server to directly work on its shares as the files are too large.
Other issues you brought up:
  • Fragmentation: Theoretically there is none, as the current data on Pool 1 has been copied from a Synology NAS, strictly one file after the other. I want to get rid of Synology; that's the reason for using TrueNAS. But: how can I check the file fragmentation on my TrueNAS server?
  • Pool 2 is empty. There I have exactly the same speed problems: writing is fast, reading is a mess. So the amount of data already stored is not responsible for my slow reading speed.
  • ARC and L2ARC: My server has 128 GB of DDR4 ECC memory. My files are all far bigger than that. So theoretically, if the disk configuration were slower than 1 GB/s, then while writing a 2 TB file continuously to the NAS the cache would have to fill up at some point and the write speed should drop down to the de-facto write speed of the hard drives, correct? But this does not happen: the whole 2 TB file gets written at 1 GB/s. This implies that the drives in the given configuration are truly capable of writing (and I assume reading as well) at the full bandwidth I can access via the network (10 GbE).
  • Important fact that I did not state clearly: while reading, the speed fluctuates irregularly between approx. 200 and 540 MB/s.
  • Network card: I ordered a Chelsio T520-CR - maybe it works better with FreeBSD?
    I read this thread https://www.ixsystems.com/community...ough-with-truenas-12-0-rc1.87965/#post-610058
    And there they linked to https://calomel.org/freebsd_network_tuning.html - and there they recommended the above network card.
Having laid things out in a little more detail - any ideas what I can do for higher read performance?
Again, many many thanks to you all!
 

Rob_HH

Dabbler
Joined
Oct 17, 2020
Messages
10
Here are some statistics:
truenas-rwspeeds.PNG

truenas-iospeeds.PNG
 

Rob_HH

Dabbler
Joined
Oct 17, 2020
Messages
10
I wrote "no vdevs" which is utterly nonsense as I found out. I have 1 vdev in each pool and that is the dataset I setup as described before.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
This pool is only 8 drives. 1 VDEV is the lowest performance configuration. Every read request is turned into 7 disk reads.... assuming an 8-wide Z1 configuration. The read IOPS look very high...

If you really NEED the extra performance, you could use your pool 2 drives to make 1 or 2 more VDEVs...... or migrate the data from Pool-1, and then rebuild the pool.
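
As a rough sketch of the migration step (dataset and pool names below are placeholders, adjust to yours), a recursive snapshot plus send/receive copies everything over:

zfs snapshot -r pool1/videos@migrate
zfs send -R pool1/videos@migrate | zfs recv -F pool2/videos

Once the data is verified on the second pool, pool 1 can be destroyed and rebuilt with the new VDEV layout.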

I assume the zfs recordsize is 128K... Increasing that to 512K may also help if you keep the larger VDEV. However, it only helps for newly written data.
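
Checking and changing it is a one-liner each (the dataset name is a placeholder); remember that existing files keep the recordsize they were written with:

zfs get recordsize pool1/videos
zfs set recordsize=512K pool1/videos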
 

Rob_HH

Dabbler
Joined
Oct 17, 2020
Messages
10
@morganL Thank you. I could easily use the empty pool to give it a new configuration and copy the files from the old pool to it, then make changes to the other pool.
But: what exactly do I have to do? To be honest, I haven't understood the concept of several vdevs and how it can improve read speeds. Can you please help me, or point me to readings/tutorials that help clarify things for me?
 

Rob_HH

Dabbler
Joined
Oct 17, 2020
Messages
10
Thank you @Redcoat, I jumped into the philosophy and did the following configurations:
- 12 drives in 1 vdev, striped: 1 GB/s write to NAS, approx. 215-400 MB/s read from NAS
- 12 drives in 12 vdevs, striped: 1 GB/s, approx. 215-400 MB/s
- 4 drives each in 3 vdevs as RAID-Z1: 1 GB/s, approx. 300-485 MB/s

This I cannot accept. So maybe the NIC? On Wednesday I will have a Chelsio T520-CR.
If you have any suggestions on how to bring my read speeds up... I don't have any ideas anymore...
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Thank you @Redcoat, I jumped into the philosophy and did the following configurations:
- 12 drives in 1 vdev, striped: 1 GB/s write to NAS, approx. 215-400 MB/s read from NAS
- 12 drives in 12 vdevs, striped: 1 GB/s, approx. 215-400 MB/s
- 4 drives each in 3 vdevs as RAID-Z1: 1 GB/s, approx. 300-485 MB/s

This I cannot accept. So maybe the NIC? On Wednesday I will have a Chelsio T520-CR.
If you have any suggestions on how to bring my read speeds up... I don't have any ideas anymore...

When you are doing the Read testing... what tool are you using?

It sounds like you may have insufficient queue depth...the client hasn't issued enough Reads to utilize the NAS. If you look at the disk utilization, IOPS should be much lower.

With Writes, the cache and low latency solve the problem. With Reads, you could run multiple tests in parallel to confirm or use fio as a test tool.
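
As an example of the kind of fio run I mean (the path, size and job count are just placeholders - point it at a directory on the pool and pick a total size well above your RAM so the ARC can't hide the disks):

fio --name=seqread --directory=/mnt/pool1/test --rw=read --bs=1M --size=32G --numjobs=4 --iodepth=16 --ioengine=posixaio --group_reporting

The numjobs/iodepth settings are what provide the queue depth that a single Windows file copy never generates.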
 

Rob_HH

Dabbler
Joined
Oct 17, 2020
Messages
10
As a test tool, I am simply doing a copy from/to a Windows client.

After a lot of testing I switched the hardware to an ordinary Intel i5-9600K with 32 GB RAM (of course non-ECC), just for testing. And voilà - full bandwidth in both directions, even with the silly pool "8 drives in a Z1 vdev". As I used the same LSI HBA controller and 10 GbE NIC, it must have something to do with the X399 mainboard and/or the AMD Threadripper integration. As I don't want to use TrueNAS without ECC memory, I ordered a Supermicro X10SRL-F and a Xeon E5-2620 v4. I think with these the problems should disappear - can you agree? :smile:

Best wishes
Rob
 

tangles

Dabbler
Joined
Jan 12, 2018
Messages
33
I wonder if it had something to do with the PCIe slots being routed through a PLX chip, where some are plumbed directly into the CPU/memory and the others go through some PLX... Admittedly I haven't looked up the X399 mobo, but maybe swapping your HBA to another PCIe slot may fix the issue…
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
As a test tool, I am simply doing a copy from/to a Windows client.

After a lot of testing I switched the hardware to an ordinary Intel i5-9600K with 32 GB RAM (of course non-ECC), just for testing. And voilà - full bandwidth in both directions, even with the silly pool "8 drives in a Z1 vdev". As I used the same LSI HBA controller and 10 GbE NIC, it must have something to do with the X399 mainboard and/or the AMD Threadripper integration. As I don't want to use TrueNAS without ECC memory, I ordered a Supermicro X10SRL-F and a Xeon E5-2620 v4. I think with these the problems should disappear - can you agree? :smile:

Copying files is generally not a good bandwidth test tool..... it is very sensitive to latency and provides poor control of queue depth. Fio is better. You also have to be very careful to clear the read cache ahead of any testing.

Could it be AMD Threadripper not playing well with TrueNAS.... it's possible. iXsystems doesn't sell any Threadripper-based products, so we don't have performance validation for them. I'd like to hear of any comparisons people have done. I'd suggest upgrading to 12.0 RELEASE.
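
On the read-cache point: one crude way to flush the ARC for a pool between test runs (assuming nothing else is using the pool; "tank" is a placeholder name) is simply to export and re-import it:

zpool export tank
zpool import tank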
 

Rob_HH

Dabbler
Joined
Oct 17, 2020
Messages
10
So I have upgraded to TrueNAS 12, and this evening I could start my test with an X10SRL-F mainboard, a Xeon E5-1620 v3 CPU (which I intended to change to an E5-2620 v4), and 64 GB of registered ECC DRAM. I configured 12 disks in 3 RAID-Z vdevs. Just out of the box: 850-950 MB/s write and a steady 1.00-1.05 GB/s read.
OK - it's the Threadripper or its mainboard, for whatever reason.
Well - and the old E5-1620 v3 is doing an alright job. I guess there is no need to upgrade to the E5-2620 v4 IMHO. What is your opinion about this?
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
With a small number of HDDs, there is usually no need for many CPU cores or vCPUs. Sometimes it's RAM that is the limiter, but mostly it's the drives and the VDEV configuration. The extra cores are useful for VMs or plugins/jails/containers.
 