Large files (150 GB) not working from a 7 TB PCIe SSD L2ARC

James Gardiner

Dabbler
Joined
Jul 14, 2017
Messages
19
Hi,
I have been doing a lot of playing and testing, building a server for a specific need. It's mostly working; however, it's not achieving exactly what I need, so I am coming to the source of truth. I am using FreeNAS-11.3-U4.1.

I deal with a lot of 150 GB media files. They come in and then have to be munched on: QC, renders, etc., and read numerous times by different application servers over 10 GbE ethernet. Currently I have a 6x4 TB SATA drive array. I added 2x 4 TB PCIe SSDs as L2ARC (roughly 7 TB in total). It works great; I can get over 2000 MB/s reads (using dd) from files once they are in the L2ARC.

I added the following tunables:
Code:
vfs.zfs.l2arc_noprefetch   0          loader   # cache streaming data: yes
vfs.zfs.l2arc_write_boost  400000000  loader
vfs.zfs.l2arc_write_max    400000000  loader
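These are loader tunables, so they only take effect after a reboot; the running values can be double-checked with something like:
Code:
# confirm the loader tunables actually took effect
sysctl vfs.zfs.l2arc_noprefetch vfs.zfs.l2arc_write_boost vfs.zfs.l2arc_write_max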

This did help a lot; however, very large files do not appear to make it into the L2ARC. I'm not sure what the size cutoff is, but when reading the video essence it drops back to disk sequential speeds. Audio and smaller files go as fast as the computer can deal with them, so I imagine they are in the L2ARC. Yes, I have been looking at the graphs etc. to check what's happening.
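For anyone following along, the raw L2ARC counters can also be read straight from the kernel and compared before and after re-reading one of the big files, e.g.:
Code:
# L2ARC hit/miss/size counters; sample before and after the dd read
sysctl kstat.zfs.misc.arcstats.l2_hits kstat.zfs.misc.arcstats.l2_misses kstat.zfs.misc.arcstats.l2_size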

I have done a lot of googling, all over and specifically here, but have been unable to find a helpful discussion on this topic.

Is there any secret sauce to get extremely large files into L2ARC? Even though 150 GB is very large, it's nothing compared to 7 TB.

I imagine there may be other TUNABLES I can pull on....
Any advice appreciated.

Thanks,
James
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
What are your system hardware specs?

In particular, how much RAM is installed in your FreeNAS server?

The rule of thumb with L2ARC is to first install the maximum amount of RAM before installing an L2ARC device -- because RAM is much faster than disk.

Also, an L2ARC device itself requires RAM for overhead; the larger it is, the more RAM it takes to administer it.
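To put a rough number on that: each record cached in L2ARC keeps a header in RAM. The exact header size varies between ZFS versions, so the 88 bytes and the default 128 KiB recordsize below are assumptions, but the order of magnitude is the point:
Code:
# ~7 TiB of L2ARC at 128 KiB records, ~88 bytes of ARC header each
echo "7 * 2^40 / 2^17 * 88 / 2^30" | bc -l    # roughly 4.8 GiB of RAM just to index the L2ARC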
 

James Gardiner

Dabbler
Joined
Jul 14, 2017
Messages
19
The test system I am building is an R710; it only has 24 GB of RAM at the moment, but will have 72 or 96 GB depending on what I have around when it goes into production.
It has an onboard SATA-HBA with the 6x4 TB HDDs, and 2x M.2 PCIe cards each with a 4 TB SSD. One of the cards also holds a second M.2 SATA SSD that I use as the boot disk, wired to the onboard SATA connector. I also have a 10 GbE ethernet card and an external SATA HBA for connecting the 12-bay M1200 unit once I am happy with its performance and reliability.
Slot 1: Half-Length (6.6" Factory Installation) / Full-Height (x8 connector), x4 link width 10Gbps, SATA HBA for 6 front disks.
Slot 2: Full-Length (12.2" Factory Installation) / Full-Height (x8 connector), x8 link width 20Gbps, M.2 nvme 4TB, M.2 SATA to on board SATA
Slot 3: Full-Length (12.2" Factory Installation) / Full-Height (x8 connector), x8 link width 20Gbps, M.2 nvme 4TB
Slot 4: Half-Length (6.6" Factory Installation) / Full-Height (x8 connector), x4 link width 10Gbps, SATA HBA for M1200 JBOD (not yet in use)
Slot 5: Half-Length (6.6" Factory Installation) / Full-Height (x8 connector), x4 link width 10Gbps, Chelsio CC2-N320E-SR 10gbe sfp+
Not sure what card is in what slot at the moment.
It has 2 Xeons, 4 cores/8 threads each, for a total of 8 cores/16 threads, at 2.2 or 2.4 GHz; I'm not sure off the top of my head.

The ethernet card is a Chelsio CC2-N320E-SR 10 GbE SFP+ card.
I am not using jumbo frames currently; I want to avoid them if possible.

Since I wrote the message this morning, I have been doing more tests. Typically I do iperf3 tests between server and host before I do a dd read test over an NFS share. I get a typical 9.x Gbps for the 10 GbE connection, but when I run dd if=LARGE_FILE of=/dev/null bs=4M status=progress, I get a wide variety of results depending on the server I am on, even though the iperf3 results are acceptable.
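For completeness, the per-client test sequence is essentially the following (server IP and file name are placeholders):
Code:
# raw network throughput, client -> server
iperf3 -c SERVER_IP
# sequential read of a large file over the NFS mount
dd if=/nfs_share/LARGE_FILE of=/dev/null bs=4M status=progress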

I thought it may have been the L2ARC, but after today's adventures I am not so sure.
One server gets 200 MB/s (R720, VMware 6.5, Ubuntu VM),
another 400 MB/s (desktop Ubuntu),
and a third 500 MB/s (R610, Proxmox, Ubuntu).
Mounting the share via fstab:
10.11.2.222:/mnt/testPool3/nfs_share /nfs_share nfs rw,sync,suid 0 0

On the FreeNAS server itself I get 2150 MB/s.

I find it very strange that I get such different numbers on all the different units.
Technically, from my understanding, I should be able to achieve about 950 MB/s (including overhead) via the 10 GbE share if all goes well.
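That ceiling is just arithmetic (the exact protocol overhead depends on TCP and NFS settings, so treat it as a rough figure):
Code:
# 10 Gbit/s divided by 8 bits per byte = 1250 MB/s on the wire;
# ~950-1100 MB/s of NFS payload is a realistic best case after overhead
echo "10 * 1000 / 8" | bc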

Did I miss any detail you would like?

Thanks,
James
 

James Gardiner

Dabbler
Joined
Jul 14, 2017
Messages
19
Ok, sorry to have led people down this garden path.

I just did an iperf3 test in the opposite direction and, bingo, there is the problem:

Code:
[jamieg@freenastest /mnt/testPool3/nfs_share]$ iperf3 -c 10.11.1.196
Connecting to host 10.11.1.196, port 5201
[  5] local 10.11.1.222 port 49049 connected to 10.11.1.196 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   188 MBytes  1.57 Gbits/sec    1   98.4 KBytes
[  5]   1.00-2.00   sec   199 MBytes  1.67 Gbits/sec    1    123 KBytes
[  5]   2.00-3.00   sec   218 MBytes  1.83 Gbits/sec    2   68.4 KBytes
[  5]   3.00-4.00   sec   185 MBytes  1.56 Gbits/sec    1   86.9 KBytes
[  5]   4.00-5.00   sec   182 MBytes  1.53 Gbits/sec    1   98.4 KBytes
[  5]   5.00-6.00   sec   196 MBytes  1.64 Gbits/sec    1    107 KBytes
[  5]   6.00-7.00   sec   198 MBytes  1.66 Gbits/sec    2    105 KBytes
[  5]   7.00-8.00   sec   188 MBytes  1.57 Gbits/sec    1    117 KBytes
[  5]   8.00-9.00   sec   187 MBytes  1.57 Gbits/sec    1    124 KBytes
[  5]   9.00-10.00  sec   158 MBytes  1.33 Gbits/sec    3   72.7 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.86 GBytes  1.59 Gbits/sec   14             sender
[  5]   0.00-10.19  sec  1.86 GBytes  1.56 Gbits/sec                  receiver

iperf Done.


Now, why would I get such poor performance one way and not the other? That's the question.
I am looking at a poor choice of NIC (Chelsio CC2-N320E-SR 10 GbE SFP+) as one possibility.
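For what it's worth, the same direction can also be tested from the client side using iperf3's reverse mode (assuming iperf3 -s is left running on the FreeNAS box), which saves logging into the server:
Code:
# -R makes the server send, so data flows FreeNAS -> client, same as an NFS read
iperf3 -c FREENAS_IP -R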
 

James Gardiner

Dabbler
Joined
Jul 14, 2017
Messages
19
So, it's not the ethernet card; I swapped it for an Intel X540T2 and get exactly the same results.

Maybe the PCI bus and how many lanes? But it seems unlikely, as I can push 10 Gbps to the server with iperf3 but only get 1.4-1.6 Gbps when I pull data from the server. PCIe lanes are symmetrical, so if I can do 10 in one direction I should be able to do 10 in the other.

I will pull out the unused HBA for now to see if that makes a difference.

My current FreeNAS is an R610 with 96 GB of RAM, an HBA with 28 SATA disks hanging off it, and a 10 GbE network card. It's been PERFECT over 4 years, so I didn't expect to see these problems. I just wanted the same basic config plus 2 NVMe L2ARC devices added.

Are there any recommendations on how to dig into this more?
 

James Gardiner

Dabbler
Joined
Jul 14, 2017
Messages
19
Ok, for those who may want to know what the issue was in the end...

I still have to test it, as I need to get physical access to the server again, but digging into this:
I am simply running out of PCI Express bus lanes, and the ethernet card is stuck at
Code:
lspci -s 0000:07:00.0  -vv | grep " Speed "
LnkCap: Port #0, Speed 2.5GT/s, Width x8, ASPM L0s L1, Exit Latency L0s unlimited, L1 unlimited

The card is in an x8 slot but only getting x1 performance.
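A useful cross-check here: LnkCap reports what the card is capable of, while LnkSta reports what the slot actually negotiated, so comparing the two shows whether the link really trained at a lower speed or width:
Code:
lspci -s 0000:07:00.0 -vv | grep -E "LnkCap:|LnkSta:"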

So my next step is to drop out the unused LSI card that was supposed to plug into the larger external disk array.
If this brings everything into line, then all I can say is: don't use an R710 for this type of use model.

I am looking to see if I can land an R720 cheap in the future to finish this implementation. It has far more PCI Express lanes, so I should not have a problem.
 

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
People in this forum have waxed lyrical about the number of lanes that EPYC brings to the table. Something to consider.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
7 TB of L2ARC? For a 14.5 TB pool? That's gotta be a record.
 

James Gardiner

Dabbler
Joined
Jul 14, 2017
Messages
19
7 TB of L2ARC? For a 14.5 TB pool? That's gotta be a record.
This is only the staging server, for testing that it all works. It's going to be more like 60-70 TB when I plug in the JBOD, but now it looks like I have to obtain a better server before I can go there.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Maybe you can swap things around, check the BIOS settings. The CPUs should have enough lanes, contingent on the platform being designed to support your scenario.
 

James Gardiner

Dabbler
Joined
Jul 14, 2017
Messages
19
Maybe you can swap things around, check the BIOS settings. The CPUs should have enough lanes, contingent on the platform being designed to support your scenario.
Tried all that.
This old R710 monster only has 36 PCI Express lanes. Considering all the onboard stuff etc., it simply does not have enough to deal with 4x 8-lane cards.

Swapping stuff around did help, but only a little. In the end, the lack of lanes is my problem. The R710 is just too old, and the bus bandwidth of its era was not great.
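Tallying my own slot list from the earlier post (a rough count; the chipset also has to feed the onboard devices), slots 1, 4 and 5 are x4 links and slots 2 and 3 are x8:
Code:
echo "4 + 8 + 8 + 4 + 4" | bc    # 28 lanes wanted by the add-in slots alone, out of 36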

Anyone want to send me a Dell R720? 80 lanes versus 36; I think that will fit the bill nicely.

I am doing this for a YouTube video for my cinetechgeek channel; I focus on the very niche market of cinema exhibitors. They are getting smashed in this crisis, so I have a topic of "Doing more with less", looking at making servers to store films for small cinemas on the cheap. An R710 can be had for $350-$500; set it up with an IT-mode card and drop an OS on it depending on the use model. Amazing value.
 