Intel QuickAssist Technology (QAT) adapters for checksum operations

Kris Moore

SVP of Engineering
Administrator
Moderator
iXsystems
Joined
Nov 12, 2015
Messages
1,471
We looked into this a while back. The issue was that the round trips to the QAT weren't really adding much value: the bottleneck was never really the checksum work, and just doing it on the CPU, which was usually pretty idle anyway, was good enough. This was a few years ago, but I'm not sure that dynamic has really changed at all.
 

JoeAtWork

Contributor
Joined
Aug 20, 2018
Messages
165
I have 3 LSI SAS2 HBAs running 48 7200 rpm SAS drives, and during a scrub I see a LOT of CPU usage.

1629216027242.png
 

awasb

Patron
Joined
Jan 11, 2021
Messages
415
Just to bump this topic:

Intel QAT does work (on a C3758 SoC, that is) in TrueNAS Core 13.0-U3.1. It's not yet built into the kernel like AES-NI, but the modules are there. I had to set it up as a tunable of type loader. Loading it post-init did not work (the module loaded, but no devices were attached). In my case it looks like this:

Code:
root@truenas ~ # kldstat -v | grep qat
 6    1 0xffffffff82833000    20df8 qat.ko (/boot/kernel/qat.ko)
          5 pci/qat
11    1 0xffffffff83149000    a1f88 qat_c3xxxfw.ko (/boot/kernel/qat_c3xxxfw.ko)
         13 qat_c3xxxfw_fw
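
For anyone wanting to reproduce the loader setup described above, here is a minimal sketch of what the boot-time entries would presumably look like. The variable names are an assumption, derived from the two module names in the kldstat output (qat.ko and qat_c3xxxfw.ko) and the usual FreeBSD `<module>_load="YES"` convention; they are not copied from the poster's system. In TrueNAS Core each line would be entered as a separate tunable (variable, value, type loader) rather than edited into a file by hand.

```shell
# Assumed loader tunables (names inferred from the kldstat output above).
# Load the QAT driver itself at boot:
qat_load="YES"
# Load the firmware module for C3xxx-series SoCs such as the C3758:
qat_c3xxxfw_load="YES"
```

Other QAT hardware would need a different firmware module, so the second line only applies to this SoC family.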


To check whether it's working (though I don't know exactly which software actually uses it ...), look at the interrupt counters. They should grow over time. Post-init:

Code:
root@truenas ~ # vmstat -i | grep qat
irq32: qat0                            9          0
irq33: qat0                            7          0
irq34: qat0                            8          0
irq35: qat0                            4          0
irq36: qat0                            4          0
irq37: qat0                            7          0
irq38: qat0                            8          0
irq39: qat0                            3          0


Compared to two minutes later:

Code:
root@truenas ~ # vmstat -i | grep qat
irq32: qat0                            9          0
irq33: qat0                           28          0
irq34: qat0                           25          0
irq35: qat0                           17          0
irq36: qat0                           27          0
irq37: qat0                           21          0
irq38: qat0                           39          0
irq39: qat0                           38          0
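
Comparing the two listings by eye works, but the growth per interrupt can also be computed. The following is a rough sketch, not something from the original setup: `qat_irq_delta` is a hypothetical helper name, and it assumes both snapshots were saved to files in the four-column format shown above (irq, device, total, rate).

```shell
#!/bin/sh
# Hypothetical helper: print how much each qat interrupt counter grew
# between two saved snapshots of `vmstat -i | grep qat`.
# Usage: qat_irq_delta BEFORE_FILE AFTER_FILE
qat_irq_delta() {
    awk '
        # First file: remember the total (column 3) for each irq label.
        NR == FNR { before[$1] = $3; next }
        # Second file: print the growth for every irq seen in the first file.
        ($1 in before) { printf "%s %s +%d\n", $1, $2, $3 - before[$1] }
    ' "$1" "$2"
}

# Example (paths are placeholders):
#   vmstat -i | grep qat > /tmp/qat_before.txt
#   ... wait, or start a scrub ...
#   vmstat -i | grep qat > /tmp/qat_after.txt
#   qat_irq_delta /tmp/qat_before.txt /tmp/qat_after.txt
```

A counter that stays flat between snapshots shows up as +0, so all-zero output would suggest that nothing is being dispatched to the device.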


Parallel scrubbing over 10 disks (onboard SATA) barely hits 18% peak CPU load. The average is 12%. Random screenshot:

htop_while_scrubbing.jpg


CAVEAT EMPTOR! I did not test it without QAT, so maybe it's "just" AES-NI, and newer implementations of AES-NI are actually fast enough to speed up the scrubs from 55 minutes on average with my old C2750 to 45 minutes with the new C3758. (I have some doubts, though: the C2750 had a 2.4 GHz base frequency with "TurboBoost" up to 2.6 GHz, while the C3758 is fixed at 2.2 GHz.)

gzip and openssl will not (yet) benefit, since the necessary engines/options are not (yet) configured/compiled in. But that should just be a matter of time ... I hope.
 