TrueNAS 12.0 BETA2 Showcases Performance Improvements


August 12, 2020

The merger of FreeNAS and TrueNAS into a unified software image, along with the new naming convention, is well underway, including the new truenas.com website. FreeNAS is becoming TrueNAS CORE. TrueNAS is becoming TrueNAS Enterprise. TrueNAS 12.0 will be the first release to unveil these changes officially, and the schedule was made available on the forums. The TrueNAS 12.0 BETA1 release in June was very successful, with more than 2,000 users and only minor issues reported. Ars Technica provided a detailed technical walkthrough.
TrueNAS 12.0 BETA2 is now available for testing with almost no functional changes, but it is up to 30% faster for many use cases! Minor BETA1 issues have been fixed, and several performance improvements to ZFS, SMB, iSCSI, and NFS have been integrated. Given the number and importance of those performance improvements, this release was designated BETA2. Snapshot your pool, back up your data, and try it out! You can download it here.
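If you want a simple safety net before testing, a recursive snapshot of your pool is a reasonable first step. This is a minimal sketch; the pool name tank and the snapshot name are placeholders:
    # Recursively snapshot every dataset in the pool before upgrading
    zfs snapshot -r tank@pre-12.0-beta2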
For the first time, TrueNAS demonstrated over 1 Million IOPS and over 15GB/s on a single node! We’ll share more about that system and its configuration soon. This release has been stress tested in both TrueNAS CORE and Enterprise forms on all the X-Series (X10 and X20) and M-Series (M40 and M50) platforms. Below are all the performance improvements in TrueNAS 12.0 so you can see which ones are most relevant to your use case.

TrueNAS 12.0 and OpenZFS 2.0 improvements include:

NUMA Improvements: With multiple CPUs in a system, there is a need to manage Non-Uniform Memory Access (NUMA). TrueNAS 12.0 does a better job of assigning cores and memory, providing performance improvements for the M50 and other dual-socket architectures.
ZFS Metadata on Flash: Special SSD vdevs can be used for metadata acceleration. This can include both filesystem metadata and dedupe tables. This is one of the core features of OpenZFS 2.0 (see the first example after this list).
ZFS Fusion Pools: The special SSD vdevs can also store data blocks, selected by I/O write size, and the size threshold is configurable on a per-dataset basis. Users can accelerate database datasets by raising that threshold so more of their I/O lands on flash (example below).
ZFS Persistent L2ARC: L2ARC (flash-based read cache) is typically cleared on a controller reboot or failover. For smaller systems with less than a TB of L2ARC, that can be acceptable; for larger systems with 10TB of L2ARC, it may take hours or even days to rehydrate the cache. The persistent L2ARC option avoids clearing the cache, allowing performance-sensitive systems to get back to full speed without delay (example below).
ZFS async DMU and CoW: Within ZFS is a Data Management Unit (DMU) and an algorithm for Copy-on-Write (CoW). These algorithms were implemented in a synchronous manner, which required a transaction to wait until another transaction completed. iXsystems contributed to the conversion of these algorithms to an asynchronous approach, which reduces wait time and increases parallelism in OpenZFS 2.0. An added benefit is that fewer disk I/Os are needed for sequential writes, which increases drive efficiency and reduces latency under heavy workloads.
ZFS Record Size Increases: One benefit of async CoW is that larger ZFS record sizes perform better, with fewer read-modify-write cycles. Instead of operating with a 128KB record size, a 256KB or 512KB record size may be appropriate for some workloads. This will increase the bandwidth of many RAIDZ1/2/3 VDEVs (example below).
ZFS Checksum Vectorization: ZFS protects data by writing a checksum into metadata for each block of data written to disk. These checksums are then used for scrubbing the data and verifying every read. Calculating these checksums can be compute intensive. Vectorization uses the SIMD instructions found in many Intel processors to reduce that overhead and free up valuable compute cycles for other tasks (example below).
ZFS Asynchronous TRIM: OpenZFS 2.0 includes asynchronous automatic and manual TRIM capabilities. Manual TRIMs can be scheduled overnight or each weekend to provide more performance during business hours (example below).
Faster ZFS Boot: OpenZFS 2.0 includes a more parallel process for importing a ZFS pool with many drives. This reduces boot and failover times by over 50% for larger systems.
ZFS Dedupe: ZFS deduplication performs well if all the dedupe metadata is in DRAM, but is painfully slow if the dedupe metadata ends up on HDDs. With the addition of Fusion Pools, the dedupe metadata can be assigned to the flash VDEVs, improving performance (example below). Testing is ongoing to quantify how much faster it will be, but we expect significant gains.
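For reference, here are minimal command-line sketches of several features above, using the underlying OpenZFS tools rather than the TrueNAS web UI (the normal way to configure these). Pool, dataset, and device names such as tank, tank/db, and nvd0 are placeholders. First, adding a mirrored special vdev for metadata and steering small data blocks to it:
    # Add a mirrored special vdev; always mirror it, because losing
    # the special vdev means losing the pool
    zpool add tank special mirror nvd0 nvd1
    # Route data blocks of 64K or smaller in this dataset to the special vdev
    zfs set special_small_blocks=64K tank/db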
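Next, adding an L2ARC device and keeping it warm across reboots. The sysctl shown is our best mapping of the OpenZFS 2.0 l2arc_rebuild_enabled tunable onto FreeBSD and should be treated as an assumption:
    # Add a flash read cache (L2ARC) device to the pool
    zpool add tank cache nvd2
    # Keep L2ARC contents across reboot/failover (assumed sysctl name;
    # the module parameter is l2arc_rebuild_enabled on Linux builds)
    sysctl vfs.zfs.l2arc.rebuild_enabled=1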
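Raising the record size on a dataset that holds large, sequential files:
    # Larger records mean fewer read-modify-write cycles for big sequential I/O
    zfs set recordsize=512K tank/media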
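Inspecting which fletcher_4 checksum implementation ZFS selected. The sysctl name is an assumption for FreeBSD-based TrueNAS CORE; on Linux builds the equivalent is /sys/module/zfs/parameters/zfs_fletcher_4_impl, where "fastest" auto-selects the best SIMD variant:
    # Show the active (vectorized) fletcher_4 checksum implementation
    sysctl vfs.zfs.fletcher_4_impl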
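Enabling automatic TRIM, or running a manual TRIM on demand:
    # Continuous asynchronous TRIM as blocks are freed
    zpool set autotrim=on tank
    # One-shot manual TRIM, suitable for an overnight or weekend cron job
    zpool trim tank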
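And placing dedupe tables on a dedicated flash vdev for a deduplicated dataset:
    # Mirrored flash vdev dedicated to dedup tables
    zpool add tank dedup mirror nvd3 nvd4
    # Deduplication must also be enabled on the dataset
    zfs set dedup=on tank/backups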
In addition to the ZFS improvements, there have been dramatic improvements in the performance of several key services:
iSCSI Reads: iXsystems has enhanced the iSCSI target software to remove a memory copy between the Ethernet NIC and ZFS. This raises the high-end performance limits and allows greater than 1 Million IOPS and over 15GB/s to be achieved with the right hardware.
SMB Single Client Speed: The speed of a single SMB client matters for many applications, including multimedia editing, where upload and download speeds for 4K and 8K video files are critical. These speeds have been increased by >20%, to over 2 Gigabytes per second.
SMB Multi-Client Capacity: The number of SMB clients that can be supported is important to large organizations; on a high-end system, that number has been increased by more than 50%.
NFS Single Client: The NFS target has been improved to reduce latency and increase the bandwidth of a single NFS client from less than 2GB/s to over 3GB/s.
On the TrueNAS Enterprise side, with the M-Series platforms, we have been testing high-performance configurations and have added support for:
Multiple NVDIMMs: Each NVDIMM can be assigned as a write SLOG for a different pool, so a single system can have an all-flash pool and a Fusion or hybrid pool with HDDs (example below).
20GB/s PCIe Interconnect: For High Availability (HA) systems with dual controllers, we use a high-speed PCIe interconnect to provide low-latency synchronization of writes. This high-bandwidth interconnect reduces latency and increases write bandwidth by 100%.
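As a sketch of the NVDIMM SLOG assignment above (the pool names are placeholders, and the pmem device names assume FreeBSD's nvdimm driver):
    # Assign one NVDIMM as the dedicated write log for each pool
    zpool add flashpool log pmem0
    zpool add hybridpool log pmem1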
All of these performance improvements, plus advances in processor performance, contribute to the ability to build and support larger systems well beyond 10PB in size.

Progress toward TrueNAS 12.0 RELEASE!

TrueNAS 12.0 is going through the same NIGHTLY, BETA1, BETA2, RC1, RELEASE, and UPDATE stages that FreeNAS has gone through. There is a TrueNAS 12.0 sub-forum on the Community forums for this unification process and Community feedback.
We appreciate the Community's testing of the TrueNAS 12.0 BETA1 release. TrueNAS 12.0 BETA2 has also been tested on Enterprise HA systems within our labs. Please update to BETA2 and let us know whether you see the expected performance improvements. Bugs that are caught and reported early have less impact on the final schedule.
TrueNAS 12.0 Documentation is Maturing
The new TrueNAS 12.0 documentation is more modular and expandable. The Community is invited to edit and contribute. Please check out the documentation even if you don’t upgrade today.
TrueNAS CORE: Still the Best Free NAS
We hope these TrueNAS 12.0 performance improvements have a positive impact on your systems. If you have any questions or comments, we’d love to hear them on the community forums, on the TrueNAS subreddit, or in response to this blog. If you need additional information on how TrueNAS can streamline, accelerate, and unify data management for your business, email us.
For those with FreeNAS 11.3 installed, you can upgrade to TrueNAS 12.0 BETA2 with a single click! Otherwise, download TrueNAS 12.0 BETA2 and get started.
